Elon Computing Sciences

Augmented Collection of Code-Level Metrics for the Debian Package Repository

Presentation at Elon Student Undergraduate Research Forum, Spring 2011

Carter Kozak (Dr. Megan Squire) Department of Computing Sciences

In this presentation, I will describe open source software and its importance, as well as a new process to collect, calculate, and distribute interesting software engineering metrics for all the packages in the standard Debian GNU/Linux installation.

Our method replicates and extends previous work by other groups studying free/libre and open source software (FLOSS) systems in three important ways. First, although earlier studies have attempted to collect a large set of code-level metrics for a small set of projects, and others have generated a small set of metrics for the large Debian codebase, our project does both: we generate a larger set of metrics for the entire set of Debian packages. Second, our integration of new Debian metadata and of code-level metrics not gathered before adds several layers for exploration. Finally, and most importantly, because we integrate our collection and analysis process into the automated FLOSSmole data store, we make it easy for other groups to replicate and analyze our results in a timely, repeatable way.

Thus our collection activity will continue in an automated fashion, and its results will be freely accessible to any interested research group. After outlining our process, we discuss a few observations about the data and present some opportunities for further research.