Investigating and projecting population structures in open source software projects: A case study of projects in GitHub

Saya Onoue, Hideaki Hata, Akito Monden, Kenichi Matsumoto

Research output: Contribution to journal, Article

5 Citations (Scopus)

Abstract

GitHub is a developers' social networking service that hosts a great number of open source software (OSS) projects. Although some of the hosted projects are growing and have many developers, most projects are organized by a few developers and face difficulties in terms of sustainability. OSS projects depend mainly on volunteer developers, and attracting and retaining these volunteers are major concerns of the project stakeholders. To investigate the population structures of OSS development communities in detail and conduct software analytics to obtain actionable information, we apply a demographic approach. Demography is the scientific study of population and seeks to identify the levels and trends in the size and components of a population. This paper presents a case study, investigating the characteristics of the population structures of OSS projects on GitHub, and shows population projections generated with the well-known cohort component method. We found that there are four types of population structures in OSS development communities in terms of experiences and contributions. In addition, we projected the future population accurately using a cohort component population projection method. This method predicts a population of the next period using a survival rate calculated from past population. To the best of our knowledge, this is the first study that applied demography to the field of OSS research. Our approach addressing OSS-related problems based on demography will bring new insights, since studying population is novel in OSS research. Understanding current and future structures of OSS projects can help practitioners to monitor a project, gain awareness of what is happening, manage risks, and evaluate past decisions.
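The abstract describes the cohort component method only in one sentence: the next period's population is predicted from a survival rate calculated from past populations. The following is a minimal sketch of that idea, not the authors' implementation. It assumes that developers are grouped into cohorts by experience (cohort 0 holds newcomers), that each surviving developer moves up exactly one cohort per period, that the oldest cohort leaves the table, and that the number of newcomers is supplied by the caller. The function names `survival_rates` and `project_next` and the sample numbers are hypothetical.

```python
from typing import List


def survival_rates(previous: List[float], current: List[float]) -> List[float]:
    """Estimate per-cohort survival rates from two consecutive periods.

    rates[k] is the share of cohort k in `previous` that appears as
    cohort k + 1 in `current`. Empty cohorts get a rate of 0.0.
    """
    if len(previous) != len(current):
        raise ValueError("both periods need the same number of cohorts")
    return [
        current[k + 1] / previous[k] if previous[k] > 0 else 0.0
        for k in range(len(previous) - 1)
    ]


def project_next(current: List[float], rates: List[float], newcomers: float) -> List[float]:
    """Project the next period's cohorts from the current ones.

    Assumption: `newcomers` (the size of cohort 0) is given by the caller;
    the survival rates say nothing about how many people join.
    """
    if len(rates) != len(current) - 1:
        raise ValueError("need one survival rate per cohort transition")
    projected = [newcomers]
    for k, rate in enumerate(rates):
        # Cohort k ages into cohort k + 1, scaled by its survival rate.
        projected.append(current[k] * rate)
    return projected


# Hypothetical developer counts by experience cohort for two past periods.
previous_period = [100, 40, 20]
current_period = [120, 50, 10]

rates = survival_rates(previous_period, current_period)
next_period = project_next(current_period, rates, newcomers=120)
print(rates)        # [0.5, 0.25]
print(next_period)  # [120, 60.0, 12.5]
```

Holding the newcomer count fixed at the current value is only one possible choice; a fuller projection would also model how many developers join each period.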

Original language: English
Pages (from-to): 1304-1315
Number of pages: 12
Journal: IEICE Transactions on Information and Systems
Volume: E99D
Issue number: 5
DOI: 10.1587/transinf.2015EDP7363
Publication status: Published - May 1 2016

Fingerprint

Software engineering
Open source software
Sustainable development

Keywords

  • Demography
  • OSS
  • Software development communities
  • Software population pyramids

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Software
  • Artificial Intelligence
  • Hardware and Architecture
  • Computer Vision and Pattern Recognition

Cite this

Investigating and projecting population structures in open source software projects: A case study of projects in GitHub. / Onoue, Saya; Hata, Hideaki; Monden, Akito; Matsumoto, Kenichi.

In: IEICE Transactions on Information and Systems, Vol. E99D, No. 5, 01.05.2016, p. 1304-1315.

@article{f5e683ca5c7f4ad698d26e4cb8e46758,
title = "Investigating and projecting population structures in open source software projects: A case study of projects in {GitHub}",
abstract = "GitHub is a developers' social networking service that hosts a great number of open source software (OSS) projects. Although some of the hosted projects are growing and have many developers, most projects are organized by a few developers and face difficulties in terms of sustainability. OSS projects depend mainly on volunteer developers, and attracting and retaining these volunteers are major concerns of the project stakeholders. To investigate the population structures of OSS development communities in detail and conduct software analytics to obtain actionable information, we apply a demographic approach. Demography is the scientific study of population and seeks to identify the levels and trends in the size and components of a population. This paper presents a case study, investigating the characteristics of the population structures of OSS projects on GitHub, and shows population projections generated with the well-known cohort component method. We found that there are four types of population structures in OSS development communities in terms of experiences and contributions. In addition, we projected the future population accurately using a cohort component population projection method. This method predicts a population of the next period using a survival rate calculated from past population. To the best of our knowledge, this is the first study that applied demography to the field of OSS research. Our approach addressing OSS-related problems based on demography will bring new insights, since studying population is novel in OSS research. Understanding current and future structures of OSS projects can help practitioners to monitor a project, gain awareness of what is happening, manage risks, and evaluate past decisions.",
keywords = "Demography, OSS, Software development communities, Software population pyramids",
author = "Saya Onoue and Hideaki Hata and Akito Monden and Kenichi Matsumoto",
year = "2016",
month = "5",
day = "1",
doi = "10.1587/transinf.2015EDP7363",
language = "English",
volume = "E99D",
pages = "1304--1315",
journal = "IEICE Transactions on Information and Systems",
issn = "0916-8532",
publisher = "Maruzen Co., Ltd/Maruzen Kabushikikaisha",
number = "5",
}

TY - JOUR
T1 - Investigating and projecting population structures in open source software projects
T2 - A case study of projects in GitHub
AU - Onoue, Saya
AU - Hata, Hideaki
AU - Monden, Akito
AU - Matsumoto, Kenichi
PY - 2016/5/1
Y1 - 2016/5/1

AB - GitHub is a developers' social networking service that hosts a great number of open source software (OSS) projects. Although some of the hosted projects are growing and have many developers, most projects are organized by a few developers and face difficulties in terms of sustainability. OSS projects depend mainly on volunteer developers, and attracting and retaining these volunteers are major concerns of the project stakeholders. To investigate the population structures of OSS development communities in detail and conduct software analytics to obtain actionable information, we apply a demographic approach. Demography is the scientific study of population and seeks to identify the levels and trends in the size and components of a population. This paper presents a case study, investigating the characteristics of the population structures of OSS projects on GitHub, and shows population projections generated with the well-known cohort component method. We found that there are four types of population structures in OSS development communities in terms of experiences and contributions. In addition, we projected the future population accurately using a cohort component population projection method. This method predicts a population of the next period using a survival rate calculated from past population. To the best of our knowledge, this is the first study that applied demography to the field of OSS research. Our approach addressing OSS-related problems based on demography will bring new insights, since studying population is novel in OSS research. Understanding current and future structures of OSS projects can help practitioners to monitor a project, gain awareness of what is happening, manage risks, and evaluate past decisions.

KW - Demography
KW - OSS
KW - Software development communities
KW - Software population pyramids
UR - http://www.scopus.com/inward/record.url?scp=84970028710&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84970028710&partnerID=8YFLogxK
U2 - 10.1587/transinf.2015EDP7363
DO - 10.1587/transinf.2015EDP7363
M3 - Article
AN - SCOPUS:84970028710
VL - E99D
SP - 1304
EP - 1315
JO - IEICE Transactions on Information and Systems
JF - IEICE Transactions on Information and Systems
SN - 0916-8532
IS - 5
ER -