Note: We use the term “indicators” on this page synonymously with “metric”. Future discussions will show which term we will continue to use.
We roughly group health indicators into three categories: code health, community health, and compliance health.
Disclaimer: We list and describe health indicators. By no means are we evaluating them for suitability. Open source communities have a wide range of stakeholders and projects, each with a different interpretation of the indicators. In different situations, the indicators will carry different meanings.
Caution: The occurrence count is a rough estimate for how often we encounter the indicator. This is not an exact science.
Keep in mind, many projects do not use the GitHub issue tracker.
Keep in mind, different GitHub projects use pull requests to a greater or lesser degree.
Issue/3: Many of the indicators are also informative when tracked over time.
Interviewees often clarify that contributions are not merely code commits, but also include documentation, issues, and community management.
We should agree on a template for the metrics.
Community health contains indicators descriptive of community interactions and behavior.
Name | Source | Description | Related Code/Queries | Occurrence |
---|---|---|---|---|
Contributor Diversity | Statistic | Ratio of contributors from a single company over all contributors. Also described as: maintainers from different companies; diversity of contributor affiliation. This is mentioned frequently. | Contributor Diversity Queries | Interviews: 3 |
Issue Response Rate | Statistic | Time between a new issue being opened and a maintainer responding. Also called: bug response rate. The maintainer is believed to not “pile on” but to try to solve the issue. This is mentioned frequently. | Issue Response Rate Queries | Interviews: 3 |
Community Activity | Indicator | Contribution frequency (contribution = commit, issue, comment, …) | | Interviews: 3, Issue/1 |
Contributor Breadth | Statistic | Ratio of non-core committers (drive-by committers). Can indicate openness to outsiders. | Commits from non-core committers | Interviews: 2 |
Contribution Diversity | | Ratio of code committed by contributors other than the original project initiator. Contributions are going up beyond the core team. | | Interviews: 1 |
Contribution Acceptance | | Ratio of contributions accepted vs. closed without acceptance | Pull Request Acceptance Rate | Issue/1 |
Bus Factor | | See: community/truckFactor.md. The number of developers a project would need to lose to destroy its progress. Alternatively: number of companies that would have to stop support. | | Issue/1, Literature |
Contributors | | Number of contributors | Contributors per Project | Interviews: 2, Issue/1 |
Contributor Activity | | Activity level of individual contributors | | Issue/1 |
Relative Activity | | I sum up the activities (GH issues+comments, GH pull requests+comments and GH commits) for the project members and for the non-project members, then I create a ratio of the two. Compare the activity between committers-as-a-group and contributors-as-a-group. It easily shows when a project is not yet popular, or when a project is not paying attention to its users. I also feel that a balance between the two groups is essential; i.e., a project with a lot more contributor than committer activity is one that is failing to 'recruit' committers quickly enough. | | Mailing list |
Distribution of Work | | How much recent activity is distributed? | | Issue/1 |
Contribution Age | | Time since last contribution. Gives a sense of how active the community is. (Contribution = commit, issue, comment, …) | | Interviews: 1 |
Forks | | Number of forks | Forks Query | Interviews: 2 |
Stars | | Number of stars | | Interviews: 2 |
Watchers | | Number of watchers | Watchers Query | Interviews: 2 |
Issues Open | | Number of open issues | Open Issues | Issue/1 |
Issues submitted/closed | | Issues submitted vs. issues closed (Example) | Issues Submitted vs Closed | Interviews: 2 |
Issue Comments | | Number of comments per issue | Issue Comments | Issue/3 |
Time to Contributor | | Time to becoming a contributor | | Interviews: 1, Issue/1 |
Path to Leadership | | A communicated path from lurker to contributor to maintainer (or track members: time from user to maintainer/leader). Rationale: If active contributors are not included in leadership decisions, they might lose interest and leave. (Focus on least likely contributor) | | Interviews: 2, LFOSLS |
Blogposts | | Number of blogposts that mention the project | | LFOSLS |
YouTube Videos | | Number of YouTube videos that mention or specifically deal with the project (e.g. tutorials) | | LFOSLS |
Job Postings | | Number of job postings that mention the project as a preferred or required skill | | LFOSLS |
Downloads | | Number of downloads. Beware: downloads might be skewed by builders. Used as a measure for 'success' (Grewal, Lilien, & Mallapragada, 2006) | | LFOSLS, (Grewal, Lilien, & Mallapragada, 2006) |
Reopened issues | | Rate of issues that were closed while discussion continues, or that were closed and re-opened | | LFOSLS |
Release Velocity | | Time between releases. Regular releases are a reliability metric. | | LFOSLS |
Release Maturity | | Ratio of major and minor releases | | LFOSLS |
Decision Distribution | | Central vs. distributed decision making. Governance model, scalability of community. | | LFOSLS |
Transparency | | Number of comments per issue. Discussion is occurring openly; could also indicate level of agreement. | | LFOSLS |
Roadmap | | Existence and quality of roadmap. Best practice: community engagement and scalability (might not be automatically computable) | | |
Gatherings | | Number of face-to-face/in-person meetings per year. Resets contentious issues; resolves tensions; avoids longstanding grudges. | | LFOSLS |
Role Definitions | | Existence and quality of role definitions. Governance related. Relates to “Path to Leadership”. | | LFOSLS |
Rewards | | Rewards, shout-outs, recognition, and mentions in pull requests or change logs; might improve contribution levels | | LFOSLS |
Retrospectives | | Existence of after-release meetings. Collect lessons learned, improve processes, recognize contributors. | | LFOSLS |
Onion Layers | | Distance between onion model layers (users, contributors, committers, and steering committee). Rule of thumb: factor of 10x between layers. (Node.js keynote) | | LFOSLS |
Release Note Completeness | | Number of functionality changes and bug fixes represented in release notes vs. release. Good for users, also shows diligence of community. | | LFOSLS |
Unity | | Rivalry or unity of community (sentiment analysis?) | | LFOSLS |
Use of Acronym | | Frequency of acronyms used. Specialized language can be a barrier for new contributors. | | LFOSLS |
Language Bias | | Diversity metric: bias against gender, ethnicity, … in use of language (maybe use sentiment analysis) | | LFOSLS |
Commit Bias | | Diversity metric: acceptance rate (and time to acceptance) differences per gender, ethnicity, etc. | | LFOSLS |
Stack Overflow | | Several metrics: number of questions asked, response rate, number of responding people that have verified solutions | | LFOSLS |
Non-Source Contributions | | Track contributions like running tests in a test environment, writing blog posts, producing videos, giving talks, etc. | | LFOSLS |
Maturity Label | | Community-assigned label. Some communities label projects as incubator, mature, (or something). | | LFOSLS |
User Groups | | User groups perform a variety of crucial marketing, service support, and business-development functions at the grassroots level | | (Bagozzi & Dholakia, 2006) |
Age of Community | | Time since repository/organization was registered; or time since first release. “Results showed that the age of the project played a marginally significant role in attracting active users, but not developers. We attribute this differential effect of age on users and developers to the fact that age may be seen as an indicator of application maturity by users, and hence taken as a positive signal, whereas it may convey more ambiguous signals to developers.” (Chengalur-Smith et al., 2010, p.674) | | (Chengalur-Smith, Sidorova, & Daniel, 2010; Grewal, Lilien, & Mallapragada, 2006) |
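
To make two of the community indicators above more concrete, here is a minimal Python sketch of Contributor Diversity and Relative Activity. It is not the working group's query code. The input shapes (a contributor-to-company mapping and a list of `(author, kind)` events) are hypothetical. Two further assumptions: "a single company" is read as the company with the most contributors, and every activity (issue, comment, pull request, commit) counts equally, with the ratio taken as member activity over non-member activity.

```python
from collections import Counter


def contributor_diversity(affiliations):
    """Share of contributors affiliated with the single largest company.

    `affiliations` maps contributor name -> company name (hypothetical
    input shape). Picking the largest company is an assumption.
    """
    if not affiliations:
        return 0.0
    counts = Counter(affiliations.values())
    largest_company_size = counts.most_common(1)[0][1]
    return largest_company_size / len(affiliations)


def relative_activity(events, members):
    """Ratio of project-member activity to non-member activity.

    `events` is a list of (author, kind) tuples, where kind is e.g.
    "issue", "comment", "pull_request" or "commit" (hypothetical shape).
    Every event counts as one activity (assumption: no weighting).
    Returns None when non-members have no activity, because the ratio
    is undefined in that case.
    """
    member_total = sum(1 for author, _kind in events if author in members)
    outsider_total = len(events) - member_total
    if outsider_total == 0:
        return None
    return member_total / outsider_total
```

Under these assumptions, a Relative Activity value well below 1 would correspond to the case described in the table: much more contributor than committer activity.
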
Code health contains indicators descriptive of a code base and its quality.
Name | Source | Description | Related Code/Queries | Occurrence |
---|---|---|---|---|
Pull Request made/closed | | Pull requests made vs. pull requests closed (Example). Encompasses number of pull requests rejected (Issue/1). | Pull Requests Made vs Closed | Interviews: 3 |
Pull Requests Open | | Number of open pull requests. Might be more telling than total pull requests. | Pull Requests Open | Interviews: 1, Issue/1 |
Pull Request Comments | | Number of comments per pull request | Pull Request Comments | Interviews: 1 |
Pull Request Discussion Diversity | | Number of different people discussing each pull request | Pull Discussion Diversity | Interviews: 1 |
Update Rate | | Number of updates over period x | | Issue/1 |
Update Regularity | | How consistently and frequently updates are provided | | Interviews: 1, Issue/1 |
Update Age | | Time since last update | | Interviews: 1, Issue/1 |
Repository Size | | Overall size of the repository or number of commits | Total Commits | Issue/1 |
Size of Code Base | | Lines of code | | Mailing list |
Bugs after Release | | Number of bugs reported after a release | | LFOSLS |
Code Modularity | | Modular code allows parallel development, which Linus Torvalds drove for Linux | | Linus Torvalds at LFOSLS, (Baldwin & Clark, 2006) |
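
The three update indicators above (Update Rate, Update Regularity, Update Age) can all be derived from a list of update timestamps. The following Python sketch shows one possible reading. What counts as an "update" (commit, release, …) is left open in the table, so the input is simply a list of `datetime` values. Expressing regularity as the mean and standard deviation of the gaps between updates is an assumption, not a definition from the working group.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def update_age(timestamps, now):
    """Time since the most recent update, as a timedelta."""
    return now - max(timestamps)


def update_rate(timestamps, now, period):
    """Number of updates within the last `period` (a timedelta)."""
    cutoff = now - period
    return sum(1 for t in timestamps if cutoff <= t <= now)


def update_regularity(timestamps):
    """Mean and standard deviation of the gaps between updates, in days.

    Assumption: a small standard deviation relative to the mean gap is
    read as "regular". Returns None with fewer than two updates.
    """
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() / 86400 for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return None
    return mean(gaps), pstdev(gaps)
```

Passing `now` explicitly instead of calling `datetime.now()` inside the functions keeps the results reproducible when the same indicator is tracked over time.
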
Compliance health contains indicators informative of vulnerabilities and license obligations.
Name | Source | Description | Related Code/Queries | Occurrence |
---|---|---|---|---|
Test Coverage | | | | Interviews: 1 |
Bug Age | | Age of known bugs in issue tracker. Use label for determining bugs? | | Issue/1 |
Known Vulnerabilities | | Number of reported vulnerabilities. Could be limited to the issue tracker or extended to vulnerability databases (e.g. CVE). | | Interviews: 1, Issue/1 |
Dependency Depth | | Number of projects included in code base + number of projects relying on focal project (recursive). Indicator of centrality in the open source dependency network. | | Interviews: 1 |
License Declared | | What license does the project declare | | Issue/1 |
License Conflict | | Does the project contain incompatible licenses | | |
All Licenses | | List of licenses | | |
License Count | | Number of licenses | | |
License Coverage | | Number of files with a file notice (copyright notice + license notice) | | |
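
As an illustration of License Coverage, the sketch below counts files whose header contains both a copyright notice and a license notice. The detection patterns are assumptions: a copyright notice is taken to be the word "copyright" followed by "(c)", "©", or a year, and a license notice is taken to be an `SPDX-License-Identifier:` tag or the phrase "licensed under". Real projects use many other notice styles, so this is a rough heuristic rather than a compliance check. The input, a mapping from file path to file content, is a hypothetical shape.

```python
import re

# Assumed notice patterns; real-world notices vary widely.
COPYRIGHT_RE = re.compile(r"copyright\s+(\(c\)|©|\d{4})", re.IGNORECASE)
LICENSE_RE = re.compile(r"SPDX-License-Identifier:|licensed under", re.IGNORECASE)


def has_file_notice(text, header_lines=20):
    """True if the first `header_lines` lines contain both notices."""
    header = "\n".join(text.splitlines()[:header_lines])
    return bool(COPYRIGHT_RE.search(header)) and bool(LICENSE_RE.search(header))


def license_coverage(files):
    """Return (files with a file notice, total files).

    `files` maps a file path to its text content (hypothetical shape).
    """
    covered = sum(1 for content in files.values() if has_file_notice(content))
    return covered, len(files)
```

Returning both counts rather than only the number of covered files makes it possible to report coverage as a ratio as well.
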
This section collects notes on why metrics are considered and on what the possible goals might be.
We have heard other classifications that we simply list here.
Ideas for these classifications are to (1) generate a uniform classification by merging the different classifications through conversations, and (2) create mappings of the indicators to the different classifications.