GiLA: Analyzing the use of issue labels in GitHub projects

As part of our research on open source projects, over the last few months we’ve been studying the use of labels in issue trackers and the role they can play in organizing project tasks and managing team communication. After gathering some general statistics on the most frequent labels (can you guess which is by far the most common one?), we have now developed a tool that performs a more detailed analysis of the labels used to tag the issues of specific GitHub projects.

Labels are a simple yet effective mechanism for attaching additional information (metadata) to project issues. A label can give any user an immediate clue about what topic the issue covers, what development task it relates to, or what priority it has. Nevertheless, as it turns out, each project team uses labels in its own particular way. We would like to learn more about why this is (in case it’s the result of a conscious decision!) and study whether some labeling systems seem to work better than others. Moreover, we believe that analyzing how labels are used in a project may give all project members useful (and sometimes unexpected) information about the project’s evolution, management and organization.

To start answering all these questions, we have developed GiLA, a visualization tool that displays the results of three different analyses of the use of labels in GitHub projects. We have also prepared a short survey (direct link, or just see below) that we kindly ask you to take to help us advance our understanding of this topic.


The tool generates three different kinds of label visualizations:

  • Label Usage Visualization, which shows the labels defined in the project, their frequency and how they relate to each other.
  • User Involvement Visualization, aimed at showing who contributes to issues tagged with a specific label (who opens new issues with that label, who closes them and who comments on them).
  • Label Timeline Visualization, which displays the expected evolution of issues under each label, i.e., the average time to respond, the average time to solve and the expected resolution based on the path followed by previous issues with the same label.
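
To make the three analyses more concrete, here is a minimal Python sketch of the kind of aggregation each one could rest on. It is not GiLA's actual implementation: the sample issues, the field names (`labels`, `opened_by`, `closed_by`, `created`, `first_comment`, `closed`) and the function names are all hypothetical, and the timeline part only covers the two averages, not the expected resolution path.

```python
from collections import Counter, defaultdict
from datetime import datetime
from itertools import combinations
from statistics import mean

# Hypothetical sample issues; field names are illustrative, not GiLA's schema.
ISSUES = [
    {"labels": ["bug", "ui"], "opened_by": "ana", "closed_by": "bo",
     "created": "2014-01-01", "first_comment": "2014-01-02", "closed": "2014-01-05"},
    {"labels": ["bug"], "opened_by": "bo", "closed_by": "bo",
     "created": "2014-01-03", "first_comment": "2014-01-03", "closed": "2014-01-04"},
    {"labels": ["enhancement", "ui"], "opened_by": "ana", "closed_by": None,
     "created": "2014-01-10", "first_comment": "2014-01-14", "closed": None},
]


def _days(start, end):
    """Whole days between two YYYY-MM-DD date strings."""
    fmt = "%Y-%m-%d"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).days


def label_usage(issues):
    """Label frequency and how often two labels appear on the same issue."""
    freq, pairs = Counter(), Counter()
    for issue in issues:
        labels = sorted(set(issue["labels"]))
        freq.update(labels)
        pairs.update(combinations(labels, 2))  # pairs are alphabetically ordered
    return freq, pairs


def user_involvement(issues):
    """Per label, count who opened and who closed the issues carrying it."""
    involvement = defaultdict(lambda: {"opened": Counter(), "closed": Counter()})
    for issue in issues:
        for label in issue["labels"]:
            involvement[label]["opened"][issue["opened_by"]] += 1
            if issue["closed_by"]:
                involvement[label]["closed"][issue["closed_by"]] += 1
    return involvement


def label_timeline(issues):
    """Per label, average days to first response and to closing."""
    respond, solve = defaultdict(list), defaultdict(list)
    for issue in issues:
        for label in issue["labels"]:
            if issue["first_comment"]:
                respond[label].append(_days(issue["created"], issue["first_comment"]))
            if issue["closed"]:
                solve[label].append(_days(issue["created"], issue["closed"]))
    return {
        label: {
            "avg_days_to_respond": mean(respond[label]) if label in respond else None,
            "avg_days_to_solve": mean(solve[label]) if label in solve else None,
        }
        for label in set(respond) | set(solve)
    }


if __name__ == "__main__":
    freq, pairs = label_usage(ISSUES)
    print(freq.most_common())
    print(pairs.most_common())
    print(dict(user_involvement(ISSUES)["bug"]))
    print(label_timeline(ISSUES))
```

In a real setting the issue records would come from the project's issue tracker rather than a hard-coded list, and still-open issues (like the third one here) simply do not contribute to the time-to-solve average.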

But rather than reading about it, see it for yourself! Visit our web site, play with the tool for a bit and check the results generated for the projects you are interested in.

And please also make sure to take our survey (less than 10 minutes, I promise!) to share your opinion on the use of labels in open source projects and on the usefulness of the visualizations we are proposing. Survey results will be openly available to everybody and published, at least, on this same website. We will post them as soon as possible, so you won’t have to wait months or years to see them while we try to get a paper published. This I can also promise 🙂

To take the survey you can use this direct link or just fill in the embedded survey form below.

