gsoc:2021-gsoc-kernel-workflows

Differences

This shows you the differences between two versions of the page.

gsoc:2021-gsoc-kernel-workflows [2021/01/19 18:19]
till created
gsoc:2021-gsoc-kernel-workflows [2021/01/27 06:41] (current)
lukas.bulwahn
Line 3: Line 3:
  
 Collect ideas for GSoC student projects on Improving Kernel Workflows here.
 +
 +===== MAINTAINERS and correct integration tree information =====
 +
 +In previous work on MAINTAINERS and process conformance, Pia Eichinger [1] investigated whether patches are integrated by the maintainers that MAINTAINERS defines as responsible.
 +
 +In this project, we are interested in a related (possibly simpler) question: Are the commits integrated into the appropriate integration trees referenced in MAINTAINERS?​
 +
 +The mentor believes that the main difference between considering maintainers and considering integration trees is that the integration tree information in MAINTAINERS contains more errors. It is not used as prominently as the personal maintainer information (name and email), which is exercised constantly through the widespread use of ./scripts/get_maintainer.pl. So, correcting errors in the integration tree entries is likely to touch more entries, but is also simpler, than correcting errors in the personal maintainer information.
 +
 +The answer to the question above can then ultimately be used to identify which integration tree entries should be added to specific sections in MAINTAINERS to best match the actual integration observed in git.
 +
 +Choosing the factors and metrics that determine what is best is, of course, the challenging part of the task. The goal is a heuristic that is:
 +
 +  - good enough to be used to create a change to MAINTAINERS that is accepted by the community, and
 +  - simple enough to be implemented with reasonable effort.
 +
 +Background:
 +
 +Each section in MAINTAINERS can reference, through its T: entries, the location of a source configuration management (SCM) tree together with its type, e.g., git, quilt, hg.
 +For each commit, the kernel git history carries the commit's integration tree path, i.e., the information through which SCM trees the commit was integrated until it finally reached Linus Torvalds' tree.
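As a rough illustration of the T: entries described above, the following sketch collects the SCM type and location per section from MAINTAINERS-style text. The function name ''parse_tree_entries'', the sample sections and the example.org locations are hypothetical; the rule used to detect section titles is a simplifying assumption, and gitdm or pasta may parse the file differently.

```python
import re
from collections import defaultdict

# Matches lines such as "T:\tgit git://host/path.git [branch]".
T_ENTRY = re.compile(r"^T:\s+(\S+)\s+(\S+)")


def parse_tree_entries(maintainers_text):
    """Map each section title to a list of (scm_type, location) pairs.

    Sections without any T: entry do not appear in the result.
    """
    sections = defaultdict(list)
    title = None
    previous_blank = True
    for line in maintainers_text.splitlines():
        if not line.strip():
            previous_blank = True
            continue
        match = T_ENTRY.match(line)
        if match and title is not None:
            sections[title].append((match.group(1), match.group(2)))
        elif previous_blank and not re.match(r"^[A-Z]:\s", line):
            # Assumption: a section title is the first line after a blank
            # line and is not itself an "X:" field line.
            title = line.strip()
        previous_blank = False
    return dict(sections)


# Hypothetical sample in the style of MAINTAINERS (not real entries).
SAMPLE = """\
EXAMPLE DRIVER
M:\tJane Doe <jane@example.org>
T:\tgit git://git.example.org/example.git
F:\tdrivers/example/

OTHER SUBSYSTEM
M:\tJohn Roe <john@example.org>
T:\tquilt https://example.org/patches/
T:\tgit git://git.example.org/other.git for-next
"""

for section, trees in parse_tree_entries(SAMPLE).items():
    print(section, trees)
```

The sketch deliberately ignores an optional branch name after the location; a real analysis would probably need it to distinguish several trees hosted in one repository.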
 +
 +Ideally, the references in the MAINTAINERS sections are:
 +  * complete, i.e., all integration trees used for recent kernel releases are mentioned in MAINTAINERS.
 +  * sound, i.e., the majority of the commits are integrated through the trees referenced in the MAINTAINERS sections a patch belongs to.
 +  * precise, i.e., for each MAINTAINERS section, the majority of the commits that belong to a section are integrated through the tree referenced in that section.
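One possible way to turn the three properties above into numbers is sketched below. The concrete formulas (simple fractions), the function names and the toy data are assumptions made for illustration; finding suitable measures is part of the project itself. Each commit is modelled as a pair of the sections it belongs to and the tree it was integrated through.

```python
from collections import Counter


def completeness(section_trees, commits):
    """Fraction of observed integration trees that some section references."""
    listed = set().union(*section_trees.values())
    observed = {tree for _, tree in commits}
    return len(observed & listed) / len(observed) if observed else 1.0


def soundness(section_trees, commits):
    """Fraction of commits integrated through a tree listed in any of their sections."""
    if not commits:
        return 1.0
    hits = sum(
        any(tree in section_trees.get(s, set()) for s in sections)
        for sections, tree in commits
    )
    return hits / len(commits)


def precision(section_trees, commits):
    """Per section: fraction of its commits integrated through a tree it lists."""
    total, hits = Counter(), Counter()
    for sections, tree in commits:
        for s in sections:
            total[s] += 1
            hits[s] += tree in section_trees.get(s, set())
    return {s: hits[s] / total[s] for s in total}


# Hypothetical toy data: trees listed per section, and observed commits.
section_trees = {"NET": {"git://example.org/net.git"}, "USB": set()}
commits = [
    (["NET"], "git://example.org/net.git"),
    (["NET"], "git://example.org/net.git"),
    (["NET", "USB"], "git://example.org/usb.git"),
    (["USB"], "git://example.org/usb.git"),
]

print(completeness(section_trees, commits))
print(soundness(section_trees, commits))
print(precision(section_trees, commits))
```

On this toy data, completeness and soundness both come out at 0.5, because the USB tree is used but never listed.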
 +
 +Goal:
 +
 +We define how to measure the three properties above (completeness, soundness and precision) and then measure them.
 +
 +Then, we use that information to determine which integration tree entries should be added to which specific sections to maximally increase the three properties.
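A minimal sketch of such a selection step follows, assuming the same commit model as before (sections a commit belongs to, plus the tree it was integrated through). The function ''rank_additions'', its parameters ''max_proposals'' and ''min_share'', and the toy data are hypothetical; the ranking criterion (number of commits a missing entry would explain) is only one candidate heuristic.

```python
from collections import Counter


def rank_additions(section_trees, commits, max_proposals=5, min_share=0.5):
    """Suggest (section, tree, share) additions, most impactful first.

    share is the fraction of a section's commits that were integrated
    through a tree the section does not list yet.
    """
    per_section = Counter()
    missing = Counter()
    for sections, tree in commits:
        for s in sections:
            per_section[s] += 1
            if tree not in section_trees.get(s, set()):
                missing[(s, tree)] += 1
    candidates = [
        (s, tree, count / per_section[s])
        for (s, tree), count in missing.items()
        if count / per_section[s] >= min_share
    ]
    # Sort by number of commits explained, then by name for stable output.
    candidates.sort(key=lambda c: (-missing[(c[0], c[1])], c[0], c[1]))
    return candidates[:max_proposals]


# Hypothetical toy data.
section_trees = {"NET": {"git://example.org/net.git"}, "USB": set()}
commits = [
    (["NET"], "git://example.org/net.git"),
    (["NET"], "git://example.org/net.git"),
    (["NET", "USB"], "git://example.org/usb.git"),
    (["USB"], "git://example.org/usb.git"),
]

for section, tree, share in rank_additions(section_trees, commits):
    print(f"{section}: add T: entry for {tree} (share {share:.2f})")
```

The two thresholds correspond to the idea of limiting both the number of proposed patches and their minimum relevance; suitable values would have to be found empirically.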
 +
 +To evaluate the adequacy of this method, we can obtain feedback from the responsible kernel maintainers by proposing patches to the MAINTAINERS file for the additions that we identified as most relevant, i.e., those that increase the properties the most. We limit these proposals with a reasonable threshold on the number of patches (to not swamp maintainers initially) and a threshold on relevance (to not send out minor changes that are largely irrelevant to the community).
 +
 +In this project, we can make use of:
 +
 +  * gitdm at ''git://git.lwn.net/gitdm.git'': gitdm includes some scripts to parse MAINTAINERS and obtain the integration tree path of a commit.
 +
 +and/or
 +
 +  * [[https://github.com/lfd/PaStA|pasta]]: Similar to gitdm, pasta provides functionality to parse MAINTAINERS and to extract information on commits.
 +
 +Potential project phases:
 +
 +  - In the first phase (PoC phase), we could probably just create a setup that combines or extends the functionality in gitdm and/or in pasta.
 +  - In the second phase (MAINTAINERS patch creation phase), we send out some patches and collect feedback from maintainers.
 +  - In a third phase, with a better understanding of the individual pieces in gitdm and/or in pasta, we could then create a cleaner design that also refactors gitdm and pasta to share the same implementation wherever the various analyses use essentially the same basic functionality.
 +
 +Mentor contact: Lukas Bulwahn; lukas.bulwahn-at-gmail.com
 +
 +References:
 +
 +[1] https://lists.elisa.tech/g/devel/message/1269
 +
  
 ===== Bidirectionally sync Patchwork patch status with Gmail labels =====
gsoc/2021-gsoc-kernel-workflows.1611080380.txt.gz · Last modified: 2021/01/19 18:19 by till