Speaker Bio
- Speaker: Greg Kroah-Hartman
- Wikipedia Entry
- Linux Kernel Hacker: SuSE Labs, Novell Inc.
- Maintainer for PCI, USB, I²C, driver core and the sysfs kernel subsystems, along with contributing to the kobject, kref and debugfs code. He is also the maintainer of the linux-hotplug and udev projects. Additionally, he maintains the Gentoo Linux packages for these programs, and helps with the kernel package.
- Co-author of Linux Device Drivers, 3rd Edition; author of Linux Kernel in a Nutshell; contributing editor for Linux Journal
- Proposed the (free) Linux Device Driver development scheme
- Talk on the same topic
Thoughts on Linux
- Development is based on a meritocratic system
- Linux is used everywhere from cellphones to supercomputers, and from research institutes to enterprises, all without marketing
- Very little marketing. Even Red Hat doesn't have that much money.
- More Linux apps exist than Windows apps
- Other OSes have apps for single, internal uses
- General applications and software are mostly written on Linux
- Supports more devices and processors than other OSes
- Scales better
The Linux Kernel Branches
Kernel development is split across several main kernel "branches" (plus many subsystem-specific kernel branches):
- main 2.6.x kernel tree
- 2.6.x.y -stable kernel tree
- 2.6.x -git kernel patches
- 2.6.x -mm kernel patches
- subsystem specific kernel trees and patches
2.6.x kernel tree
- maintained by Linus Torvalds, and can be found on kernel.org in the pub/linux/kernel/v2.6/ directory.
- As soon as a new kernel is released, a two-week window opens. During this period maintainers can submit big diffs to Linus, usually patches that have already been included in the -mm kernel for a few weeks.
- The preferred way to submit big changes is using git, the kernel's source management tool.
- After two weeks a -rc1 kernel is released. From then on, only patches that do not include new features that could affect the stability of the whole kernel may be pushed.
- git can be used to send patches to Linus after -rc1 is released, but the patches need to also be sent to a public mailing list for review.
- A new -rc is released whenever Linus deems the current git tree to be in a reasonably sane state adequate for testing.
- The goal is to release a new -rc kernel every week.
- The process continues until the kernel is considered "ready"; it should last around 6 weeks.
- Andrew Morton on kernel releases: "Nobody knows when a kernel will be released, because it's released according to perceived bug status, not according to a preconceived timeline."
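The cadence described above (a two-week merge window, then roughly one -rc per week) can be sketched as a small date calculation. This is only an idealized illustration: the function name and the starting date are hypothetical, and, as the Andrew Morton quote says, real releases follow perceived bug status rather than a calendar.

```python
from datetime import date, timedelta

MERGE_WINDOW_DAYS = 14  # two-week merge window after a release
RC_INTERVAL_DAYS = 7    # stated goal: one new -rc per week


def sketch_rc_schedule(previous_release, rc_count):
    """Return [(name, date)] for -rc1..-rcN under the idealized cadence."""
    rc1 = previous_release + timedelta(days=MERGE_WINDOW_DAYS)
    return [
        (f"-rc{i + 1}", rc1 + timedelta(days=RC_INTERVAL_DAYS * i))
        for i in range(rc_count)
    ]


if __name__ == "__main__":
    # The release date below is made up purely for illustration.
    for name, day in sketch_rc_schedule(date(2007, 1, 1), 4):
        print(name, day.isoformat())
```

With the assumed start date, -rc1 lands two weeks later and each following -rc one week after the previous one.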
2.6.x.y -stable kernel tree
- Kernels with 4 digit versions are -stable kernels.
- They contain relatively small and critical fixes for security problems or significant regressions discovered in a given 2.6.x kernel.
- Recommended branch for users who want the most recent stable kernel and are not interested in helping test development/experimental versions.
- If no 2.6.x.y kernel is available, then the highest numbered 2.6.x kernel is the current stable kernel.
- "stable" team : stable@kernel.org
- Documentation/stable_kernel_rules.txt in the kernel tree documents how the release process works.
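The rule for finding the current stable kernel (take the newest 2.6.x, and prefer its highest 2.6.x.y if one exists) can be sketched as follows. The helper names are hypothetical and the version list is only sample input; suffixes such as -rc, -git, or -mm are simply ignored here.

```python
import re

# Matches plain releases only: 2.6.x or 2.6.x.y (no -rc, -git, or -mm suffix).
VERSION_RE = re.compile(r"^2\.6\.(\d+)(?:\.(\d+))?$")


def parse_release(text):
    """Return (x, y) for '2.6.x[.y]' (y is None without a 4th digit), else None."""
    match = VERSION_RE.match(text)
    if match is None:
        return None
    x, y = match.group(1), match.group(2)
    return int(x), (int(y) if y is not None else None)


def current_stable(available):
    """Pick the current stable kernel from a list of version strings."""
    candidates = []
    for text in available:
        parsed = parse_release(text)
        if parsed is not None:
            x, y = parsed
            # A missing .y sorts below any real .y of the same 2.6.x.
            candidates.append(((x, -1 if y is None else y), text))
    if not candidates:
        raise ValueError("no plain 2.6.x or 2.6.x.y release in the list")
    return max(candidates)[1]


if __name__ == "__main__":
    print(current_stable(["2.6.18", "2.6.18.6", "2.6.19", "2.6.19-rc6"]))
    print(current_stable(["2.6.18.6", "2.6.19", "2.6.19.1", "2.6.19.2"]))
```

In the first sample list no 2.6.19.y exists yet, so 2.6.19 itself is selected; in the second, 2.6.19.2 wins.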
2.6.x -git patches
- Daily snapshots of Linus' kernel tree which are managed in a git repository (hence the name.)
- These patches are usually released daily and represent the current state of Linus' tree.
- They are more experimental than -rc kernels.
- Generated automatically without even a cursory glance to see if they are sane.
2.6.x -mm kernel patches
- Experimental kernel patches released by Andrew Morton.
- Andrew takes all of the different subsystem kernel trees and patches and mushes them together, along with a lot of patches that have been plucked from the linux-kernel mailing list.
- Serves as a proving ground for new features and patches.
- Once a patch has proved its worth in -mm for a while Andrew or the subsystem maintainer pushes it on to Linus for inclusion in mainline.
- All new patches are encouraged to get tested in the -mm tree before they are sent to Linus for inclusion in the main kernel tree.
- Not appropriate for use on systems that are supposed to be stable
- Also contain any changes in the mainline -git kernels available at the time of release.
- The -mm kernels are not released on a fixed schedule, but usually a few -mm kernels are released in between each -rc kernel (1 to 3 is common).
Linux Kernel Development Process
Structure and Hierarchy
- ~350 Maintainers, 1 Benevolent dictator
- Maintainers listed in MAINTAINERS file in kernel sources
- Individual developers (thousands) submit changes as a patch, in an official format
The Process-Pre 2.4
- 2 versions of the kernel: the Development version maintained by Linus Torvalds, and the Stable version maintained by an appointed kernel maintainer (for 2.6, Andrew Morton)
- Tree of command: developers send patches to maintainers, who in turn send them to subsystem maintainers. These then send the patches to Linus for inclusion in the development tree for release.
- Kernel naming distinguishes between stable and development versions.
2.x with even 'x' is a Stable kernel, whereas an odd 'x' is a Dev kernel
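The even/odd naming rule is simple enough to express directly. A minimal sketch, assuming version strings of the form "2.x" or "2.x.y"; the function name is hypothetical and the rule only applies to the old numbering scheme described above.

```python
def classify_old_style(version):
    """Classify an old-style kernel version by the parity of its second number."""
    parts = version.split(".")
    if len(parts) < 2:
        raise ValueError(f"expected at least 'major.minor', got {version!r}")
    minor = int(parts[1])
    return "stable" if minor % 2 == 0 else "development"


if __name__ == "__main__":
    for v in ("2.4.20", "2.5.75"):
        print(v, classify_old_style(v))
```

This prints "stable" for the 2.4 kernel and "development" for the 2.5 kernel.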
The Process-Post 2.4
- All patches submitted directly to Andrew Morton who accepts everything (after testing of course)
- Effectively role reversal between Linus and Andrew. Andrew now maintains the cutting-edge Development tree (-mm tree), and Linus maintains the Stable tree.
- New release every 2.5 months!
- Heavy churn due to the new process: Some statistics...
- 2.9 patches per hour for the past 2.5 years (into Linus' tree)
- 85 lines added to the kernel every hour!
- 10% growth in code annually
- Violates traditional Software engineering norms (Do not touch Stable code)
- The version number no longer indicates stability. Greg and Chris Wright maintain a stable version of the 2.6 kernel, with a distinct numbering scheme (2.6.19.x) (LKML post).
Testing
- Not enough testing
- Linux Test Project
- not enough automated test scripts
- lmbench
- What's there is a good start
- Distros beat on some versions to find bugs.
Bugs and Patches
- most bugs and problems are in drivers
- About one third are due to buggy hardware.
- How do the patches go from person to person?
- All development is done through email. Developers send patches through email to other developers by sending them to different mailing lists. There is one main mailing list for all kernel development, linux-kernel. This list gets about 200-300 emails a day, and almost all aspects of the kernel are discussed on it.
Bug Reporting
- kernel bugzilla is where users are encouraged to report all bugs that they find
- bugzilla.kernel.org
- bugs talked over using mailing lists.
- The trail of blame helps identify whom to target.
- Every person who touches the patch along this chain of submission adds a "Signed-off-by:" line to the patch
- Shows exactly where the change came from, and who approved it.
- This is the "trail of blame", meaning that if someone has a problem with the change, we know exactly who to blame for the issue.
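As a rough illustration of how the trail can be read back, the sketch below pulls the "Signed-off-by:" lines out of a patch description in the order they appear. The function name, the sample message, and the people in it are all made up for the example.

```python
import re

# One "Signed-off-by: Name <email>" entry per line.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by:[ \t]*(.+?)[ \t]*<([^<>]+)>[ \t]*$", re.MULTILINE
)


def trail_of_blame(message):
    """Return [(name, email)] in the order the sign-offs appear in the message."""
    return SIGNOFF_RE.findall(message)


if __name__ == "__main__":
    # Hypothetical patch description, for illustration only.
    sample = (
        "usb: fix a hypothetical null dereference\n"
        "\n"
        "Signed-off-by: Jane Developer <jane@example.org>\n"
        "Signed-off-by: Sam Maintainer <sam@example.org>\n"
    )
    for name, email in trail_of_blame(sample):
        print(f"{name} <{email}>")
```

The first entry is the original author, and each later entry is someone who passed the patch along.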
Useful Tools
- Quilt
- Used to manage patches
- take any tar file: add, remove, reorder patch(es)
- Git (Wikipedia entry)
- distributed source code control (BitKeeper was used previously).
- everybody has their own version.
- check things in locally.
- handles merging. (even 12 way merging!)
- doesn't work too well on Windows
- easy to fork.
- Mercurial
- Source control tool.
- Works better on Windows.
- Mozilla is switching to this.
- Sparse
- front-end parser extension to the C compiler
- finds endianness issues
- runtime lock checking
Kernel vs User Space
- differentiated by system call interface
- about 260 system calls into the kernel
- always strive to maintain this interface: all apps from Linux 1.0 still run on 2.6
- Other places where things can break:
- /proc : top, ps,
- /sys : udev, HAL,
- special file systems
- ALSA (sound drivers)
- video interface
- A new directory was added to the kernel tree:
- Documentation/ABI
- maintains a list of apps that use the internal interfaces
Security fixes
- Common code is shared among drivers as much as possible. This makes security fixes easier.
- Backporting
- Common practice among software vendors and organisations such as Red Hat, SuSE or Debian; it is essential to ensuring that they can deploy automated updates on systems.
- Distribution developers do not follow all upstream changes once a package has become part of a released distribution
- They stick with the upstream version that they initially released and create patches based on upstream changes to fix bugs.
Getting involved
Becoming A Kernel Developer
- Newbie looking for beginner information?
- Look at the Linux KernelNewbies project. It consists of:
- A helpful mailing list where you can ask almost any type of basic kernel development question
- An IRC channel that you can use to ask questions in real-time,
- Helpful documentation that is useful for learning about Linux kernel development.
- Basic information about code organization, subsystems, and current projects
- How to compile a kernel and apply a patch etc.
- Looking for a task to start on?
- The Linux Kernel Janitors project is a great place to start.
- Describes a list of relatively simple problems that need to be cleaned up and fixed within the Linux kernel source tree.
- You'll get to work with the developers in charge of this project
- You will learn the basics of getting your patch into the Linux kernel tree,
- Possibly be pointed in the direction of what to go work on next, if you do not already have an idea.
Already have code ready?
Reference Books
Documentation
- Code is the documentation
- Tricky bits are commented well
- Linux Cross-Reference project: which is able to present source code in a self-referential, indexed webpage format.
- An excellent up-to-date repository of the kernel code may be found at: http://sosdg.org/~coywolf/lxr/
- Marketing problem: Hard to find where stuff exists
- e.g. a DTrace equivalent: it's actually there (SystemTap)!
Legal aspects
Licensing terms
Linux is open source, released under the GNU General Public License (GPL). This license allows distribution and sale of modified and unmodified versions of Linux but requires that all those copies be released under the same license and be accompanied by the complete corresponding source code. Currently, Linux is licensed under version 2 of the GPL. Under this license, the onus of ensuring the legality of an individual submission, by a developer to a maintainer, is shifted onto the developer.
Trademark
Linux is a registered trademark of Linus Torvalds in the United States and some other countries.
Merits and Demerits of the Process
- Merits
- (Almost) Infinite Development resources
- Linus' Law
- Code moves with the person, unlike other companies.
- Peer review makes the code strong
- Generally good code.
- Standardized coding style
- Active maintenance of drivers is not necessarily done by the people or company that first wrote the code
- Demerits
- It takes forever to remove or rip code out.
- It took 6 years to remove devfs
- large changes are hard.
- each step has to be shown, like math class
- lot of wasted dev time to push a patch through
Interesting Facts
- Estimated redevelopment cost of kernel version 2.6.8 is about $1,140M (ref)
- 80% of patches were by 20 people (2.6 had 800-odd programmers).
- 1.8 patches per hour for over two years.
- 2.9 patches an hour in the recent past (into Linus' tree, with 85 lines of code added every hour).
- Code size is growing 10% per year, with 890 devs this time.
- 50% of kernel code is in drivers, 25% is arch-specific, and the core is 5%.
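To get a feel for these rates, the hourly figures can be turned into totals with a little arithmetic. This is a back-of-the-envelope sketch only: it assumes the rates hold around the clock, ignores leap days, and the last figure additionally assumes that the 85 lines per hour are net growth and correspond to the quoted 10% annual growth, which the talk did not state.

```python
HOURS_PER_YEAR = 24 * 365  # ignores leap days


def total_over(rate_per_hour, years):
    """Total count produced by a constant hourly rate over a number of years."""
    return rate_per_hour * HOURS_PER_YEAR * years


# 2.9 patches per hour sustained for 2.5 years.
patches_in_2_5_years = round(total_over(2.9, 2.5))

# 85 lines added per hour, over one year.
lines_per_year = total_over(85, 1)

# Assumption: if those lines are net growth and equal 10% of the code base,
# the implied code base size follows directly.
implied_codebase_lines = round(lines_per_year / 0.10)

if __name__ == "__main__":
    print("patches in 2.5 years:", patches_in_2_5_years)
    print("lines added per year:", lines_per_year)
    print("implied code base size:", implied_codebase_lines)
```

Under these assumptions the rates work out to roughly 63,500 patches over 2.5 years and about 745,000 lines added per year.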
Writeup authors
- Taral Joglekar, Ajay Mani Martin, Vaibhav Nivargi {taralj, ajaym, vnivargi} at cs.stanford.edu