On the scalability of Linus

By Jonathan Corbet
July 2, 2010
The Linux kernel development process stands out in a number of ways; one of those is the fact that there is exactly one person who can commit code to the "official" repository. There are many maintainers looking after various subsystems, but every patch they merge must eventually be accepted by Linus Torvalds if it is to get into the mainline. Linus's unique role affects the process in a number of ways; for example, as this article is being written, Linus has just returned from a vacation which resulted in nothing going into the mainline for a couple of weeks. There are more serious concerns associated with the single-committer model, though, with scalability being near the top of the list.

Some LWN readers certainly remember the 2.1.123 release in September, 1998. That kernel failed to compile due to the incomplete merging (by Linus) of some frame buffer patches. A compilation failure in a development kernel does not seem like a major crisis, but this mistake served as a flash point for anger which had been growing in the development community for a while: patches were increasingly being dropped and people were getting tired of it. At the end of a long and unpleasant discussion, Linus threw up his hands and walked away:

Quite frankly, this particular discussion (and others before it) has just made me irritable, and is ADDING pressure. Instead, I'd suggest that if you have a complaint about how I handle patches, you think about what I end up having to deal with for five minutes.

Go away, people. Or at least don't Cc me any more. I'm not interested, I'm taking a vacation, and I don't want to hear about it any more. In short, get the hell out of my mailbox.

This was, of course, the famous "Linus burnout" episode of 1998. Everything stopped for a while until Linus rested a bit, came back, and started merging patches again. Things got kind of rough again in 2000, leading to Eric Raymond's somewhat sanctimonious "curse of the gifted" lecture. In 2002, as the 2.5 series was getting going, frustration with dropped patches was, again, on the rise; Rob Landley's "patch penguin" proposal became the basis for yet another extended flame war on the dysfunctional nature of the kernel development process and the "Linus does not scale" problem.

Shortly thereafter, things got a whole lot smoother. There is no doubt as to what changed: the adoption of BitKeeper - for all the flame wars that it inspired - made the kernel development process work. The change to Git improved things even more; it turns out that, given the right tools, Linus scales very well indeed. In 2010, he handles a volume of patches which would have been inconceivable back then and the process as a whole is humming along smoothly.

Your editor, however, is concerned that there may be some clouds on the horizon; might there be another Linus scalability crunch point coming? In the 2.6.34 cycle, Linus established a policy of unpredictable merge window lengths - though that policy has been more talk than fact so far. For 2.6.35, quite a few developers saw the merge window end with no response to their pull requests; Linus simply decided to ignore them. The blowup over the ARM architecture was part of this, but quite a few other trees remained unpulled as well. We have not gone back to the bad old days where patches would simply disappear into the void, and perhaps Linus is just experimenting a bit with the development process to try to encourage different behavior from some maintainers. Still, silently ignoring pull requests does bring back a few memories from that time.

Too many pulls?

A typical development cycle sees more than 10,000 changes merged into the mainline. Linus does not touch most of those patches directly, though; instead, he pulls them from trees managed by subsystem maintainers. How much attention is paid to specific pull requests is not entirely clear; he does look at each closely enough to ensure that it contains what the maintainer said would be there. Some pulls are obviously subjected to closer scrutiny, while others get by with a quick glance. Still, it's clear that every pull request and every patch will require a certain amount of attention and thought before being merged.

The following table summarizes mainline merging activity by Linus over the last ten kernel releases (the 2.6.35 line is through 2.6.35-rc3):

           Pulls               Patches
Release    MergeWin   Total    Direct   Total
2.6.26     159        426      288      1496
2.6.27     153        436      339      1413
2.6.28     150        398      313       954
2.6.29     129        418      267       896
2.6.30     145        411      249       618
2.6.31     187        479      300       788
2.6.32     185        451      112       789
2.6.33     176        444      104       605
2.6.34     118        393       94       581
2.6.35     160        218       38       405

The two columns under "pulls" show the number of trees pulled during the merge window and during the development cycle as a whole. Note that it's possible that these numbers are too small, since "fast-forward" merges do not necessarily leave any traces in the git history. Linus does very few fast-forward merges, though, so the number of missed merges, if any, will be small.
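As an illustration of how pull counts of this kind can be derived from the git history, here is a minimal Python sketch that counts merge commits in the output of `git log --format='%H %P'`. It is not the script used for the table above; the function names `count_merges` and `merges_in_range` and the sample hashes are hypothetical. A commit with two or more parents is a merge, while a fast-forward creates no commit at all, which is why such pulls are invisible to a count like this one.

```python
import subprocess


def count_merges(log_lines):
    """Count merge commits in `git log --format='%H %P'` output.

    Each line holds a commit hash followed by its parent hashes, so a
    line with three or more fields describes a commit with at least
    two parents, i.e. a merge.
    """
    return sum(1 for line in log_lines if len(line.split()) >= 3)


def merges_in_range(repo, rev_range):
    """Run git in `repo` and count merges along the first-parent chain.

    `--first-parent` follows only the first parent of each merge, so
    merges made inside the pulled trees themselves are skipped.
    """
    out = subprocess.run(
        ["git", "-C", repo, "log", "--first-parent",
         "--format=%H %P", rev_range],
        check=True, capture_output=True, text=True,
    ).stdout
    return count_merges(out.splitlines())


# Hypothetical history, invented for illustration only.
sample = [
    "c3 c2 b1",   # two parents: a merge
    "c2 c1",      # one parent: an ordinary commit
    "c1",         # no parent: the root commit
]
print(count_merges(sample))
```

Against a real clone, something like `merges_in_range("/path/to/linux", "v2.6.34..v2.6.35-rc3")` would give a comparable figure; the path is a placeholder, and the result need not match the table exactly.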

Linus still directly commits some patches into his repository. The bulk of those come from Andrew Morton, who does not use git to push patches to Linus. In the table above, the "total" column includes changes that went by way of Andrew, while the "direct" column only counts patches that Andrew did not handle.
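The distinction between the two patch columns can be sketched in Python as follows. This is an assumption about how such a split might be made, not the method behind the table: the `Commit` class, the `split_direct` function, and the sample commits are all hypothetical, and treating a Signed-off-by line from Andrew as evidence that a patch passed through him is a heuristic chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    """Minimal, hypothetical view of one patch Linus committed himself."""
    subject: str
    signoffs: list = field(default_factory=list)  # names from Signed-off-by lines


def split_direct(commits, relay="Andrew Morton"):
    """Return (direct, total) for patches committed without a pull.

    `total` counts every commit given; `direct` counts only those with
    no Signed-off-by from `relay`. Using the sign-off as the test is an
    assumption of this sketch.
    """
    total = len(commits)
    direct = sum(1 for c in commits if relay not in c.signoffs)
    return direct, total


# Hypothetical commits, invented purely for illustration.
commits = [
    Commit("Linux 2.6.x-rcN", []),
    Commit("Revert \"some change\"", ["Linus Torvalds"]),
    Commit("mm: some fix", ["A. Developer", "Andrew Morton", "Linus Torvalds"]),
]
print(split_direct(commits))
```

With these three invented commits, two count as "direct" and all three count toward "total".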

Some trends are obvious from this table: the number of patches going directly into the mainline has dropped significantly; almost everything goes through somebody else's tree first. What's left for Linus, at this point, is mostly release tagging, urgent fixes, and reverts. Andrew Morton remains the maintainer of last resort for much of the kernel, but, increasingly, changes are not going through his tree. Meanwhile, the number of pulls is staying roughly the same. It is interesting to think about why that might be.
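The two trends can be checked with a little arithmetic on the table's own numbers. The Python sketch below copies the figures for the nine completed releases (2.6.35 is left out because it only runs through -rc3); the variable names are arbitrary and nothing here goes beyond the published table.

```python
# (release, merge-window pulls, total pulls, direct patches, total patches),
# copied from the table above. 2.6.35 is omitted because its figures
# only run through -rc3.
TABLE = [
    ("2.6.26", 159, 426, 288, 1496),
    ("2.6.27", 153, 436, 339, 1413),
    ("2.6.28", 150, 398, 313, 954),
    ("2.6.29", 129, 418, 267, 896),
    ("2.6.30", 145, 411, 249, 618),
    ("2.6.31", 187, 479, 300, 788),
    ("2.6.32", 185, 451, 112, 789),
    ("2.6.33", 176, 444, 104, 605),
    ("2.6.34", 118, 393, 94, 581),
]

pulls = [row[2] for row in TABLE]    # total pulls per release
direct = [row[3] for row in TABLE]   # patches not handled by Andrew

# Pulls stay within a fairly narrow band across the nine releases.
print("total pulls: min", min(pulls), "max", max(pulls))

# Direct patches fall sharply between the first and last release.
print("direct patches:", direct[0], "->", direct[-1])
print("direct drop: {:.0%}".format(1 - direct[-1] / direct[0]))
```

Total pulls range from 393 to 479 over these releases, while direct patches fall from 288 to 94, a drop of roughly two thirds.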

Perhaps there is no need for more pulls despite the increase in the number of subsystem trees over time. Or perhaps we're approaching the natural limit of how many subsystem pull requests one talented benevolent dictator can pay attention to without burning out. After all, it stands to reason that the number of pull requests handled by Linus cannot increase without bound; if the kernel community continues to grow, there must eventually be a scalability bottleneck there. The only real question is where it might be.

[Figure: 2.6.35 merge paths]

If there is a legitimate concern here, then it might be worth contemplating a response before things break down again. One obvious approach would be to change the fact that almost all trees are pulled directly into the mainline; the plot of 2.6.35 merge paths shows just how flat the structure is. Subsystem maintainers who have earned sufficient trust could possibly handle more lower-level pull requests and present a result to Linus that he can merge with relatively little worry. The networking subsystem already works this way; a number of trees feed into David Miller's networking tree before being sent upward. Meanwhile, other pressures have led to the opposite thing happening with the ARM architecture: there are now several subarchitecture trees which go straight to Linus. The number of ARM pulls seems to have been a clear motivation for Linus to shut things down during the 2.6.35 merge window.
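The effect of adding an intermediate level can be shown with a toy model. The Python sketch below is purely illustrative: the tree names, the `parent_of` mapping, and the `direct_pulls` function are hypothetical and do not describe the real set of kernel trees. It only shows that routing several trees through one collector reduces the number of pulls that land at the top.

```python
def direct_pulls(parent_of, root="mainline"):
    """Return the trees that are pulled straight into `root`."""
    return sorted(tree for tree, parent in parent_of.items() if parent == root)


# Hypothetical layout: three ARM subarchitecture trees go straight to
# the top, while a wireless tree already feeds a networking tree.
flat = {
    "arm-soc-a": "mainline",
    "arm-soc-b": "mainline",
    "arm-soc-c": "mainline",
    "net": "mainline",
    "wireless": "net",
}

# Same trees, with a hypothetical combined "arm" tree in between.
coalesced = {
    **flat,
    "arm-soc-a": "arm",
    "arm-soc-b": "arm",
    "arm-soc-c": "arm",
    "arm": "mainline",
}

print(len(direct_pulls(flat)), len(direct_pulls(coalesced)))
```

In this invented example the top-level maintainer goes from four trees to pull down to two. The work does not disappear; it moves to whoever maintains the intermediate tree.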

Another solution, of course, would be to empower others to push trees directly into the mainline. It's not clear that anybody is ready for such a radical change in the kernel development process, though. Ted Ts'o's 1998 warning to anybody wanting a "core team" model still bears reading nearly twelve years later.

But if Linus is to retain his central position in Linux kernel development, the community as a whole needs to ensure that the process scales and does not overwhelm him. Doing more merges below him seems like an approach that could have potential, but the word your editor has heard is that Linus is resistant to too much coalescing of trees; he wants to look stuff over on its way into the mainline. Still, there must be places where this would work. Maybe we need an overall ARM architecture tree again, and perhaps there could be a place for a tree which would collect most driver patches.

The Linux kernel and its development process have a much higher profile than they did even back in 2002. If the process were to choke again due to scalability problems at the top, the resulting mess would be played out in a very public way. While there is no danger of immediate trouble, we should not let the smoothness of the process over the last several years fool us into thinking that it cannot happen again. As with the code itself, it makes sense to think about the next level of scalability issues in the development process before they strike.

On the scalability of Linus

Posted Jul 2, 2010 22:50 UTC (Fri) by error27 (subscriber, #8346)

Maybe the limit is the number of maintainers? One person can only deal with so many people so that's the natural limit for how many maintainers you want. Each maintainer sends one merge request during the merge window and two or three after the merge window. Except for "tip" which gets merged 50 times.

It would probably help ARM to have a single maintainer. Aren't we supposed to be unifying ARM so that a single kernel can boot on everything? One solution to the config file problem would be to add an option to git to clean up the logs. "git log -p --no-merges --no-arm-config-spam v2.6.34.." The config spam far outweighs the merge spam so this would be useful. ;)

On the scalability of Linus

Posted Jul 3, 2010 2:07 UTC (Sat) by robert_s (subscriber, #42402)

"It would probably help ARM to have a single maintainer."

Perhaps this new "Linaro" group could find it within their remit to employ such a person.

On the scalability of Linus

Posted Jul 12, 2010 8:50 UTC (Mon) by broonie (subscriber, #7078)

ARM does have a single maintainer, but he has a Linus-style problem due to the very large number of ARM CPUs out there with wildly diverse hardware surrounding them (unlike x86, only the CPU core is standardised).

On the scalability of Linus

Posted Jul 2, 2010 23:08 UTC (Fri) by ikm (subscriber, #493)

Thanks for the article, Jon. The history recap was a very exciting read, as usual.

On the scalability of Linus

Posted Jul 2, 2010 23:19 UTC (Fri) by gerdesj (subscriber, #5446)

This "problem" is one that afflicts any organization - distribution of directorial workload.

In the case of my (UKoGB) company we have three equal directors - the Triumvirate model. It works very well for us. Two directors outvote the other. However, it depends on our particular circumstances: in this case all three of us are founders and share a common set of goals (I think!). It's worked pretty well for 10 years so far.

In the case of Linux, Linus is the founder and quite rightly has defaulted to being the benevolent dictator.

Linux has, of course, changed somewhat beyond a notification to comp.os.minix. The Linux kernel is an enormous collaborative effort with a huge number of stakeholders, contributors and users. Yet Mr T is still enthusiastically running the show with a pretty shrewd eye.

If you want the status quo to change then convince Linus to promote trusted people to equal partnership to distribute the load. It's his baby - he gets to choose who they are.

Either that or fork your own - good luck!

On the scalability of Linus

Posted Jul 3, 2010 7:10 UTC (Sat) by jengelh (subscriber, #33263)

I suspect that the next breakdown may be due to the number of bytes (i.e. patch lines) to be reviewed; that is something that no agglomeration (single patch files in 2.x -> pulls of rows of commits in git), nor any fanning out of the direct pulls (i.e. turning that 100-something-ary tree in the graphic into, say, a 4-ary tree) would solve.

On the scalability of Linus

Posted Jul 3, 2010 15:58 UTC (Sat) by PO8 (guest, #41661)

So far, Linus has enjoyed remarkably good health. I pray that he will continue to do so. However, like all of us he continues to get older, and in any case people have troubles sometimes.

Imagine what an indefinite period of serious in-hospital Linus illness (starting…now) would do to the project. In a way it would be worse than if he (God forbid) was hit by a bus—in that case presumably someone would step in and try to take over his role, for better or worse. As long as Linus' future status was unclear, I can imagine much confusion about how to handle the situation.

In short, Linus' scalability when operating properly is the least of my worries.

On the scalability of Linus

Posted Jul 3, 2010 17:15 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)

Probably not that much. Andrew Morton seems to be a de facto vice president of Linux. So he'll probably just switch to maintaining Linus' tree.

On the scalability of Linus

Posted Jul 5, 2010 13:54 UTC (Mon) by Baylink (guest, #755)

Quote of the week, no, Jon? :-)

On the scalability of Linus

Posted Jul 14, 2010 16:31 UTC (Wed) by jospoortvliet (guest, #33164)

Heh, yeah. I'm not sure if this is what Andrew wants... Either way, it might be time for Linus to start thinking about this and maybe train a (few) possible successor(s). It can take years before one has the experience to do what he is doing, so he should start doing that sooner rather than later...

On the scalability of Linus

Posted Jul 15, 2010 0:02 UTC (Thu) by dlang (guest, #313)

There have been several 'obvious' successors to Linus over the years. Some of them are not involved with kernel development to a noticeable degree anymore, some are. It's very hard to tell how people will change in the future.

On the scalability of Linus

Posted Jul 15, 2010 7:42 UTC (Thu) by jospoortvliet (guest, #33164)

Sure, that's why I said he should train 'a few'. Anyway, there are always people coming and leaving, that's how FOSS works. And that's why you should think about succession... Not ignore that subject.

On the scalability of Linus

Posted Jul 3, 2010 18:12 UTC (Sat) by agrover (guest, #55381)

I don't see either hit-by-a-bus or slow-decline scenarios as worrisome. There have been times when other people's trees have been used widely. I think we'd see one of those get bumped up, in a pretty seamless way. git makes this easy.

Linus's tree has maintained its position as "mainline" because everyone thinks of it as such. If it stopped moving in the right direction for whatever reason then people would start using another tree as mainline in a way similar to flocking behavior in birds. There would be no "choice" as a group -- all of a sudden you'd see most other people treating tree X as mainline, and so you would too.

On the scalability of Linus

Posted Jul 3, 2010 19:34 UTC (Sat) by PO8 (guest, #41661)

You're more than likely right.

I guess my concern is about one of two possibilities; either that the usual flamage at the first big architectural decision would get resolved in some highly unsatisfactory way, or that there would never be a first big architectural decision for lack of sufficient consensus. I think most of the UNIXes I've seen have fallen to one of these two perils sooner or later. My belief is that Linus both through his solid control and through his engineering skill has been the biggest reason for Linux avoiding this so far.

It's probably just my academic SE background and lack of faith in the wonders of open source showing, but a contingency plan where "somebody will step up with something" makes me a teeny bit nervous.

On the scalability of Linus

Posted Jul 3, 2010 21:53 UTC (Sat) by lmb (subscriber, #39048)

The point is that nobody needs to step up. One can just use the tree one personally prefers - right now, the normative point is Linus, but it could just as well be someone else maintaining it.

If Linus went away, at least all the distributors would immediately start merging from the respective subsystem maintainers themselves - and over time, a new gold standard is likely to emerge, either from them or the community. Greg? Andrew?

Who knows, but the point is that Linux has a very distributed development model with no single choke point in it; the centralization is merely convenient for all, not required.

On the scalability of Linus

Posted Jul 4, 2010 4:49 UTC (Sun) by drag (guest, #31333)

Well... I know that I don't use Linus's tree on my distro. I don't use it when I am using Redhat and I don't use it when I am using Debian.

The various distributions already have taken over maintainership of their own respective kernels and have done so for a long time.

On the scalability of Linus

Posted Jul 4, 2010 20:38 UTC (Sun) by vonbrand (guest, #4458)

That isn't so. Red Hat, for one, used to ship extensively patched kernels; they don't do so anymore (the burden is just too high). OTOH, they do have capable people who could take over (together with the other kernel hackers, obviously) if the need should arise.

On the scalability of Linus

Posted Jul 5, 2010 10:05 UTC (Mon) by sjh (guest, #48103)

That is not entirely true. Red Hat does not ship patches that are not accepted upstream, but they do ship heavily patched kernels. I don't think Linus is still releasing 2.6.18, yet Red Hat Enterprise Linux 5 keeps growing new kernel based features (KVM being a notable example).

This has two good things. First, it increases the quality of the patches that Red Hat applies to the kernels it ships. Second, it means that Red Hat does not have to rebase all of those patches whenever they ship the next major release (RHEL6).

On the scalability of Linus

Posted Jul 8, 2010 13:31 UTC (Thu) by BenHutchings (subscriber, #37955)

"That is not entirely true. Red Hat does not ship patches that are not accepted upstream,"

They sometimes claim this, but it is not actually true. I have occasionally dug a bug fix out of RHEL 5 and sent it upstream.

"but they do ship heavily patched kernels. I don't think Linus is still releasing 2.6.18, yet Red Hat Enterprise Linux 5 keeps growing new kernel based features (KVM being a notable example)."

"This has two good things. First, it increases the quality of the patches that Red Hat applies to the kernels it ships."

Still, they are backporting so far that there is plenty of opportunity to miss subtle semantic dependencies. For example, RHEL 5.4 added GRO but not the change to make TCP delayed-ACK work correctly with LRO or GRO.

On the scalability of Linus

Posted Jul 6, 2010 12:34 UTC (Tue) by RobertBrockway (guest, #48927)

There was for a long time a perception of an informal line of succession in Linux: one person who would be generally accepted as the successor. I'm not sure that is true anymore.

I really think that Linus needs to specify a formal process as to how mainline will be managed should he be unable to perform the task.

On the scalability of Linus

Posted Jul 6, 2010 23:57 UTC (Tue) by neilbrown (subscriber, #359)

I think suggesting that Linus "needs to" do anything is missing the whole point of "freedom".

Linus doesn't need to do anything, and the community doesn't need to pay any attention to what he does. But as long as what he does is useful the community will pay attention (a statement which applies equally to lots of other people).

If Linus chose to appoint a successor that might be useful ... as long as they aren't on the same bus that takes Linus away from us... But it would be no guarantee of success.

Were Linus to disappear, other developers would be free to try to create their own 'central' tree, or not. Maybe several people would try. Ultimately one "winner" would emerge, largely because one central tree is more useful than two (and maintaining that central tree is probably a fairly thankless job).
You might see the time with 2 (or 3 or 4) 'central' trees as wasteful, but that is also quite normal in our community. A lot of code is generated and then discarded. But this isn't a waste, it is a learning process.

A bureaucracy appoints a successor, a meritocracy allows a successor to prove themselves.

On the scalability of Linus

Posted Jul 15, 2010 12:45 UTC (Thu) by subhash11 (guest, #68935)

I think he should publish his maintainership experience at length, rather than putting in place some process that will surely end, in the future, in conflicting interests and a power struggle.

On the scalability of Linus

Posted Jul 9, 2010 14:08 UTC (Fri) by kabloom (guest, #59417)

If there was a power struggle in the Linux community (assuming it didn't escalate to lawsuits), then we could naturally expect the set of system calls supported by different "official" trees to diverge over time as different tree maintainers disagreed about the design of new system calls.

This would probably place the Glibc maintainer in the position of blessing a particular tree as Linus' successor.

On complexity

Posted Jul 3, 2010 23:43 UTC (Sat) by ndye (guest, #9947)

I'm happy to see Our Grumpy Editor raising this now.  Allow me to summarize a relevant book:

  • The New Plague:  Organizations and Complexity
  • William L Livingston
  • ISBN 0937063037
  • published 1985

An engineer analyzes how engineering projects too often fail simply because the project is more complex than the team, their management, and the stakeholders can understand.  He concludes that project failure is a human problem.

The project team, before considering technical issues, must ensure that they themselves, their management, and the stakeholders, all share the following traits, in merely alphabetical order:

  • open, effective communication
  • curiosity
  • integrity
  • transparency
  • and more I've left off

Without these, they have no chance.

He recommends always "taking sides" with the problem, because "solutions" change faster than the problem (and our understanding of both).

On the scalability of Linus

Posted Jul 5, 2010 2:19 UTC (Mon) by ras (subscriber, #33059)

The article provided some nice stats to show the number of pulls has been dropping over time. It would be nice to see similar stats to back up this statement:

"For 2.6.35, quite a few developers saw the merge window end with no response to their pull requests; Linus simply decided to ignore them."

Without such stats it is difficult to be sure if the number of pulls has dropped because Linus was ignoring requests, or because there just isn't anything to pull.

Is there a way of measuring it?

On the scalability of Linus

Posted Jul 8, 2010 14:36 UTC (Thu) by Spudd86 (guest, #51683)

Go look at LKML

On the scalability of Linus

Posted Jul 13, 2010 8:47 UTC (Tue) by error27 (subscriber, #8346)

You could measure the number of patches in linux-next at the start of the merge window which don't get pulled by the end of the merge window. But it's not an easy thing to measure.

On the scalability of Linus

Posted Jul 18, 2010 1:53 UTC (Sun) by fredi@lwn (subscriber, #65912)

Probably because it's summer at last here in EU :)

On the scalability of Linus

Posted Jul 5, 2010 20:11 UTC (Mon) by rilder (guest, #59804)

The Linux project definitely needs a failover guy (and it has many), primarily because it puts too much pressure on one person to shoulder it all the way. Of course, Linus has other things to take care of besides Linux. :)

On the scalability of Linus

Posted Jul 6, 2010 15:48 UTC (Tue) by augustz (guest, #37348)

If the past is a reasonably good predictor of the future, then Linus's performance has been good. I'd suggest there are more wins from this consistency and institutional knowledge than losses. If it ain't broke, don't fix it.

If something happens to Linus, also not the end of the world as others point out. The distributors etc will all start pulling from a different tree. Witness the move to a new X tree when the old stagnated. Pretty seamless. Might not be as good, but long term would hopefully stabilize.

If the scalability bottleneck is the ARM junk, there is a real possibility that the ARM churn Linus complains about IS junk and not a scalability issue.

There is a HUGE amount of work going on below Linus's tree, as the article points out. This is a natural area for scalability, helped by good tooling (basically Git, written, interestingly, by Linus). I do think eventually some deeper structure will happen. At some point Linus will simply not care about some random x. Stick it below a maintainer who does.

Having the light touch dictator keeps the kernel slightly sane, reducing barriers to entry for everyone. For software that'll probably be running in the basements of our future starfleets this can only be a good thing.

- August

On the scalability of Linus

Posted Jul 6, 2010 20:33 UTC (Tue) by dlang (guest, #313)

The ARM problem is that there is no one company that produces ARM chips. ARM is a core that is licensed to many different companies, and each company adds additional capabilities beyond the core to the chips that they produce.

As a result the number of different chips, each with slightly different capabilities, is staggering.

Up until now, each one of these chips has been treated as a different subarchitecture, each with its own defconfig (which not only has the definitions needed to support that chip/board, but also any other defaults that the maintainer happened to select).

Think of the mess that we would have if every chip released by Intel or AMD required a different architecture and you have a glimpse into the mess that is ARM.

There is work ongoing to change this: instead of treating every chip as a different architecture, a separate definition file would detail what peripherals and options are on each ARM chip/board and how they are hooked up, rather than having that information be implicit in the architecture definition.

I think that once this is done there will be even more proliferation of ARM designs as they will be easier to support, but it will be a win-win situation as manufacturers will be able to more easily get the exact chip to fit their application and it will still be easier to support.

On the scalability of Linus

Posted Jul 7, 2010 22:33 UTC (Wed) by eduard.munteanu (guest, #66641)

I think "benevolent dictator" is an unfortunate choice of words. Neither can the development process be characterized as a democracy where there's an appointed leader. The reason I'm saying this is because taking this stuff literally might create confusion.

And by confusion I'm referring to suggestions such as "let's have more than one guy maintain mainline". This is not going to work. You need responsibility and you need to be able to point fingers when something goes wrong. It's simple: you have one guy managing his own stuff, and you trust him enough to use that for yourself.

I'm not saying a team can't manage a common repository on equal rights; this is certainly possible, don't get me wrong. But they'd better arrive at this conclusion themselves rather than being pushed into this on grounds like "bus factor". Even more so if the "dictator" is under pressure, considered burnt out or unwilling to compromise (which can be a good thing). Committees for their own sake aren't going to do any good.


Copyright © 2010, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds