In association with heise online

09 August 2010, 11:37

The saga of Git: Lightning does strike twice

by Glyn Moody

Every now and then, a shiver runs through the Linux community as people realise afresh that the entire edifice has a single point of failure: Linus Torvalds. These episodes usually manifest themselves as concerns about the scalability of said individual – whether he can continue to oversee and manage the amazing distributed development model as it grows ever bigger and more ambitious. To counter those fears it is probably worth looking at what happened as a result of the first – and by far the most frightening – “Linus does not scale” episode, not least because it led to multiple positive outcomes.

Things began innocently enough, when, on 28 September 1998, somebody asked on the Linux kernel mailing list:

Am I the only one for whom 2.1.123 fbcon.c doesn't compile?

It soon became clear what the problem was, as Ted Ts'o, one of the most senior kernel hackers, pointed out gently to Linus:

I had to resubmit the patch and the explanation at least *THREE* times, over the course of 2-3 weeks, before I finally got a response out of you. And this was for a utterly uncontroversial patch!

And this was not the first time I've had to resend patches 2 or 3 times, either.

Basically, Linus was having trouble keeping up with the flow of patches.

Unfortunately, people's tempers soon started flaring, which only exacerbated the situation, until finally Linus snapped:

Quite frankly, this particular discussion (and others before it) has just made me irritable, and is ADDING pressure. Instead, I'd suggest that if you have a complaint about how I handle patches, you think about what I end up having to deal with for five minutes.

Go away, people. Or at least don't Cc me any more. I'm not interested, I'm taking a vacation, and I don't want to hear about it any more. In short, get the hell out of my mailbox.

Eric Raymond produced a good analysis of what the key issues were, but perhaps the most succinct statement came from Larry McVoy:

The problem is that Linus doesn't scale. We can't expect to see the rate of change to the kernel, which gets more complex and larger daily, continue to increase and expect Linus to keep up. But we also don't want to have Linus lose control and final say over the kernel, he's demonstrated over and over that he is good at that.

McVoy told me a couple of years after this incident how he helped to reconcile those conflicting demands:

So I had this vision that, gee, source management would solve all the world's problems, let's get these guys together. And so it took a lot of arm twisting, but all of them ended up coming for dinner one day – they were spread out, but all of us were in the Bay area. [Dave] Miller was extremely influential in making this meeting happen, because it's hard to get Linus to take time out to really focus on something unless he considers it really important. He was frustrated, and a little burned out, and Miller basically said come to this meeting or we're splitting the tree.

The threat of a fork concentrated minds wonderfully, and people tried to come up with ways of easing Linus' workload. As Miller told me:

Initially, the discussions were procedural, about what things we were doing as developers that made more work for Linus, and how we could alleviate some of that.

Some things could be improved immediately through tweaks to the submission process, but what was needed was a more thoroughgoing solution that addressed the underlying issues. As it happened, McVoy was working on something that fitted the bill perfectly: BitKeeper, a tool for managing code development on a large scale. There was just one problem: BitKeeper was not open source.

McVoy said he was willing to allow kernel hackers to use it free of charge, but for free software purists like Richard Stallman, that was no solution at all. So it's interesting that Linus decided that it *was* acceptable – a further example of his generally pragmatic approach. BitKeeper was duly adopted, and was soon helping Linus to manage the code. But then an apparently trivial incident forced him to abandon it – and produced a huge win for free software.

Since BitKeeper was closed source, the details of how it worked were not publicly available. Of course, getting around this was precisely the kind of challenge that hackers relish, and it didn't take long for Andrew Tridgell to work out the details of BitKeeper's operation. He did this in exactly the same way that he had managed to mimic the working of Microsoft's networking in his Samba program: by analysing the traffic that flowed across the network when people used the software.

It was a neat trick, but McVoy was unimpressed that somebody had effectively undermined his proprietary product, and he withdrew the free licences to the kernel community. Since Linus could hardly ask every contributor to the kernel to pay to use BitKeeper, he did what any red-blooded hacker would do: he sat down and knocked up a rough bit of code to do the same job.

That's pretty remarkable in itself, and testimony to his skills, but more so is the fact that, just as with the Linux kernel, this functional but rudimentary hack proved good enough to inspire others to start working on the code. In a relatively short time, the new Git – with typical self-deprecating humour, Linus said he named it after himself – had turned into a powerful application, and began to be adopted by other free software projects.

The similarities with the take-off of Linux don't end there. For as the popularity of Git began to grow, so too did the demand for professional services to support it. A new industry providing Git hosting sprang up, and has flourished mightily. Just recently, GitHub announced that the one millionth Git repository had been created on its servers.

Remarkably for a company only founded in February 2008, GitHub is “funding free and very profitable”, thanks to a business model that successfully mixes free and paid-for services:

The profit comes from the paid plans that GitHub offers for those developers and companies who want to host their repositories privately. GitHub offers essentially unlimited hosting to anyone who is willing to make their code open source, but charges based on the number of private repositories and the number of contributors for other projects. This profitability has spurred the launch of a number of new features of late, such as Organizations, which offers more advanced work flow tools for projects with multiple contributors and varying permissions, and support for fifteen new languages.

What this means is that alongside the vast, multi-billion pound industry that Linus helped to create with Linux, there is now a vibrant, if somewhat smaller, one pullulating around Git. It's amazing to think that an incident that brought Linux to the brink of forking has resulted in not just a better kernel, and a fine bit of free software, but yet another powerful demonstration of how people can make money by giving stuff away. I can hardly wait to see what Linus does for his next ecosystem.

Follow me @glynmoody on Twitter or identi.ca. For other feature articles by Glyn Moody, please see the archive.

Permalink: http://h-online.com/-1051559