</blog hiatus>
Without getting too much into details, the most frustrating thing I have to deal with at work is our general development infrastructure, the worst offender being version control.
Due in part to the peculiarities of our build environment (and not strictly to the version control system we use), every one of our version control commands has a six-second startup penalty. Even “help”, which just prints out the list of commands, takes around 6.1 seconds. Commands that actually require server round-trips take longer, and these include things like “diff”, “status”, “edit”, “delete”, “add”, etc.
Now, if you look at the impact of a six-second penalty for each command amortized over, say, a regular week’s worth of work, it isn’t anything awful. Say you run 50 commands a day, 5 days a week. That’s only 5 minutes a day and 25 minutes for the whole week, which is certainly less time, I’d guess, than I spend getting up and going to the bathroom or getting a drink.
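For anyone who wants to poke at the numbers, here is the naive estimate as a quick Python sketch. The constant and function names are mine, purely for illustration; the values are the ones from the paragraph above.

```python
# Naive estimate: count only the raw startup delay and nothing else.
# Names are illustrative; the numbers come straight from the post.
STARTUP_PENALTY_S = 6       # seconds of startup penalty per command
COMMANDS_PER_DAY = 50       # "say you run 50 commands a day"
WORK_DAYS_PER_WEEK = 5


def naive_minutes_per_day(commands=COMMANDS_PER_DAY, penalty_s=STARTUP_PENALTY_S):
    """Minutes per day spent purely waiting on command startup."""
    return commands * penalty_s / 60


daily = naive_minutes_per_day()
weekly = daily * WORK_DAYS_PER_WEEK
print(f"{daily:.0f} min/day, {weekly:.0f} min/week")  # 5 min/day, 25 min/week
```

Change the defaults to match your own setup; the point is only that this way of counting makes the problem look tiny.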
In other words, if you weigh the impact of this performance issue against the other things that developers like me spend time doing each day, you would probably conclude that it would be more beneficial to place a refrigerator and chamber pot in each office than it would be to think about the source control issue.
But the real cost of that six-second startup time isn’t, well, six seconds. It is much, much more.
Flow
One of my favorite parts of Peopleware is where they discuss the idea of programmer flow. Fellow geeks will know what I’m talking about: there is a state you get into, after you’ve been working for a bit, where you sort-of divorce yourself from everything but the problem at hand. Your other senses turn off, you forget things like hunger and fatigue, and hours pass without you being aware of it.
That state, which my girlfriend (not-so) affectionately calls “Noah-land”, where I seem to disappear inside myself and forget the things I should be doing instead, is the real place of productivity. I like to think of it as the place at which I’ve assembled a mental model of what I’m working on inside my head, so that what I’m really doing is changing the program in my brain and letting my hands reflect it on the computer in front of me (after all, my brain doesn’t come with a compiler).
The problem with flow, however, is that it isn’t immediate and it is easily disturbed. It takes me a good 10-15 minutes to get into the flow of things, and something especially distracting is enough to pull me right back out of it. These can be things as annoying as phone calls (those pull me out pretty quickly), or the guys who always gather in the offices across the hall from me and have no understanding of “inside voices”.
The other disturbance is, well, human nature. The intarwebs are a pretty good source of distraction; the new email notification in Outlook practically begs for me to stop what I’m working on to read whatever some manager just sent out at über-high priority.
Sometimes I can ignore those, when I’m deep in the middle of a problem, typing away at my editor.
But when I’m sitting around and waiting…well, “idle hands” and all that.
I’ve discovered that 6-10 seconds is about all it takes to break my ability to stay concentrated on a single thing.
The Real Impact
Enter my typical session:
Hmm, looks like I need to edit file foo.c. Mmmk, so command prompt, “edit foo.c”.
(A couple of seconds pass).
Hmm, I’m kinda hungry. Wonder if I should go get some food or something.
(A couple more seconds pass).
I wonder if I got any new email.
(Start reading email, get distracted on another subject)
Wonder if there is anything new on Google Reader…
After about 6 to 8 seconds (which ends up being on the fast end of the spectrum, as our source control gets into a rather disagreeable mood far too often), the command has finished and the file is editable, but it is too late: I’ve already moved on to some other thing.
Within a couple of minutes, I’ll make it back to my original task, but I’m certainly not in flow anymore, and it takes anywhere from 5-15 minutes to get back to it, depending on what I just got distracted by.
Let’s be conservative, then, and say the following:
- Around half the commands I run will take long enough to break my flow (so 25 commands a day)
- Every time this happens, I’ll spend 2 minutes doing something else (checking email, getting a drink, eating an orange, switching music, going to the restroom, chatting with someone online, checking the news, etc.)
- When I get back to it, I’ll have about 5 minutes of rather suboptimal productivity. By that I mean that, were I in flow, I’d probably be able to do the same amount of work in 1/10th the time. By that measurement, I’ll lose 4 minutes and 30 seconds of productive time.
Given these low estimates, then, I will lose 25 * (2 + 4.5) = 162.5 minutes, or about 2.7 hours, each day, due to commands that take less than 10 seconds to run. Compared to the naïve estimate of 5 minutes a day, the actual impact is a much, much different picture. Also, taking into account that I’m not coding 8 hours each day (when we plan, we actually say that only something like 4 hours of our day will be spent really coding), this would mean that I lose roughly two-thirds of my development time each day thanks to a tiny little 6-8 second delay.
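The same back-of-the-envelope math, written out as a small Python sketch so you can plug in your own guesses. The names are hypothetical and the inputs are just the conservative assumptions listed above, not measurements.

```python
# Flow-aware estimate: each flow-breaking command costs the time spent
# distracted plus the productivity lost while getting back up to speed.
# All names are illustrative; the values are the post's own assumptions.
COMMANDS_PER_DAY = 50
FLOW_BREAK_FRACTION = 0.5    # about half the commands break flow
DISTRACTION_MIN = 2.0        # minutes spent doing something else
RECOVERY_MIN = 5.0           # minutes of suboptimal work afterwards
RECOVERY_EFFICIENCY = 0.1    # working at 1/10th speed while recovering
CODING_HOURS_PER_DAY = 4.0   # planned "really coding" hours per day


def flow_cost_minutes(commands=COMMANDS_PER_DAY,
                      break_fraction=FLOW_BREAK_FRACTION,
                      distraction_min=DISTRACTION_MIN,
                      recovery_min=RECOVERY_MIN,
                      efficiency=RECOVERY_EFFICIENCY):
    """Minutes of productive time lost per day to flow-breaking delays."""
    interruptions = commands * break_fraction
    lost_in_recovery = recovery_min * (1 - efficiency)  # 4.5 of the 5 minutes
    return interruptions * (distraction_min + lost_in_recovery)


lost_min = flow_cost_minutes()
lost_hours = lost_min / 60
share = lost_hours / CODING_HOURS_PER_DAY
print(f"{lost_min:.1f} min = {lost_hours:.2f} h = {share:.0%} of coding time")
# 162.5 min = 2.71 h = 68% of coding time
```

Even if you halve every one of those guesses, the result is still far bigger than the 5 minutes a day you get from counting only the delay itself.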
That number looks ridiculous, but I don’t think it is overstated, especially since these commands take anywhere from 6 seconds to about 15 minutes when the server is feeling especially objectionable, and the 15 minute delays certainly kill productivity for at least that long.
Git
To change gears a bit, earlier in the week, I re-watched the talk Linus gave at Google about Git. In it, one of the things he states is (roughly): good performance doesn’t just mean that operations will take less time. Good performance affects your workflow in much greater ways than simply adding up the time you gain back from some operation taking n% less time than it did before.
The example he is talking about is merging (which is fairly pertinent to our source control setup, where a sizable merge takes on the order of a couple of hours before it even tells you what the conflicts are), and his point is that when branching and merging become extremely cheap and easy, it changes the way you develop. You create branches for everything you work in, check in whenever you want, and merge things back in when you are done. In other words, source control becomes a useful tool rather than just a necessity.
I haven’t used Git for very long or on anything very big (I’m noahrichards on GitHub), but from what I’ve seen so far, I love it, and if Linus is to be believed about Git performance as of about a year ago, Git can really change the way you use source control. For those who haven’t watched the talk, he notes that he routinely merges around 22,000 files at a time, and that merge takes less than a second. Compare that to trying to “edit” a single file here, which can take minutes when a couple of people are running merges at the same time and bogging the server down (sadly, it only takes a few merge operations to completely kill server performance), and you can see why I’m so jealous of Git performance.
Back to the situation at work, you can see how the awful infrastructure problems affect our workflow. In addition to killing productivity, they push us into concessions to avoid the problems as much as possible. We merge (integrate) as rarely as we can, and lots of people check in large chunks of work rather than the smallest changes possible. If we need to coordinate, we end up basically sharing files (it is slightly nicer than that, but not by much), which makes things really complicated when our branch is moving underneath us.
So I suppose I don’t have a real point here, other than to whine. I guess the closest thing I can get to a point is this: if you are a developer, don’t underestimate something like this. Even an occasional action that takes an inordinate amount of time can destroy the usefulness of a piece of software. As it stands currently, I would wager that switching to something better performing at work (or at least fixing the most obvious performance issues of our infrastructure) would make us up to twice as productive as we are currently.