To Mob, Pair, or Fly Solo

“...If asked, most programmers would probably say they preferred to work alone in a place where they wouldn’t be disturbed by other people. The ideas expressed in the preceding paragraph are possibly the most formidable barrier to improved programming that we shall encounter.”
– Gerry Weinberg, The Psychology of Computer Programming, 1971.

Introduction

I first started hearing about pair programming around the mid-2000s and became more aware of it after joining the ACCU around the same time. However, it was not something that immediately made sense to me: after all, by then I had been programming professionally for well over a decade, and unprofessionally for another decade before that. I felt I could probably describe myself as a fairly experienced programmer – not necessarily an expert [1], but definitely no longer a junior.

As I sat in the bar very late one night (or more likely very early one morning) at the Barcelo Hotel in Oxford (where the ACCU Conference was held back then) I quizzed my fellow ACCU members about what use they thought pair programming had for the more experienced programmer. I could see that the practice would have benefits at the more junior end of the spectrum, but I couldn’t really grasp what was in it for those of us with a few more miles on the clock.

A decade later I’m finally beginning to formulate an answer and, probably unsurprisingly, it’s not at all what I thought it was about…

Heavy Scepticism

Many people, when they hear about a new practice they don’t really understand, fill in the gaps with assumptions, come to the wrong conclusions and dismiss it out of hand. Even if they’re willing to give it a jolly good go, unless they really know what problems it aims to solve they can evaluate it poorly and still end up dismissing it as useless (to them).

Certainly for the first couple of years after first hearing about pairing I fell into this trap. I generally worked in environments where the other team members were all very experienced, the systems were pretty mature and the culture (as I perceived it) focused on individual performance rather than on what the team as a whole was producing. Any “pairing” that was done was of the more traditional sense, such as talking through ideas on a whiteboard, jumping in to help out when a production incident occurred or helping a teammate debug an issue. In the latter case the “pairing” usually ended once the issue was identified; there was no continued collaboration to see the entire task through to the end together or to go beyond the immediate need for learning. (In retrospect the goal seemed to be to minimise disruption to the more experienced programmers rather than maximise the transfer of knowledge to the less experienced ones.)

Naturally my perception, based on very little knowledge, was really a misconception: I assumed the point was to be more productive, where the term “productive” could be interpreted very loosely as writing more lines of code per day. The mental model I was working with was that of a symmetric multiprocessor (SMP) architecture – still somewhat expensive even back then – where adding each additional person was like adding another CPU. Everyone is probably aware of the old adage “two heads are better than one”, and if you regard programming as a largely cerebral activity this is not an altogether unexpected hypothesis to derive from the smattering of (dis)information floating around at the time.

Consequently, knowing that an extra CPU in an SMP architecture generally only adds another 60% of a full CPU’s performance (a rule of thumb back then, due to the communication overhead), I was not convinced that the “whole” could ever be greater than the sum of its parts, at least not for a pair of seasoned programmers. I was definitely open to the idea that more junior programmers had a lot to gain, because there is so much to learn about the tools, processes and problem domain when you’re starting out.

And what about mob programming – the idea that if two heads are better than one, why shouldn’t three, ten or a hundred be better still? Now things are starting to get crazy and verge on the faux science peddled by homoeopaths! Applying the multiprocessor analogy once again there are clearly diminishing returns here: surely ten people working through ten problems one at a time can never be as effective as ten people working on ten different problems concurrently?

Something you might have noticed when you do work with other people on a problem with an element of time pressure, such as a production incident, is that you tend to pick up every little mistake made by the person doing the typing, talking, drawing, etc. It’s almost as if every single keystroke matters and one of the benefits you provide is the ability to catch mistakes before they’ve even hit the Enter key. But surely it’s not just about saving keystrokes either?

Collaboration Benefits

While possibly useful, correcting typing mistakes is not our modus operandi, nor a major selling point for pairing or mobbing – we generally have other tools, like compilers and IDEs, which are far more reliable at that sort of thing. No, while there are some short-term benefits, the emphasis is on playing the long game, which is where most software development sits.

Shared Understanding

Those earlier opinions of mine, I now realise, are the thoughts of a brain addled by too many years working in the enterprise, where a fundamental assumption is that overall efficiency is highest when every single person is working to capacity – that what really matters is how much work each person gets through in a day. One thing this naïve view fails to take into account is the opportunities lost by delivering everything at the end instead of as soon as it’s ready. It also misses the productivity lost to the extra overhead of handovers and context switches when everyone is too busy to help each other out [2]. (This is one area where my multiprocessor analogy does appear to hold some water.)

While these are very real concerns, what I began to realise was that the real benefits of multi-programmer collaboration are not about short-term wins but about long-term sustained delivery. What slows a team down over time is the rise in complexity as the product gains more and more functionality. A single programmer can write and maintain an implementation of Fizz Buzz, but once we’re talking thousands of lines of code the complexity becomes much harder to manage. Over time, as people, tools and concepts come and go, the landscape changes, and without a common understanding of where we were, where we are now and where we’re heading, the rate of divergence accelerates and becomes ever harder to correct. Before long the code-base resembles Frankenstein’s monster and the cognitive load required to implement even small changes safely becomes insurmountable.

This isn’t just an issue for someone new to a code-base either, although that is definitely one area where the practice of pairing shines, because it allows the new joiner to make contributions to the team from day one. Instead of waiting a few weeks (or months) for someone to get up to speed with the code-base and keeping their work quarantined while it’s reviewed and merged, they can use their existing knowledge and skills in the technical domain from the get-go to suggest solutions and improvements while their longer-serving partner(s) provide the conceptual integrity [3]. The knowledge can then be drip-fed in real time and prioritised based on what is actually needed to get the job done day-to-day.

At the more trivial end of the spectrum are the small things that we generally shouldn’t sweat [4], like coding guidelines. What really matters are the design principles that may not always be quite so obvious from just browsing the code. For example, parsing the response from an HTTP endpoint may be done more laboriously than is strictly necessary because certain types of error trigger slightly different recovery behaviours. Similarly, certain conventions may be adopted because they make legacy code easier to test, such as the use of “internal” instead of “private” in C#.
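
As a rough sketch of that last convention (the class and assembly names here are hypothetical), the production assembly nominates its test assembly via InternalsVisibleTo, and the awkward legacy seams are declared internal rather than private so the tests can exercise them without widening the public API:

using System.Runtime.CompilerServices;

// Grant the (hypothetical) test assembly access to this assembly's internal members.
[assembly: InternalsVisibleTo("Product.Tests")]

public class ResponseParser
{
    // The public surface stays narrow...
    public int ParseStatusCode(string statusLine)
    {
        return ExtractStatusCode(SplitFields(statusLine));
    }

    // ...while the awkward legacy steps are 'internal' rather than 'private',
    // so the test assembly can pin down their behaviour directly.
    internal static string[] SplitFields(string statusLine)
    {
        return statusLine.Split(' ');
    }

    internal static int ExtractStatusCode(string[] fields)
    {
        return int.Parse(fields[1]);
    }
}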

Obviously documentation and tests can provide additional context but they are rarely used to express the kinds of decisions just described. Architecture Decision Records [5] are a great way to cover the big-picture stuff, but smaller patterns or idioms, e.g. Value Object and Optional<T>, would probably be considered out of scope and may just be assumed to be common knowledge. (In my experience these are anything but common.) With nobody and nothing to guide you, you’ll have to resort to a spot of software archaeology [6] to work out where on the timeline you are and what you should be refactoring towards, all of which takes time and rarely yields a definitive answer.
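
To give a flavour of the kind of small idiom I mean, here is a minimal Value Object sketch (the CustomerId type is purely hypothetical) – the sort of pattern a partner explains in passing but which rarely warrants an ADR of its own:

using System;

// A hypothetical Value Object: immutable, compared by value rather than by
// identity, and impossible to construct in an invalid state.
public sealed class CustomerId : IEquatable<CustomerId>
{
    private readonly string value;

    public CustomerId(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
            throw new ArgumentException("A customer id cannot be blank.", nameof(value));

        this.value = value;
    }

    public bool Equals(CustomerId other)
    {
        return other != null && value == other.value;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as CustomerId);
    }

    public override int GetHashCode()
    {
        return value.GetHashCode();
    }

    public override string ToString()
    {
        return value;
    }
}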

Shared Ownership

It isn’t just the understanding of the product’s architecture and design that needs sharing but also the ownership of the source code itself, and therefore the right of anyone to change it. In the past, when individuals worked on specific features, the code they wrote to implement them would, whether consciously or not, become associated with that person. They naturally became the expert, and therefore the gatekeeper, of that module, and future changes would likely be funnelled through them too – it’s the most efficient approach, right?

Conway’s Law teaches us that the structure of our code will reflect the structure of the communication paths in the organisation, so if the paths are blocked we route around them, which in code terms might mean duplication, for example, or the inappropriate use of inheritance as a way of side-stepping the “problem”. The Open/Closed Principle was intended as a technique for extending a design, not as a way to avoid fixing bugs in the existing code.
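
A contrived sketch of that second symptom (both calculator classes are hypothetical): the fix belongs in the original class, but because that module is effectively “owned” by someone else, a workaround subclass appears instead:

public class PriceCalculator
{
    // Suppose the rounding here is subtly wrong, but this class is "owned" by
    // somebody else, so nobody feels able to touch it.
    public virtual decimal Total(decimal net, decimal taxRate)
    {
        return net + net * taxRate;
    }
}

// Inheritance used to route around a blocked communication path rather than to
// extend the design: the original bug lives on for every other caller.
public class FixedPriceCalculator : PriceCalculator
{
    public override decimal Total(decimal net, decimal taxRate)
    {
        return decimal.Round(net + net * taxRate, 2);
    }
}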

When every line of code stems from the contributions of more than one person any notion of single ownership vanishes, as it’s clearly nonsensical to attribute the code to the person whose name happens to be on the check-in or who was at the keyboard at the time. When everyone gets involved in pretty much everything the chances of “silos” developing are minimal. (Eric Raymond [7] opines that a non-territorial approach to software development, which Weinberg much earlier termed “egoless programming”, is likely a significant factor in the speed and quality of open source projects.)

Learning

For those already largely au fait with the organisation’s product and processes there is still a lot to be gained from collaborating with peers and juniors alike. Although we may try really hard not to utter those infamous words “we’ve always done it this way”, there is nothing like a mind unhindered by decades of baggage to challenge why you’re not doing something a different way.

There is simply too much going on in the world of software development to keep up with all the new languages, tools, products, etc. Much as I try to flick through the release notes for the latest update to my favourite editor, I quickly forget the little shortcuts and need reminding before they make it into my muscle memory. Just watching someone else navigate around the code-base can be enlightening, both about the tool you’re using and about the code-base itself. I remember once watching some teammates continually build the entire solution in Visual Studio 7 instead of installing the Fast Solution Build plugin, which ensured you only built what had changed, thereby saving a fair amount of time. Just telling people these tips via email or IM isn’t enough, especially if they’re in the middle of something important; they likely already have enough yaks to shave.

Learning of course means more than just what keys to press or tools to use, useful though they are. We also need earlier feedback on the way we express ourselves in code. There is a humorous take on code reviews which regularly does the rounds on Twitter that goes:

“Code reviews: 10 lines of code = 10 issues, 500 lines of code = ‘looks fine’.”

By the time the code is written there is already a strong desire to push it through and get it out there rather than hold it up until it’s corrected. Like it or not, we often have an emotional investment which makes us reluctant to change too much – there are sunk costs. The best time to provide feedback is the moment the behaviour happens, because associating it with the trigger makes the more desirable behaviour far more likely to occur in future. It also means spending less time on formal reviewing and rework, and therefore faster delivery in the long run (for the same level of quality).

Wisdom of Crowds

One of the quotes (from Émile Chartier) that I was introduced to by Kevlin Henney is:

“Nothing is more dangerous than an idea, when it's the only one we have.”

While I would like to think that I am capable of having multiple ideas for every problem, it simply isn’t true. Even at the lowest level there are many little decisions we make every day about the names of classes, interfaces, methods and variables. It’s not just their names either; even the decision of how to partition the logic – inline vs free functions vs classes etc. – is something which different programmers would take a view on.

Hal Abelson famously said that programs must be written for people to read and only incidentally for machines to execute. What you or I might consider readable, however, may not be so obvious to a future maintainer. It’s not just the names of things; the layout and structure matter too – and I don’t mean tabs or spaces, I mean how the code is organised within namespaces and assemblies and then how those are arranged on disk within the solution.

A good example of the wisdom of crowds was brought home to me during a mobbing session. We were doing some refactoring after adding a new feature and a bunch of tests started failing. It wasn’t immediately obvious what was going on, even after a quick debug of one of the broken tests. I had a good idea where the problem was, but one of my colleagues suggested we go looking somewhere completely different. The crowd agreed with them instead of me, which was lucky, because we got to the root cause pretty quickly. Had we gone down my route we could easily have lost a couple of hours searching in the wrong place. Even the intuition of very experienced programmers can be wildly wrong.

It’s Fun!

Five years ago I had done virtually no “production” pairing at all. I knew I was going to enjoy it though after attending a meet-up hosted by Jon Jagger [8]. Writing code in a group where we didn’t know each other and had different backgrounds and skills was both a scary and exhilarating experience. Taking it to such an extreme meant that the diversity aspect of the group really shone through along with the social nature. (You could argue that I did a fair bit of pairing in my teens when I went round my mate’s house as naturally we only had one computer back then, but I wouldn’t classify that period as one of “egoless programming” [9].)

Not long after, I ended the contract I was on and jumped ship to a company where pairing was often the norm – even the main interview was an exercise that involved pairing on a simple kata (which makes a lot of sense). Since then the majority of my professional time has been spent programming in groups of two or more, mostly as a pair, and now, when I do find myself working on my own, I really miss having that input from other people. Much as I enjoy listening to music, I find a spot of banter while working can be equally enjoyable and often leads to tangential conversations that turn out to be relevant.

In a recent Afterwood [10] I questioned why there weren’t more programming partnerships along the lines we traditionally see in writing partnerships, such as for sitcoms. Over my 25 years as a professional programmer there are a handful of people that I’ve worked closely with for a significant amount of time that I’d work with again without question, because I felt we complemented each other in ways that meant we did great work together. We don’t necessarily share the same taste in music, films, books, food, text editor, brace placement, etc., which I’d suggest is A Good Thing as it ensures there is always a topic up for “a healthy debate” when the code itself is lacking contention. Maybe we’re just not ready yet to settle down and pair with just one partner…

This continual need to justify what we write is what I believe makes it a fun and productive technique. The aim is to produce the simplest solution that can solve the problem at hand, which means that nearly everything becomes a negotiation – you can’t do something just because you feel like it. This causes us to continually reflect on what we’re doing and therefore helps to ensure our decisions are conscious ones. The outcome is invariably more satisfying because the journey has been more adventurous.

One colleague of mine has suggested making a set of “Programming Luminary” style Top Trumps cards for those moments when you need to lighten the mood and back up your argument with “a big hitter”. For example, you might decide your side of the argument needs a quote from “Martin Fowler” whereas your partner, nay opponent, might choose to counter with something from “Uncle Bob”. The point is that there are many notable people whose ideas shape our vocabulary and thinking, but we need to avoid falling into the trap of programming dogma.

How Many Cooks?

With all that said, not every task requires a legion of programmers working on it and, quite frankly, it can be an intense experience. If you’ve ever done any full-day workshops you’ll know how draining they can be (assuming it was one you enjoyed and actively participated in). There are calls to do more formal research into where the tipping points lie, but the sections below give my entirely unscientific take on when I’ve found each of these approaches beneficial.

Mobbing

The (somewhat unfortunate) term “mob programming” is a relatively new one and it’s something which is gaining more ground as working in pairs may just not be valuable enough in some circumstances. I’ve heard about teams that work entirely in mobs, some even as a “free flowing mob” where people dip in and out of the mob during the day. Where this has happened to me by accident I’ve not noticed any significant side-effects from the coming and going.

Personally I’ve only worked in a large mob a dozen or so times, mostly where the whole team was engaged on a single problem. (We generally commandeered a meeting room for the entire day so we could all comfortably share one laptop projected onto a large screen.) The driver for forming the mob has always been “shared understanding”. In one instance it was setting up the build pipeline for a new project, where we all had slightly different skills but wanted to formulate the pipeline together as a group. We needed to choose some tools and approaches, and rather than keep stopping and starting to discuss options we decided to work on the task together and thrash out the initial skeleton over a number of days. This set the scene for the eventual pairing afterwards as we all had a better idea of how we wanted to work.

Programming as a team has also proved useful when tackling a new behaviour that introduces a fundamental design principle into a system. For example, when working with a document-oriented database that lacks atomicity, except at the document level, you need to be careful how you handle multi-document changes. It is essential that every developer in the team understands these core concepts if future changes around persistence are to be made safely, because proving correctness or testing for race conditions is hard. Security is another aspect of system design that needs to permeate the entire team, and tackling it as a group has proved well worthwhile in establishing common patterns that are easy to follow.
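
As a flavour of the kind of convention such a session might establish (the types below are hypothetical): with a store that is only atomic per document, data that must change together is kept inside a single document so every write stays within that guarantee:

using System.Collections.Generic;

// Everything that must change together lives in one document, so updating the
// status and its line items is always a single, atomic write.
public class OrderDocument
{
    public string Id { get; set; }
    public string Status { get; set; }
    public List<OrderLine> Lines { get; set; } = new List<OrderLine>();
}

public class OrderLine
{
    public string Product { get; set; }
    public int Quantity { get; set; }
}

// Splitting the lines into separate documents would mean a "cancel order"
// change touches several documents with no transaction holding them together,
// forcing callers to add retry or compensation logic instead.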

Naturally there are some problems with programming in a large group, although I hasten to add that the problems I’ve experienced have usually come from outside the team. The long-term benefits of such an approach are still under scrutiny and seeing the entire team working on a single problem can make some managers very nervous. Whenever we did it the board showed only a single item of work in progress, and someone would always ask “when do you think we’ll be able to start working in parallel again?” Explaining the consequences of not having everyone on the same page for critical design decisions certainly goes a long way, but you may still be left repeating yourself as you make the case for the apparent (short-term) drop in productivity.

Pairing

While the whole team working together on a new epic is useful for setting the tone, I’ve found it less useful once you start rattling through the individual features. (Oren Eini uses this notion of Concepts & Features [11], where concepts are the “framework” which underpins the individual features.) For example, when you start adding authentication and authorisation to an API you might pick an initial endpoint and work as a mob to iron out the basic design, then switch to pairs to apply it across the rest of the API. The first part is pure design work which cannot be parallelised, whereas the implementation likely can be, because most of the difficult questions should already have been addressed.

While you can go all out and spread the load right across the team, you are only delaying delivery of that work by storing up the code reviews and rework for later, and forcing each other to context switch to pick them up. It’s not uncommon to see a “review” column added to the task board, with an inevitable WIP limit added to slow the team down and ensure that they are critiquing each other’s work. Pairing largely takes that burden away: by working together you are naturally reviewing and reworking as you go along, thereby keeping complexity under control and avoiding sunk costs.

It’s all too easy when working alone to get carried away, either with a spot of refactoring or scope creep, and so having that Jiminy Cricket working with you keeps you honest all the time. The converse is also true though and they are there to ensure that the work is carried through to completion – that the TDD states of red and green really are followed by any necessary refactoring.

Just to be perfectly clear they are not there to be your nanny or lackey but an equal partner in the delivery of a story no matter how much or little experience either of you has.

Going Solo

Given what’s been said about the usefulness of working with one or more people, it’s questionable whether there is ever really a time when working by yourself is appropriate. I think there is.

There are some tasks which really don’t demand the attention to detail that product features do because the risks are so low or they are less valuable and being done for other reasons. For example updating background documentation or applying a simple fix to a script that came up during some investigation can easily be done by oneself. Working as a group can be intense and so one way to take a break and grab a little space is to go and do something else alone. (Of course if you work in an open plan office you’re rarely entirely alone.)

We all have a variety of administrative duties to perform during the week as well as catching up on any emails, background conversations on IM, timesheets to complete, meeting rooms to book, etc. All these tasks can be done in isolation and fitted in around the group sessions. In fact I’d say that one further advantage of group working is that you get fewer interruptions as it’s easy to avoid these kinds of distractions and people generally don’t bother disrupting an entire group unless it really is urgent.

One definite downside of pairing or mobbing is that course corrections often happen so quickly, thanks to the fast feedback loop, that you can easily miss the deeper learning opportunities. It may not feel like it at the time, but being stuck on a problem and then finding a solution all on your own is a deeply rewarding experience. This gets watered down a lot when you’re in a group, replaced by a different kind of satisfaction – one of collective achievement. Sometimes the group might take the “safest” route and you’re still not convinced or quite sure, and so a little spike on your own can be a good way to close the loop. We shouldn’t feel as though we’re giving up our freedom to explore by working with other people; we are still individuals with our own thoughts and opinions.

Ebbs & Flows

Like everything in life one size never fits all – we never spend our entire time collaborating in only one way. In my experience the team works together (or apart) depending on the nature of the current workload. A common pattern is that the team works as a mob on any feature where there is a clear need for shared understanding, such as when key design decisions need to be made. Once the groundwork is done the team might then be equally comfortable working in smaller groups, nominally pairs, for the majority of the feature work. Finally when there are trivial tasks or other quests which are intentionally personal then the team members act more autonomously, although still with any necessary checks and balances.

Consequently the team will likely ebb and flow over time: coming together as one for a while, then working in smaller groups with occasional pockets of time as individuals, and then back around the loop again. During this natural flow there may also be more spontaneous larger collaborations as support issues arise or something unexpected turns up that needs a realignment of the team’s understanding.

Aside from all the other benefits of breaking the work down into small units, it makes it much easier for a team to continually move around and work with different people. Even being able to mob on a problem for a few hours can really help share the knowledge around. The morning huddle or stand-up has proved to be incredibly useful here as not only do you have an opportunity to reprioritise the backlog but it gives a natural point for team members to decide on who to pair or mob with. If a large part of the team has been mobbing on a single problem, perhaps for a few days, this also provides a natural checkpoint to question whether a “mob” is still the right approach or whether the returns are beginning to diminish. This really aids transparency and shows that the team is behaving diligently.

Epilogue

It is interesting how often the argument that programming is a solitary act and most programmers are introverts comes up on the social networks. It’s also interesting to see the stats about how many projects fail because they didn’t deliver what the customer really wanted. One has to wonder if those two are in any way correlated...

In my experience there are very few true introverts; often they are the product of working in an environment that moulded them that way. But with the right attitude from the team and the organisation they soon realise that working collaboratively does not mean having someone nit-picking every little mistake they make, but having another pair of shoulders to stand on so that they can see further and ultimately achieve more.

References

[1] The Downs and Ups of Being an ACCU Member, C Vu 25-1, Chris Oldwood
http://www.chrisoldwood.com/articles/the-downs-and-ups.html

[2] Be Available, Not Busy, C Vu 29-1, Chris Oldwood,
http://www.chrisoldwood.com/articles/be-available-not-busy.html

[3] Conceptual Integrity, C2,
http://wiki.c2.com/?ConceptualIntegrity

[4] Chapter 0, Don’t sweat the small stuff, C++ Coding Standards: 101 Rules, Guidelines, and Best Practices, Andrei Alexandrescu & Herb Sutter, ISBN: 0-321-11358-6

[5] Documenting Architecture Decisions, Michael Nygard,
http://thinkrelevance.com/blog/2011/11/15/documenting-architecture-decisions

[6] In The Toolbox – Software Archaeology, C Vu 26-1, Chris Oldwood,
http://www.chrisoldwood.com/articles/in-the-toolbox-software-archaeology.html

[7] The Cathedral and the Bazaar, Eric S. Raymond, 1999, ISBN 1-565-92724-9.

[8] ACCU London, September 2010, Jon Jagger’s Coding Dojo, reviewed by Chris Oldwood,
http://www.chrisoldwood.com/articles/accu-london-september-2010.html

[9] The Psychology of Computer Programming, Gerald M. Weinberg, 1971, ISBN 0-442-29264-3

[10] Afterwood, Overload 135, Chris Oldwood,
https://accu.org/index.php/articles/2298

[11] Application structure: Concepts & Features, Oren Eini,
https://ayende.com/blog/3895/application-structure-concepts-features

Chris Oldwood
10 August 2018

Biography

Chris is a freelance programmer who started out as a bedroom coder in the 80’s writing assembler on 8-bit micros. These days it's enterprise grade technology in plush corporate offices. He also commentates on the Godmanchester duck race and can be easily distracted via gort@cix.co.uk or @chrisoldwood.