Whiteboard Systems

Software degrades across engineering generations

"One thing seems to be agreed by everyone, that software will be modified."

Peter Naur, Programming as Theory Building

Here's a brief story about software development.

It starts with a problem. Something that needs fixing. A small, ragtag team of software engineers is brought on board to build a solution. The solution works and needs to scale. More engineers are brought on. Customers start asking for new features. The surface area of the product grows. The founding engineers leave to start something else. More engineers are brought on. The product continues to grow.

Development starts to slow. Building the application takes an hour. The tests can't run locally anymore. Bugs only get fixed if they affect enough customers. The latest technology is being tested in new repositories. Engineers start using the word "legacy". New engineers stop learning the old system. Employee churn continues. Eventually, no one really knows how the whole thing works.

Degradation is not inevitable. Time will not erode away the bits that compose our systems. But most software systems that you or I are building today will be in worse shape a decade from now. In my experience, this has less to do with the passing of time and more to do with the passing of engineering generations. I think I just made up the phrase "engineering generation", so let me define it. An engineering generation is a group of people working on a project at the same time. If enough members of a team change, then an engineering generation passes. I think you see where this analogy originates.

Software rot is a complex phenomenon, and changing ownership is just one contributing factor. But it's an important one. A key reason why software like Python or Linux has been successfully developed across decades is that their founding engineers, Guido van Rossum and Linus Torvalds, are still guiding those projects. Those teams haven't churned through many engineering generations. On the other hand, if you've worked on a team with high churn, you're probably familiar with the phrase "no one really knows how that works anymore".

Imperfect communication

"The inaccuracy or incompleteness of these various mental models generally (and perhaps surprisingly) does not cause significant problems."

John Allspaw, Recalibrating Mental Models Through Design of Chaos Experiments

As engineers work, they build an understanding of the system, a mental model of sorts. Onboarding new teammates involves sharing this mental model. In practice, only the outline of this model is shared. It's too detailed and nuanced to share completely. Individuals are responsible for filling in the gaps. As a result, team members are unlikely to have the exact same mental model. Surprisingly, it turns out that this is mostly okay. At least in the short term.

These gaps tend to live in two places. The obvious place to find them is at the fringes of the system. This could be a part of the system where data doesn't travel frequently, such as an edge case. Or it could be a new part of the system that's still being built. Either way, the behavior might only be understood by one or two people, or maybe no one at all. It doesn't matter too much. Occasionally, these fringe parts resynchronize. It often happens because a bug or an outage forces a team to reevaluate their understanding.

The less obvious place to find gaps across mental models is at the core of the system. This part of the system does a lot of the heavy lifting. Most of the data flowing through the system will touch this code. And it works. It works really well. So well, in fact, that no one really has to touch it anymore. The founding generation knows this code intimately; they built it. The next generation knows this code in theory. It's been described to them. They've fixed a couple of bugs. The following generations know this code in lore. They see it deep in their stack traces but know it's not causing their issues. That code has been around forever.

As the generations pass, people take existing code for granted. They lose the deep understanding necessary to modify it. Rather than extending the core's functionality, new teams append functionality onto the system in an ad hoc fashion. Why bother changing the core? Even with blameless postmortems, you wouldn't want to be responsible for causing an outage. The result is skyrocketing complexity, a system that begins to collapse under its own weight, and engineers saying "no one really knows how that works anymore".

Improving the status quo

"Software Is about Developing Knowledge More than Writing Code."

Li Hongyi, How to Build Good Software

Teams with closely aligned mental models will develop better software. The software will have fewer bugs and be more resilient to changes. Fewer discrepancies across mental models means future generations are less likely to sigh that well-worn phrase. So it's worth asking: How can we better keep our mental models synchronized?

I'll say up front that I don't know. Perhaps we'll need Neuralink interfaces to seamlessly share thoughts without miscommunication. In the meantime, I have a simpler idea I'd like to try. Let's take our mental models out of our heads.

I'll admit it sounds ridiculous, but hear me out. Some benefits of bringing these models to life include the following.

Explanations must be precise. Using concrete examples to explain abstractions makes it much more difficult to handwave away the complex parts. It's possible you might find that you didn't understand the system as well as you thought.
Confidence that everyone is on the same page. A visualized mental model leaves far less room for ambiguity. You're not left wondering if your audience understood your explanation.
Drawn-out mental models are more accessible to both junior engineers and non-technical teammates. Product managers, designers, and the customer success team all rely on a solid technical understanding of the product.

The whiteboard.systems app is a tool that attempts to help engineers visualize their mental models in a remote-first world. It offers an alternative to keeping these models hidden away in our heads. The app takes an opinionated approach about what details are necessary when describing a mental model (just boxes and arrows). This enables an extremely simple interface. Users can learn the tool in seconds and describe their model as they're building it. Give it a try and let me know how it goes!
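To make "just boxes and arrows" a little more concrete, here is a minimal sketch in Python. It is only an illustration and makes no claim about how the whiteboard.systems app works internally: the `MentalModel` class, the `discrepancies` function, and the box names are all hypothetical. The idea is that once two people's models are written down as arrows between named boxes, the places where they disagree can be listed rather than guessed at.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

# Hypothetical representation, assumed for illustration only.
# An arrow is a (from_box, to_box) pair; a box is just a name.
Arrow = Tuple[str, str]


@dataclass
class MentalModel:
    """One person's picture of the system, as boxes and arrows."""
    owner: str
    arrows: Set[Arrow] = field(default_factory=set)

    def boxes(self) -> Set[str]:
        # Every box that appears at either end of an arrow.
        return {box for arrow in self.arrows for box in arrow}


def discrepancies(first: MentalModel, second: MentalModel) -> Dict[str, Set[Arrow]]:
    """Arrows that appear in one model but not in the other."""
    return {
        "only_in_first": first.arrows - second.arrows,
        "only_in_second": second.arrows - first.arrows,
    }


# Made-up example: the founder knows about the queue, the newcomer does not.
founder = MentalModel(
    "founder",
    {("api", "queue"), ("queue", "worker"), ("worker", "db")},
)
newcomer = MentalModel(
    "newcomer",
    {("api", "worker"), ("worker", "db")},
)

gaps = discrepancies(founder, newcomer)
for side, arrows in gaps.items():
    print(side, sorted(arrows))
```

In this made-up example the two models agree that the worker writes to the database, but only the founder's drawing includes the queue in between. That missing box is the kind of gap that tends to stay invisible while the model lives only in someone's head.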


"No one really knows how that works anymore"

Vivek Dasari
