Technical Debt And Refactoring

Here's my attempt at bringing to life the “Technical Debt” analogy with a simple comparison to the manufacturing industry. This was written at The Economist, to help non-technical folks understand why refactoring code is so important, especially when doing agile. Repeatedly prioritising short-term wins, without stopping to consider a codebase's architecture, results in an unmaintainable system.

Legacy Motors

The manufacturing of cars at Legacy Motors (LM) was all done by a team of talented engineers and designers, who hand-made each vehicle to an exact customer specification. All was well, until one day, the shareholders, wanting to expand business, came up with a plan to increase sales: they decided to invest in an assembly line. By getting the team to build machines that build cars, instead of having the engineers build the cars themselves, they could greatly increase their CPM (Cars Per Month) and vastly expand business.

Robot car

So the engineers set about building the assembly line robots. They struggled a bit, as they were automotive engineers, not robot engineers, but they muddled through and managed to get a working assembly line going. There was much celebration as the first cars finally rolled off the line, and the project was declared a success. CPM doubled within the first month, and the shareholders of Legacy Motors were thrilled.

As the customers started using their cars, they started asking for new features and modifications. The managers listened diligently to the customers' requirements, then went to the engineers and designers with the high-priority plans:

“Our customers want to be able to choose from 3 different wheel sizes on their cars”

And of course, the shareholders wanted to double the CPM to increase revenue even more. The engineers, excited by the possibilities of the new assembly line, were thrilled with the prospect of improving their robots to handle the new requirements. However, there was the small problem of time. The shareholders had promised to deliver double the number of cars by next month, and all of them should have a choice of three wheel sizes.

The engineers, never scared of a difficult problem, came up with a solution that could achieve the shareholder goals. The original robot was only designed to attach one type of wheel. To rebuild it would take much more than the 1 month that they had for the job. So instead, they built a second robot, further down the assembly line, that would unscrew the original set of wheels, and replace them with one of the two other custom wheel sizes, if that's what the customer's order specified.

In order to double the CPM, they cranked up the speed of the conveyor belt that moves the cars from robot to robot. What they found, however, was that the original robots were not designed to handle this pace. The engineers calculated that to adjust all the robots to the new pace would take at least 2 months, so instead, they designed another, new robot at the end of the assembly line.

This new robot was specifically designed to find and fix manufacturing defects created by the other robots when the line was running at double speed. It worked. Well, most of the time. What they found was that it wasn't perfect. It required dedicated engineers to watch it carefully, all the time, and manually fix any problems that the new robot didn't find, as the mistakes made by the first robots were unpredictable, at best.

Car Robot Production

Over time, the assembly line grew bigger and bigger. But the shareholders never allocated enough time to rebuild the original robots, which were well past their recommended lifetimes. There was, however, always time to build a new robot. There were robots doing all sorts of tasks, such as undoing other robots' work, fixing problems created by other robots, and adding or changing components on the cars, sometimes multiple times.

More time passed, and the engineers found that they could no longer react quickly to the shareholders' requests. They also found that more and more cars with defects were going out to customers, because there were too many robots to maintain. All the engineers were working around the clock just trying to keep the line going, as old robots would frequently malfunction, so they no longer had time to even build new robots.

The electricity and robot oil bills were through the roof, as each robot required power and lubrication to run. The old ones were especially inefficient, as they required a type of oil that was no longer commonly used and thus very expensive.

Car Crash

Progress at LM had completely stalled while competitors in faraway countries were turning out new, innovative cars all the time. The engineers found that they couldn't replace the old robots even if they wanted to, because replacing an old one meant replacing every other robot on the assembly line that had been designed to react to and fix its mistakes. They all depended on each other, and the system was far too complicated to maintain.

Software development works in a similar manner. It's always possible to make a quick, hackish fix in less time than re-purposing a system to elegantly handle the new requirements. However, doing this causes the long-term maintainability of the product to dwindle. Eventually, even quick hacks become impossible as the complexity of the system makes it unbearably difficult to change.
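For readers who do write code, here is a minimal sketch of the wheel-swapping robot as it might look in software. Every name in it (fit_standard_wheels, swap_wheels, fit_wheels and so on) is made up for this illustration and does not come from any real system. The "hack" version bolts a second step onto the end of the line to undo the first one's work, while the "refactored" version changes the original step so it handles the new requirement directly.

```python
# Illustrative only: all names here are hypothetical and mirror the
# Legacy Motors story, not any real codebase.

def new_car():
    """A bare car: no wheels yet, and a count of assembly steps performed."""
    return {"wheels": None, "steps": 0}


def fit_standard_wheels(car):
    """The original 'robot': it only knows how to attach one wheel size."""
    car["wheels"] = "standard"
    car["steps"] += 1
    return car


def swap_wheels(car, wanted):
    """The quick hack: a second 'robot' that undoes the first one's work."""
    if wanted != "standard":
        car["wheels"] = wanted
        car["steps"] += 1  # extra work on every custom order
    return car


def build_car_hack(wanted):
    """Quick-fix line: fit standard wheels, then swap them if needed."""
    return swap_wheels(fit_standard_wheels(new_car()), wanted)


def fit_wheels(car, wanted):
    """The rebuilt 'robot': attaches the requested wheel size directly."""
    car["wheels"] = wanted
    car["steps"] += 1
    return car


def build_car_refactored(wanted):
    """Refactored line: one step, no rework."""
    return fit_wheels(new_car(), wanted)
```

Both lines deliver the same car, but the hack does more work per custom order, and swap_wheels now depends on how fit_standard_wheels behaves. That dependency is what later makes the original step hard to replace.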

In business, there is always a tradeoff, and responding to the market often requires immediate action in order to stay competitive. The trick is to do this sparingly, and always dedicate time and resources to proper software development.