The Pain Of Technical Debt
When I was in the early stages of developing what eventually became Readable.com, my priorities were not the same as they are today. That's why I've spent an inordinate amount of the last few months paying down technical debt.
What Is "Technical Debt"?
Technical debt is a natural and inevitable part of the development process. It is unavoidable, cumulative, and incredibly destructive when it gets out of hand. Essentially, technical debt consists of decisions made on one day that later turn out to need revisiting.
For example, if you were building a web page, you might not think it's worth supporting some browser or other. Maybe you're too busy, maybe you don't think enough people will ever use it - either way, your site doesn't work on that browser. But two years later, the world has moved on and now that browser has a chunk of the market. Your site doesn't work properly and you need to fix it. Worse, you've been building on top of the first round of work for two years. You're going to have to fix your site, including all the work you did in the meantime.
This is technical debt. It's work you have created for yourself. Sometimes, as with the example above, it will be the result of an actual overt decision. Sometimes it will be the result of an omission. Either way, the longer you leave rectifying it, the greater the cost. Like "normal" debt, technical debt accrues interest.
It's important to note that technical debt is often talked about negatively, as though it were always an indication of laziness or poor planning, but that's not the case. No project can avoid it, because it's impossible to know everything in advance - customers will ask for unforeseen features, software evolves, and development processes always place some limits on how many potential-but-unlikely futures you can accommodate in your work.
What Happened To Readable?
As with a lot of technical debt, this particular example relates to a decision made to speed up work at an early stage, knowing that it would create work later. This isn't uncommon - product development is about proving the concept at the start, not necessarily about building a long-term solution that will support large-scale usage.
The first version of Readable had text sent by a browser to a server to score. The results were returned as a JSON-encoded array. So far, so good. But eventually, we started processing files. The file processing was asynchronous - you uploaded a file, and it was scored, and then at some later time you viewed the results. The simplest and fastest way to make that happen was to store the results as JSON, and then throw them into the existing display system when the user returned to view them.
Some of you may have face-palmed already, seeing where this is heading.
This solution was crude, but quick to roll out. It also worked fine for simple Word files, and small CSVs of data. In fact, it worked - slowly - for CSVs of 100,000 rows or so. That covered 99.9% of our usage.
And then we started processing websites. And we started digging into the data. Now, instead of 100,000 rows of simple text, we were evaluating millions at once. Our storage needs ballooned. Our database server started overloading as it tried to manage large chunks of text that the software wasn't built to handle. The time to extract interesting reports started growing.
To stretch the analogy even further, our technical debt had been passed to collections.
How Do You Fix It?
There's no easy fix with most technical debt. You have to revisit the original problem decision, spec out a solution, and write it. You'll need to correct both the original problem and anything else built on that shaky foundation. If your automated testing is on point, you'll be grateful to yourself for putting in the time to write tests. If not, you'll curse yourself for the lack of foresight.
The solution in our case is obvious enough, and indeed was way back when I created the original problem - we needed to store results data in a proper database table, with text stored as text files, and so on. And so that is what I've been working on for what feels like an eternity. Fortunately, I had created a decent set of unit tests, but not a complete set, so it was slow going. And I had built a monstrous amount on top of that shaky foundation, so the cascade of work was immense.
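To make the difference concrete, here is a minimal sketch of the two storage approaches. It is not Readable's actual code: the table names, column names, sample rows, and the use of SQLite are all assumptions made for illustration, and the separate storage of the text itself is left out. It only shows why asking a question of a JSON blob means loading and decoding the whole thing, while a proper table lets the database do the filtering.

```python
import json
import sqlite3

# Hypothetical scored rows; the field names are illustrative only.
scored_rows = [
    {"row": 1, "text": "The cat sat on the mat.", "score": 95.2},
    {"row": 2, "text": "Notwithstanding the aforementioned stipulations.", "score": 12.5},
    {"row": 3, "text": "Short words help.", "score": 88.0},
]

conn = sqlite3.connect(":memory:")

# Quick approach: one JSON blob per processed file.
conn.execute("CREATE TABLE results_blob (file_id INTEGER PRIMARY KEY, payload TEXT)")
conn.execute("INSERT INTO results_blob VALUES (?, ?)", (1, json.dumps(scored_rows)))


def hard_rows_from_blob(conn, file_id, threshold):
    """Fetch and decode the entire blob, then filter in application code."""
    (payload,) = conn.execute(
        "SELECT payload FROM results_blob WHERE file_id = ?", (file_id,)
    ).fetchone()
    return [r["row"] for r in json.loads(payload) if r["score"] < threshold]


# Structured approach: one table row per scored row, with an index on the score.
conn.execute(
    "CREATE TABLE results ("
    " file_id INTEGER, row_num INTEGER, score REAL,"
    " PRIMARY KEY (file_id, row_num))"
)
conn.execute("CREATE INDEX idx_results_score ON results (file_id, score)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, r["row"], r["score"]) for r in scored_rows],
)


def hard_rows_from_table(conn, file_id, threshold):
    """Let the database filter; only the matching rows are returned."""
    cur = conn.execute(
        "SELECT row_num FROM results"
        " WHERE file_id = ? AND score < ? ORDER BY row_num",
        (file_id, threshold),
    )
    return [row_num for (row_num,) in cur]


print(hard_rows_from_blob(conn, 1, 50))
print(hard_rows_from_table(conn, 1, 50))
```

Both functions return the same answer for three rows. The difference only starts to hurt at scale, where the blob version has to pull every row out of storage to answer even a narrow question.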
If I had written the original system to handle this long-term storage properly, I might have avoided this problem now. But I might not have rolled out the product when I did, and it might never have reached the "viable" in "minimum viable product". And I have had to put interesting development of new features on the back burner while I tackled this problem.
It's been a frustrating few months, all the more so because I knew from day one that this work might be required one day. There's little doubt in my mind that I did the right thing in the first place with a quick solution, and completely the wrong thing in leaving it so long to fix once I knew a fix was inevitable.