Most problems in software engineering can be solved with the simple application of simple tools. A much smaller set of problems are so brutally, insanely difficult that their solution requires the complex application of complex tools (when they can be solved at all). That leaves the broad middle ground: the problems that, while they aren’t Google- or NASA-hard, also aren’t going to be solved by piping some output into grep or adding a missing index or banging out 100 lines of Python. Solving one of these hard problems requires introducing some complexity, and there are two broad options for where to put it: in the tool you choose, or in the way you use that tool. In other words, solving a hard problem generally requires the simple use of a complex tool, or complex use of a simple tool.
For example, if a database for your application has outgrown the largest hardware practically available to you, you can either:
- migrate to a distributed data store, which takes your nice simple query and translates it in the background into a scatter / gather pattern across multiple machines, optimizing for efficiency and reliability, before handing you back your nice simple results (simple use of a complex tool)
- shard your database on some application-specific field, write logic into your client to use the correct shard for a given request, and teach the systems that can’t reasonably confine themselves to a shard per request (e.g. reporting systems) how to aggregate across shards (complex use of a simple tool)
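To make the second option concrete, here is a minimal sketch of what the "complex use of a simple tool" might look like. Everything in it is an assumption for illustration: the `orders` table, the `ShardedOrders` class, the `shard_for` helper, the choice of hash-modulo routing, and the use of in-memory SQLite databases as stand-ins for separate database servers. The point is only that the routing and the cross-shard aggregation live in code you wrote and can read.

```python
import sqlite3
import zlib


def shard_for(customer_id: str, num_shards: int) -> int:
    """Pick a shard from the application-specific key (hypothetical scheme)."""
    # crc32 is stable across processes; the built-in hash() of a str is not.
    return zlib.crc32(customer_id.encode("utf-8")) % num_shards


class ShardedOrders:
    """Hypothetical client-side sharding layer over plain SQL databases."""

    def __init__(self, num_shards: int = 4) -> None:
        # In-memory SQLite stands in for separate database servers.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for conn in self.shards:
            conn.execute(
                "CREATE TABLE orders (customer_id TEXT, amount_cents INTEGER)"
            )

    def _conn(self, customer_id: str) -> sqlite3.Connection:
        return self.shards[shard_for(customer_id, len(self.shards))]

    def add_order(self, customer_id: str, amount_cents: int) -> None:
        conn = self._conn(customer_id)
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO orders VALUES (?, ?)", (customer_id, amount_cents)
            )

    def customer_total(self, customer_id: str) -> int:
        # Per-request path: touches exactly one shard.
        row = self._conn(customer_id).execute(
            "SELECT COALESCE(SUM(amount_cents), 0) FROM orders "
            "WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        return row[0]

    def grand_total(self) -> int:
        # Reporting path: query every shard and combine the partial sums.
        return sum(
            conn.execute(
                "SELECT COALESCE(SUM(amount_cents), 0) FROM orders"
            ).fetchone()[0]
            for conn in self.shards
        )
```

With a distributed data store, the equivalent of `grand_total` would be a single query, and the scatter / gather loop would be somewhere inside the tool instead of in your client.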
So which should you choose?
This being software engineering, there is no one-size-fits-all answer, but there are generally applicable tradeoffs to make and common biases to look out for.
The complex use of a simple tool will often generate more immediate bugs, but the bugs will be easier to understand and to fix (by which I mean fixing both the bug itself and the data it might have corrupted during its lifetime).
The complex tool will often “just work,” but when it goes wrong, look out: just understanding what happened, let alone fixing it, is often a nightmare, because there is typically more overall complexity inherent in the simple use of the complex tool than in the complex use of the simple tool.
That is, you generally write only as much complexity into your use of a simple tool as your particular problem requires, while the complex tool, being a general utility, has to handle general cases where fewer assumptions can be made and where the edge cases are (usually) harder than the ones you’re currently solving. You can’t see all this complexity, because you didn’t write the tool, but it’s there, and like all complexity, it is a fertile breeding ground for bugs.
Development and Operations
Cycle times for development tend to be lower for simple tools, but more cycles are usually needed to get v1 of your complex use of the simple tool off the ground. For example, it’s generally easier to run a simple database on your dev machines than it is to run a distributed data store, but it takes longer to write v1 of your sharding library than v1 of the simple queries against the distributed data store.
After the initial implementation, operations costs tend to be lower for the simple tool, even when you’re pushing it in complex ways.
Beware Your Biases
Both solutions–the complex use of the simple tool and simple use of the complex tool–involve a bunch of ugly complexity, but only one looks ugly to you, because you only write (and therefore see) the complex code for the simple tool case.
Since it’s generally easier to get a full v1 off the ground with the complex tool, it’s easy to think of it as cheaper, but these tools tend to be complex in the sense that a small later change to requirements can create huge ripple effects. For example, suppose you suddenly require, for business reasons, transactional integrity across several records. Your distributed data store simply cannot offer it, while your sharded database solution only requires you to co-locate those records on the same shard. The worst case for supporting such changes is often much better with the complex use of the simple tool, since again, you are generally making fewer complex choices and tradeoffs overall, and those choices and tradeoffs are closer to the actual application that contains your business logic.
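As a rough illustration of the co-location point, the sketch below shards on a tenant key so that every record a transaction might touch lives in the same database, and an ordinary local transaction is enough. The `accounts` schema, the `TenantShards` class, and the in-memory SQLite shards are all hypothetical stand-ins, not a recommendation for any particular system.

```python
import sqlite3
import zlib


class TenantShards:
    """Hypothetical sharding layer that co-locates a tenant's accounts."""

    def __init__(self, num_shards: int = 2) -> None:
        # In-memory SQLite stands in for separate database servers.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for conn in self.shards:
            conn.execute(
                "CREATE TABLE accounts ("
                " tenant_id TEXT, account_id TEXT,"
                " balance INTEGER NOT NULL CHECK (balance >= 0),"
                " PRIMARY KEY (tenant_id, account_id))"
            )

    def _conn(self, tenant_id: str) -> sqlite3.Connection:
        # Routing on tenant_id keeps all of a tenant's rows on one shard.
        index = zlib.crc32(tenant_id.encode("utf-8")) % len(self.shards)
        return self.shards[index]

    def open_account(self, tenant_id: str, account_id: str, balance: int) -> None:
        conn = self._conn(tenant_id)
        with conn:
            conn.execute(
                "INSERT INTO accounts VALUES (?, ?, ?)",
                (tenant_id, account_id, balance),
            )

    def transfer(self, tenant_id: str, src: str, dst: str, amount: int) -> None:
        conn = self._conn(tenant_id)
        # Both rows are on the same shard, so one local transaction covers
        # them: the block commits on success and rolls back on any exception.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? "
                "WHERE tenant_id = ? AND account_id = ?",
                (amount, tenant_id, dst),
            )
            # Raises sqlite3.IntegrityError if this would overdraw src.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? "
                "WHERE tenant_id = ? AND account_id = ?",
                (amount, tenant_id, src),
            )

    def balance(self, tenant_id: str, account_id: str) -> int:
        row = self._conn(tenant_id).execute(
            "SELECT balance FROM accounts "
            "WHERE tenant_id = ? AND account_id = ?",
            (tenant_id, account_id),
        ).fetchone()
        return row[0]
```

The credit is deliberately applied before the debit here, so that a failed transfer shows the rollback undoing a partial write. Transfers between tenants would still cross shards in this sketch; it only covers the case where the business requirement fits inside the sharding key.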
Generally, you should bias towards the complex use of a simple tool, and require at least a few of the following to change your mind:
- you have already been operating the complex tool for other uses, with success, for a reasonable amount of time–long enough that you have seen and recovered from some failures / bugs of the complex system
- the complex use of the simple tool starts to require you to implement distributed transaction logic, crypto, or other similarly difficult pieces of code on your own
- the complex use of the simple tool requires an order of magnitude higher operational cost
- an increase in scale is not just certain but sharp and imminent (be careful with this one, as “imminent” does not mean “everyone in the business is so sure it’s about to happen with the release of this new feature or ad campaign”)
- you can hire one of the authors of the complex tool to come work for your company
- you have already tried to use the simple tool in a complex way to solve this problem, more than once, and you’ve retreated in defeat, as have other engineers on your team