There’s been a lot of back and forth lately from the NoSQL crowd around Michael Stonebraker’s contention that reliance on relational technology and MySQL has trapped Facebook in a ‘fate worse than death.’ This was reported in a GigaOm post by Derrick Harris. Harris reports in a later post that most of the reaction to Stonebraker’s contention was negative:
By and large, the responses weren’t positive. Some singled out Stonebraker as out of touch or as just trying to sell a product. Some pointed to the popularity of MySQL as evidence of its continued relevance. Many questioned how Stonebraker dare question the wisdom of Facebook’s top-of-the-line database engineers.
Harris, Jim Starkey, Paul Mikesell, and Curt Monash all take a stab at rehabilitating Stonebraker’s argument in the second post. Their argument boils down to, “Yeah, Facebook did it, but only because they have great engineers, spent a fortune, and endured a lot of pain. There are easier ways.”
Sorry fellas, time to annoy the digerati again, and so soon after bashing Social Media. I disagree with their contention, which is well expressed in the article by this Jim Starkey quote:
If a company has plans for its web application to scale and start driving a lot of traffic, Starkey said, he can’t imagine why it would build that new application using MySQL.
In fact, I would argue that starting with NoSQL because you think you might someday have enough traffic and scale to warrant it is a premature optimization, and as such, should be avoided by smaller and even medium-sized organizations. You will have plenty of time to switch to NoSQL if and when it becomes helpful. Until that time, NoSQL is an expensive distraction you don’t need.
The best example I see for why that’s the way to look at NoSQL comes from Netflix, which is mentioned towards the end of the article. I went through several expositions by Netflix engineers on their experience transitioning from an Oracle relational data center to one based on NoSQL, first in the form of Amazon’s SimpleDB and later Cassandra (the latter is still an ongoing transition as I understand it). You’re welcome to read the same sources; I’ve listed them at the bottom.
Netflix decided to move to the Cloud in late 2008 to early 2009 after an outage prompted them to consider what it would take to engineer their way to significantly higher uptime. They concluded they couldn’t build data centers fast enough, and that as soon as one was built it was swamped for capacity and out of date. They agree with Amazon’s Werner Vogels that building data centers represented “undifferentiated heavy lifting” and was therefore to be avoided, so they bet heavily on the Cloud. These are smart technologists who have been very transparent about their experiences, so it’s worth learning from them. Werner Vogels’s reaction to Stonebraker’s remarks about Facebook is an apt way to start:
Scaling data systems in real life has humbled me. I would not dare criticize an architecture that holds social graphs of 750M and works.
The gist of the argument for NoSQL being a premature optimization is straightforward and rests on 3 points:
Point 1: NoSQL technologies require more investment than Relational to get going with.
The remarks from Netflix are pretty clear on this. From the Netflix “Tech” blog:
Adopting the non-relational model in general is not easy, and Netflix has been paying a steep pioneer tax while integrating these rapidly evolving and still maturing NoSQL products. There is a learning curve and an operational overhead.
Or, as Sid Anand says, “How do you translate relational concepts, where there is an entire industry built up on an understanding of those concepts, to NoSQL?”
Companies embarking on NoSQL are dealing with less mature tools, less available talent that is familiar with the tools, and in general fewer available patterns and know-how with which to apply the new technology. This creates a greater tax on being able to adopt the technology. That sounds a lot like what we expect to see in premature optimizations to me.
Point 2: There is no particular advantage to NoSQL until you reach scales that require it. In fact it is the opposite, given Point 1.
It’s harder to use. You wind up doing more in your application layer to make up for the things Relational does that NoSQL can’t, and that you may rely on. Take consistency, for example. As Anand says in his video, “Non-relational systems are not consistent. Some, like Cassandra, will heal the data. Some will not. If yours doesn’t, you will spend a lot of time writing consistency checkers to deal with it.” This is just one of many issues involved with being productive with NoSQL.
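To make the “consistency checker” burden concrete, here is a minimal sketch of the kind of code that ends up in the application layer. It is only an illustration under stated assumptions: plain Python dicts stand in for the relational source of truth and the NoSQL replica, and the names `find_inconsistencies` and `repair` are hypothetical, not anything Netflix or any NoSQL product provides.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]
Store = Dict[str, Record]  # assumption: a dict stands in for each data store


def find_inconsistencies(source: Store, replica: Store) -> List[Tuple[str, str]]:
    """Compare two key -> record mappings and report every disagreement."""
    problems: List[Tuple[str, str]] = []
    for key in sorted(source.keys() | replica.keys()):
        if key not in replica:
            problems.append((key, "missing in replica"))
        elif key not in source:
            problems.append((key, "missing in source"))
        elif source[key] != replica[key]:
            problems.append((key, "mismatch"))
    return problems


def repair(source: Store, replica: Store) -> int:
    """Heal the replica from the source of truth; return the number of fixes."""
    fixed = 0
    for key, reason in find_inconsistencies(source, replica):
        if reason == "missing in source":
            del replica[key]  # stray record the source never had
        else:
            replica[key] = dict(source[key])  # copy the authoritative record
        fixed += 1
    return fixed
```

Even this toy version has to decide which side wins, how to treat stray records, and when to run. A real checker would also have to cope with records changing while it scans, which is work a relational database would otherwise be doing for you.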
Point 3: If you are fortunate enough to need the scaling, you will have the time to migrate to NoSQL and it isn’t that expensive or painful to do so when the time comes.
The root of premature optimization is engineers hating the thought of rewriting. Their code has to do everything just exactly right the first time or it’s crap code. But what about the idea that you don’t even understand the problem well enough to write “good” code at first? Maybe you need to see how users interact with it, what sorts of bottlenecks exist, and how the code will evolve. Perhaps your startup will have to pivot a time or two before you’ve even started building the right product. Wouldn’t it be great to be able to use more productive tools while you go through that process? Isn’t that how we think about modern programming?
Yes it is, and the only reason not to think that way is if we have reason to believe that a migration will be, to use Stonebraker’s words, “a fate worse than death.” The trouble is, it isn’t a fate worse than death. And yes, it will help to have great engineers, but by the time you get to the volumes that require NoSQL, you’ll be able to afford them, and even then, it isn’t that bad.
Netflix’s story is a great one in this respect. They went about their NoSQL migration in a clever way. They built a bi-directional replication between Oracle and SimpleDB, and then they started moving over one app at a time. They did this against a mature system rather than a new, buggy system untested by users. As a result, things went pretty quickly and pretty smoothly. That’s how engineers are supposed to work: bravo Netflix!
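The shape of that approach, two stores kept in sync while apps switch over one at a time, can be sketched in a few lines. This is a toy illustration and not Netflix’s actual mechanism: the class name `MigrationRouter` is hypothetical, dicts stand in for Oracle and SimpleDB, and replication is done synchronously inside `write` purely to keep the example short.

```python
from typing import Any, Dict, Optional, Set, Tuple


class MigrationRouter:
    """Route each app to its primary store and mirror writes to the other."""

    def __init__(self) -> None:
        self.legacy: Dict[str, Any] = {}  # assumption: stand-in for the relational store
        self.new: Dict[str, Any] = {}     # assumption: stand-in for the NoSQL store
        self.migrated_apps: Set[str] = set()

    def migrate(self, app: str) -> None:
        """Flip a single app over to the new store."""
        self.migrated_apps.add(app)

    def _stores_for(self, app: str) -> Tuple[Dict[str, Any], Dict[str, Any]]:
        # Returns (primary, secondary) for the given app.
        if app in self.migrated_apps:
            return self.new, self.legacy
        return self.legacy, self.new

    def write(self, app: str, key: str, value: Any) -> None:
        primary, secondary = self._stores_for(app)
        primary[key] = value
        secondary[key] = value  # replication in either direction, simplified

    def read(self, app: str, key: str) -> Optional[Any]:
        primary, _ = self._stores_for(app)
        return primary.get(key)
```

Because both stores see every write, an app that has not moved yet still reads data written by one that has, and vice versa. That is what lets the switch happen app by app instead of in one big bang.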
I have a note out to Adrian Cockcroft to ask how long it took, but already I have found a reference to Sid Anand doing the initial “forklifting” of a billion records from Oracle to SimpleDB in about 9 months, and they went on from there. When Sid Anand was asked what the most complex query was to convert from Oracle to NoSQL he said, “There weren’t really any.” He went on to say you wouldn’t convert your transactional data anyway, and that was pretty much it.
The world loves to see things in black and white. It sells more papers. Therefore, because some situations benefit from NoSQL for scaling, we hear a hue and cry that everyone must embrace NoSQL immediately. Poppycock. You can go a long, long way with SQL-based approaches: they’re more proven, they’re cheaper, and they’re easier. Start out there, and if the horse you’re riding is strong enough to carry you to NoSQL scaling levels, you can tackle that when the time comes. Meanwhile, avoid premature optimizations. You don’t have time for them. Let all these guys with NoSQL startups make their money elsewhere. You need to stay agile and focused on your next minimum viable deliverable.
Articles on the Netflix NoSQL Transition
(Cross-posted @ SmoothSpan Blog)