Yes, I know – not everyone believes database benchmarks are useful. My position is that there is value in benchmarks’ role in helping engineers wring out bottlenecks, bugs and performance impediments in their products. In a conversation I recently had with Berni Schiefer, Distinguished Engineer and manager of the DB2 UDB Performance and Advanced Technology team, he confided that “every time we run [TPC-C] we are astonished at how effectively it hammers every element of the system. We always find bugs, room for tuning. It’s the nastiest most punishing combination there is.”
Benchmarking is particularly useful as features are added, OS platforms evolve, hardware improves, memory replaces disk, partitioning schemes change…well, you get the idea. All these things can drive performance, or not, and the benchmarks are a way to measure progress. How applicable they are to any individual customer’s needs, situation-specific requirements or platform as configured is, frankly, not the point. Of course you must test. Always. On your data, your platform, your apps. But benchmarks do demonstrate continuing investment by vendors and progress in the state of the art, and over the long run, continued leadership means something.
I was reminded of the topic again when I ran into IBM’s Conor O’Mahony, author of one of my favorite blogs, at an event where I spoke on behalf of my client IBM this week. Conor wasn’t there to talk about TPC or SAP benchmarks, but both came up as a point in one of the presentations. Having read Conor’s great discussions before, I browsed over to his blog. He had a great slide, which I’m reproducing here. It reinforces one of my perennial themes: the value of continuing, focused, multi-disciplinary R&D. The story it tells is very useful.
As of August 25, when Conor posted, IBM’s DB2 had a sizable lead in “days of benchmark leadership” for several benchmarks since 2003: TPC-C for transaction processing; one of several sizes of TPC-H for analytics (10 TB, a size that would cover a majority of the data warehouses in the wild); and the SAP 3-tier benchmark, another one that has several variants.
Why was I interested in revisiting the topic? Partly because IBM broke a barrier some of us had been waiting on for a while: the 10 million tpmC wall. Conor’s commentary is amusing and not overly snarky; IBM does not typically trumpet its leadership with giant graphs in months-long advertising campaigns when it has the opportunity, so he can perhaps be excused for a little gloating.
As is often the case, important advances in several dimensions made the 10M tpmC result possible. Schiefer said, “This giant result was done essentially on pure flash storage. We had to make enhancements just to make it possible to attach that many.” And for me, that’s precisely the point. Vendors continually talk about technology breakthroughs. These benchmarks are one of the best ways to demonstrate how effectively they leverage them. So I, for one, will be watching for benchmarks with IBM’s pureScale platform. And Oracle’s Exadata. And Microsoft’s SQL Server Parallel Data Warehouse when it ships.
Will I get my wish? We’ll see. In the meantime, next time I see you, let’s hoist a couple and argue about benchmarks.
Disclosures: IBM and Oracle are clients of IT Market Strategy. Microsoft is not.