The MemSQL web page is not very explicit about it, but after some testing we tend to believe that MemSQL aims at OLTP workloads, so our tests might be unfavorable for MemSQL, as we tried out TPC-H SF1, an OLAP workload. We used the Comunity Edition 4.0.28 for that, and ran MemSQL out of the box. Which worked very pleasantly by the way, installation was completely hassle free.
What was surprising was that while some queries were fast, some queries were very slow. Below are the runtimes for the first 5 TPC-H queries, running on a six core i7-3930K, compared with HyPer v0.5-332.
Query | MemSQL | HyPer |
---|---|---|
Q1 | 0.64s | 0.013s |
Q2 | 45.92s | 0.001s |
Q3 | 326.94s | 0.013s |
Q4 | 0.56s | 0.007s |
Q5 | 120.50s | 0.008s |
After investigating the plans with explain we saw that apparently MemSQL always uses either index-nested-loop-joins (INL) or nested-loop-joins (NL), which is very expensive in large, TPC-H-style join queries. The INL is ok, although still somewhat expensive, as seen in Q4, where only INL is used, but if the system is forced to use NL, performance is very poor.
Therefore our assumption that the system aims at OLTP settings, where large joins do not occur. Unfortunately there is no easy way to run TPC-C on MemSQL (we would have to implement a benchmark driver first), therefore we stoped here with just some first impressions. Perhaps we can get a student to implement an OTLP workload for both systems, we will publish an update then.
An interesting side aspect is the compilation model. HyPer generates machine code using the LLVM compiler backend, while MemSQL generates C++ code and passes it through GCC. That is particular interesting because we can look at the generated C++ code and see what the system is doing (and the generated code is quite readable). A downside of generating C++ code is compile time: While HyPer uses in the order of 100ms to compile a TPC-H query, MemSQL uses in the order of 10s, a factor 100 slower. They mitigate that using a plan cache, but interactive query sessions can become quite painful.
But for the OLTP side the system looks quite interesting, they use fancy data structures like lock-free skip lists, and it will be interesting to see how they perform there.
Update: Robbie Walzer from MemSQL suggested to add foreign key indexes, which eliminates the expensive nested loop joins from the queries. I have included the numbers below, and I have also added results from PostgreSQL 9.4. Note that PostgreSQL is disk-based and single threaded, so the comparison is unfair, but it represents "classical" systems.
Query | MemSQL | with FK | PostgreSQL | HyPer |
---|---|---|---|---|
Q1 | 0.64s | 0.64s | 10.90s | 0.013s |
Q2 | 45.92s | 0.11s | 0.38s | 0.001s |
Q3 | 326.94s | 0.83s | 1.96s | 0.013s |
Q4 | 0.56s | 0.57s | 0.48s | 0.007s |
Q5 | 120.50s | 0.79s | 0.63s | 0.008s |
And indeed the foreign key indexes greatly improve performance, nested loop joins are just too expensive. Note, however, that the other two systems do fine without these extra indexes, so investing some effort in a hash join implementation might be worthwhile for MemSQL.
Robbie also mentioned cluster support, which we did not test at all here. I assume MemSQL scales nicely with the number of nodes in a cluster, we will test that at some later point. On the other hand even with perfect scalability you need a decent number of nodes until the single-node performance of HyPer is reached in TPC-H. Things might be very different in OLTP settings like TPC-C, of course, there the index structures of MemSQL can probably shine.