Data Blocks: hybrid OLTP and OLAP on compressed storage using both vectorization and compilation
Presented at the ACM SIGMOD International Conference on Management of Data, San Francisco
This work aims at reducing the main-memory footprint in high-performance hybrid OLTP & OLAP databases, while retaining high query performance and transactional throughput. For this purpose, an innovative compressed columnar storage format for cold data, called Data Blocks, is introduced. Data Blocks further incorporate a new light-weight index structure called Positional SMA that narrows scan ranges within Data Blocks even if the entire block cannot be ruled out. To achieve the highest OLTP performance, the compression schemes of Data Blocks are very light-weight, such that OLTP transactions can still quickly access individual tuples. This sets our storage scheme apart from those used in specialized analytical databases, where data must usually be bit-unpacked. Up to now, high-performance analytical systems have used either vectorized query execution or “just-in-time” (JIT) query compilation. The fine-grained adaptivity of Data Blocks necessitates the integration of the best features of each approach by an interpreted vectorized scan subsystem feeding into JIT-compiled query pipelines. Experimental evaluation of HyPer, our full-fledged hybrid OLTP & OLAP database system, shows that Data Blocks accelerate performance on a variety of query workloads while retaining high transaction throughput.
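The abstract names two ideas that a small example may make more concrete: byte-aligned compression that still allows direct access to a single tuple, and a positional index that narrows the scan range inside a block. The following is a minimal Python sketch under stated assumptions, not the HyPer implementation. The names `ToyBlock`, `narrowest_typecode`, `psma_slot`, and `candidate_range` are hypothetical, and the slot scheme (indexing by the most significant non-zero byte of the offset from the block minimum) is a simplified reading of the Positional SMA idea rather than the exact layout used in the system.

```python
from array import array


def narrowest_typecode(span):
    """Pick the smallest byte-aligned unsigned array type that can hold span."""
    for code in "BHIQ":
        if span < 2 ** (8 * array(code).itemsize):
            return code
    raise OverflowError("span does not fit into 64 bits")


def psma_slot(delta):
    """Map a delta to a table slot via its most significant non-zero byte."""
    remaining_bytes = (delta.bit_length() - 1) // 8 if delta else 0
    return (delta >> (8 * remaining_bytes)) + 256 * remaining_bytes


class ToyBlock:
    """Toy immutable block for one integer column (illustrative layout only)."""

    def __init__(self, values):
        # Block-level SMA: minimum and maximum of the column.
        self.min, self.max = min(values), max(values)
        # Store offsets from the minimum in the narrowest byte-aligned type,
        # so a single value can be read without bit-unpacking.
        self.codes = array(narrowest_typecode(self.max - self.min),
                           (v - self.min for v in values))
        # Positional index: slot -> [begin, end) positions holding such deltas.
        self.psma = {}
        for pos, delta in enumerate(self.codes):
            slot = psma_slot(delta)
            begin, _ = self.psma.get(slot, (pos, pos))
            self.psma[slot] = (begin, pos + 1)

    def get(self, pos):
        """Point access, as an OLTP transaction would need it."""
        return self.codes[pos] + self.min

    def candidate_range(self, lo, hi):
        """Return [begin, end) positions that may contain lo <= value <= hi."""
        if hi < self.min or lo > self.max:
            return (0, 0)  # the min/max SMA rules out the whole block
        slot_lo = psma_slot(max(lo, self.min) - self.min)
        slot_hi = psma_slot(min(hi, self.max) - self.min)
        hits = [r for s, r in self.psma.items() if slot_lo <= s <= slot_hi]
        if not hits:
            return (0, 0)
        return (min(b for b, _ in hits), max(e for _, e in hits))

    def scan(self, lo, hi):
        """Evaluate the range predicate on compressed codes in the narrowed range."""
        begin, end = self.candidate_range(lo, hi)
        d_lo, d_hi = lo - self.min, hi - self.min
        return [p for p in range(begin, end) if d_lo <= self.codes[p] <= d_hi]


block = ToyBlock([10, 12, 11, 13, 500, 510, 505, 9000, 9100])
print(block.candidate_range(500, 510))  # (4, 7): only 3 of 9 positions are scanned
print(block.scan(504, 506))             # [6]
print(block.get(7))                     # 9000
```

In this sketch the predicate is compared against the stored offsets directly, so no value is decompressed during the scan. The sketch does not cover the vectorized scan feeding JIT-compiled pipelines.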
|Snowflake Computing, San Mateo, CA, USA|
|Actian CWI Research Grant|
Lang, H., Mühlbauer, T., Funke, F., Boncz, P. A., Neumann, T., & Kemper, A. (2016). Data Blocks: hybrid OLTP and OLAP on compressed storage using both vectorization and compilation. doi:10.1145/2882903.2882925