Vectorization vs. compilation in query execution
Compiling database queries into executable (sub-)programs provides substantial benefits compared to traditional interpreted execution. Many of these benefits, such as reduced interpretation overhead, better instruction code locality, and opportunities to use SIMD instructions, have previously been provided by redesigning query processors to use a vectorized execution model. In this paper, we try to shed light on the question of how state-of-the-art compilation strategies relate to vectorized execution for analytical database workloads on modern CPUs. For this purpose, we carefully investigate the behavior of vectorized and compiled strategies inside the Ingres VectorWise database system in three use cases: Project, Select and Hash Join. One of the findings is that compilation should always be combined with block-wise query execution. Another contribution is identifying three cases where "loop-compilation" strategies are inferior to vectorized execution. As such, a careful merging of these two strategies is proposed for optimal performance: either by incorporating vectorized execution principles into compiled query plans or by using query compilation to create building blocks for vectorized processing.
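To make the two strategies named in the abstract concrete, the following is a minimal sketch, not code from the paper or from VectorWise. It assumes a hypothetical query, SELECT a * b FROM t WHERE a < threshold, and all function names, column names and the vector size are made up for illustration. Python is used only to show the control structure; it says nothing about the performance effects the paper measures.

```python
# Illustrative sketch only: hypothetical query
#   SELECT a * b FROM t WHERE a < threshold
# All names below are assumptions, not taken from the paper.

VECTOR_SIZE = 4  # tiny on purpose; real systems use far larger vectors


def select_less_than(col, threshold):
    """Vectorized primitive: return positions in the vector that qualify."""
    return [i for i in range(len(col)) if col[i] < threshold]


def map_multiply(col_a, col_b, sel):
    """Vectorized primitive: multiply two vectors at the selected positions."""
    return [col_a[i] * col_b[i] for i in sel]


def run_vectorized(a, b, threshold):
    """Vectorized style: one primitive call per operator per vector.

    Each primitive loops over a whole vector and materializes an
    intermediate result (here, the selection vector).
    """
    out = []
    for start in range(0, len(a), VECTOR_SIZE):
        va = a[start:start + VECTOR_SIZE]
        vb = b[start:start + VECTOR_SIZE]
        sel = select_less_than(va, threshold)
        out.extend(map_multiply(va, vb, sel))
    return out


def run_loop_compiled(a, b, threshold):
    """Loop-compiled style: the kind of single fused loop a query compiler
    might generate, with Select and Project merged and no intermediates."""
    out = []
    for x, y in zip(a, b):
        if x < threshold:
            out.append(x * y)
    return out
```

Both functions compute the same result; they differ in whether the work is split into per-operator loops over vectors or fused into one tuple-at-a-time loop. The merged approaches mentioned in the abstract would sit between these two shapes.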
|Theme|Information (theme 2)|
|Conference|DAMON Workshop (colocated with ACM SIGMOD)|
Sompolski, J., Zukowski, M., & Boncz, P. A. (2011). Vectorization vs. compilation in query execution. In Proceedings of DAMON Workshop (colocated with ACM SIGMOD) 2011 (pp. 33–40).