This paper partially explores the design space for efficient query processors on future hardware that is rich in SIMD capabilities. It departs from two well-known approaches: (1) interpreted block-at-a-time execution (a.k.a. "vectorization") and (2) "data-centric" JIT compilation, as in the HyPer system. We argue that in between these two design points in terms of granularity of execution and unit of compilation, there is a whole design space to be explored, in particular when considering exploiting SIMD. We focus on TPC-H Q1, providing implementation alternatives ( "flavors") and benchmarking these on various architectures. In doing so, we explain in detail considerations regarding operating on SQL data in compact types, and the system features that could help using as compact data as possible. We also discuss various implementations of aggregation, and propose a new strategy called "in-register aggregation" that reduces memory pressure but also allows to compute on more compact, SIMD-friendly data types. The latter is related to an in-depth discussion of detecting numeric overflows, where we make a case for numeric overflow prevention, rather than detection. Our evaluation shows positive results, confirming that there is still a lot of design headroom.

Eighth International Workshop on Accelerating Analytics and Data Management Systems Using Modern Processor and Storage Architectures, ADMS 2017
Database Architectures

Gubner, T., & Boncz, P. (2017). Exploring query execution strategies for JIT vectorization and SIMD. In Proceedings of the International Workshop on Accelerating Analytics and Data management Systems Using Modern Processor and Storage Architectures (pp. 9–17).