Increasing single instruction multiple data (SIMD) capabilities in modern hardware allow for compiling efficient data-parallel query pipelines. As a consequence, GPU-like challenges arise: control flow divergence causes underutilization of vector-processing units. In this paper, we present efficient algorithms for the AVX-512 architecture to address this issue. These algorithms allow for fine-grained assignment of new tuples to idle SIMD lanes. Furthermore, we present strategies for their integration with compiled query pipelines without introducing inefficient memory materializations. We evaluate our approach with a high-performance geospatial join query, which shows performance improvements of up to 35%.
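
The effect of assigning new tuples to idle lanes can be illustrated with a small scalar model. The following C++ sketch is not the paper's AVX-512 code: it emulates a vector unit and only counts vector iterations. The lane count `kLanes`, the function names `steps_without_refill` and `steps_with_refill`, and the model in which each tuple needs a fixed number of iterations are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: 8 lanes, as for 64-bit elements in a 512-bit register.
constexpr std::size_t kLanes = 8;

// Baseline: a new batch of tuples is loaded only after every lane of the
// current batch has finished, so each batch costs as many vector
// iterations as its slowest tuple. work[i] is the (hypothetical) number
// of iterations tuple i needs.
std::size_t steps_without_refill(const std::vector<int>& work) {
    std::size_t steps = 0;
    for (std::size_t base = 0; base < work.size(); base += kLanes) {
        int longest = 0;
        for (std::size_t i = base; i < work.size() && i < base + kLanes; ++i) {
            assert(work[i] >= 0);
            if (work[i] > longest) longest = work[i];
        }
        steps += static_cast<std::size_t>(longest);
    }
    return steps;
}

// Refill: before every vector iteration, each idle lane (remaining == 0)
// takes the next unprocessed tuple from the input, if there is one.
std::size_t steps_with_refill(const std::vector<int>& work) {
    std::array<int, kLanes> remaining{};  // all lanes start idle
    std::size_t next = 0;                 // index of the next input tuple
    std::size_t steps = 0;
    for (;;) {
        // Assign new tuples to idle lanes; tuples with zero work are skipped.
        for (int& r : remaining) {
            while (r == 0 && next < work.size()) r = work[next++];
        }
        // One vector iteration: every active lane makes one unit of progress.
        bool any_active = false;
        for (int& r : remaining) {
            if (r > 0) {
                --r;
                any_active = true;
            }
        }
        if (!any_active) break;  // input exhausted and all lanes idle
        ++steps;
    }
    return steps;
}
```

In this model, an input in which a few tuples need many iterations and most need only one typically takes fewer vector iterations with refill, because finished lanes pick up new tuples instead of waiting for the slowest lane of their batch.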

doi.org/10.1145/3211922.3211928
International Workshop on Data Management on New Hardware
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Lang, H., Kipf, A., Passing, L., Boncz, P., Neumann, T., & Kemper, A. (2018). Make the most out of your SIMD investments: Counter control flow divergence in compiled query pipelines. In 14th International Workshop on Data Management on New Hardware, DaMoN 2018. doi:10.1145/3211922.3211928