Optimizing group-by and aggregation using GPU-CPU co-processing

Gomes Tomé, Diego; Gubner, Tim; Raasveldt, Mark; Rozenberg, Eyal; Boncz, Peter

D. Gomes Tomé (Diego), T.K. Gubner (Tim), M. Raasveldt (Mark), E. Rozenberg (Eyal) and P.A. Boncz (Peter)

2018-08-27

Presented at the International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures (August 2018), Rio de Janeiro, Brazil

While GPU query processing is a well-studied area, adoption in practice is limited: GPU execution is typically only significantly faster than CPU execution if the data resides in GPU memory, which restricts its use to small-data scenarios where performance tends to be less critical. Another problem is that not all query code (e.g. UDFs) can realistically run on GPUs. We therefore investigate CPU-GPU co-processing, where both the CPU and the GPU are involved in evaluating the query, in scenarios where the data does not fit in GPU memory.

As we wish to deeply explore opportunities for optimizing execution speed, we narrow our focus further to a specific well-studied OLAP scenario, amenable to such co-processing, in the form of the TPC-H benchmark Query 1.
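For readers unfamiliar with the workload, TPC-H Query 1 is a filtered group-by over the lineitem table that produces a handful of sums, averages and a count per (l_returnflag, l_linestatus) pair. The following is a minimal single-threaded Python sketch of the query's semantics only, assuming the benchmark's default 90-day parameter; the function name `q1`, the tuple layout of a row and the sample data are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict
from datetime import date, timedelta

# Q1 predicate: l_shipdate <= date '1998-12-01' - 90 days (default parameter).
CUTOFF = date(1998, 12, 1) - timedelta(days=90)


def q1(rows):
    """rows: iterable of hypothetical tuples
    (returnflag, linestatus, quantity, extendedprice, discount, tax, shipdate)."""
    # Per-group accumulators:
    # [sum_qty, sum_base_price, sum_disc_price, sum_charge, sum_discount, count]
    groups = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0, 0.0, 0])
    for flag, status, qty, price, disc, tax, ship in rows:
        if ship > CUTOFF:
            continue  # row fails the shipdate filter
        disc_price = price * (1 - disc)
        acc = groups[(flag, status)]
        acc[0] += qty
        acc[1] += price
        acc[2] += disc_price
        acc[3] += disc_price * (1 + tax)
        acc[4] += disc
        acc[5] += 1

    # Emit groups ordered by (returnflag, linestatus), as the query requires.
    result = {}
    for key in sorted(groups):
        sum_qty, sum_base, sum_disc_price, sum_charge, sum_disc, n = groups[key]
        result[key] = {
            "sum_qty": sum_qty,
            "sum_base_price": sum_base,
            "sum_disc_price": sum_disc_price,
            "sum_charge": sum_charge,
            "avg_qty": sum_qty / n,
            "avg_price": sum_base / n,
            "avg_disc": sum_disc / n,
            "count_order": n,
        }
    return result


# Tiny made-up sample, only to exercise the function.
sample = [
    ("A", "F", 10, 100.0, 0.5, 0.0, date(1995, 1, 1)),
    ("A", "F", 30, 300.0, 0.0, 0.5, date(1996, 1, 1)),
    ("N", "O", 5, 50.0, 0.0, 0.0, date(1998, 12, 1)),  # after the cutoff, dropped
    ("N", "O", 4, 40.0, 0.0, 0.0, date(1998, 9, 2)),   # exactly on the cutoff, kept
]
answer = q1(sample)
```

The query touches every qualifying row but yields only a few groups, so the per-row arithmetic and the cost of moving the input data dominate the work.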

For this query, and at large scale factors, we are able to improve performance significantly over the state of the art for GPU implementations; we present competitive performance of a GPU versus a state-of-the-art multi-core CPU baseline, a novelty for data exceeding GPU memory size; and finally, we show that co-processing does provide significant additional speedup over any of the processors individually.

We achieve this performance improvement by utilizing parallelism-friendly compression to alleviate the PCIe transfer bottleneck, query-compilation-like fusion of the processing operations, and a simple yet effective scheduling mechanism. We hope that some of these features can inspire future work on GPU-focused and heterogeneous analytic DBMSes.
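The abstract does not describe the scheduling mechanism in detail. As a generic illustration of chunk-based co-processing, and not the authors' actual scheduler, the sketch below assumes a shared queue of data chunks from which each processor pulls work, with per-chunk partial aggregates merged at the end. The names `co_process`, `worker` and `partial_sum` are hypothetical, both "devices" are simulated by ordinary Python threads, and a plain sum stands in for a fused filter-and-aggregate kernel.

```python
import queue
import threading


def partial_sum(chunk):
    # Stand-in for a fused filter+aggregate kernel applied to one chunk.
    return sum(chunk)


def worker(name, chunks, partials, processed):
    # Each processor keeps pulling the next unprocessed chunk, so a faster
    # device ends up handling more chunks without any static partitioning.
    while True:
        try:
            chunk = chunks.get_nowait()
        except queue.Empty:
            return
        partials.put(partial_sum(chunk))
        processed[name] += 1  # each thread updates only its own counter


def co_process(data, chunk_size, devices=("cpu", "gpu")):
    # Split the input into fixed-size chunks and enqueue them all up front.
    chunks = queue.Queue()
    for start in range(0, len(data), chunk_size):
        chunks.put(data[start:start + chunk_size])

    partials = queue.Queue()
    processed = {name: 0 for name in devices}
    threads = [
        threading.Thread(target=worker, args=(name, chunks, partials, processed))
        for name in devices
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge the partial aggregates produced by all processors.
    total = 0
    while not partials.empty():
        total += partials.get()
    return total, processed


data = list(range(1000))
total, processed = co_process(data, chunk_size=64)
```

How the chunks are divided between the two workers varies from run to run, but the merged total does not depend on which processor handled which chunk.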

Additional Metadata
Conference	International Workshop on Accelerating Data Management Systems Using Modern Processor and Storage Architectures
Organisation	Database Architectures
Citation	Gomes Tomé, D., Gubner, T., Raasveldt, M., Rozenberg, E., & Boncz, P. (2018). Optimizing group-by and aggregation using GPU-CPU co-processing. In Proceedings of the International Workshop on Accelerating Analytics and Data management Systems Using Modern Processor and Storage Architectures (pp. 1–10).

