These rows are made for sorting and that's just what we'll do

Kuiper, Laurens; Mühleisen, Hannes

doi:10.1109/ICDE55515.2023.00159

L.N. Kuiper (Laurens) and H.F. Mühleisen (Hannes)

2023-07-26

These rows are made for sorting and that's just what we'll do

Presented at the 2023 IEEE 39th International Conference on Data Engineering (ICDE) (April 2023), Anaheim, CA, USA

Sorting is one of the most well-studied problems in computer science and a vital operation for relational database systems. Despite this, little research has been published on implementing an efficient relational sorting operator. In this work, we explore the design space of sorting in a relational database system. We use micro-benchmarks to explore how to sort relational data efficiently in analytical database systems, taking into account different query execution engines as well as row and columnar data formats. We show that, regardless of architectural differences between query engines, sorting rows is almost always more efficient than sorting columnar data, even if this requires converting the data from columns to rows and back. Sorting rows efficiently is challenging for systems with an interpreted execution engine, as their implementation has to stay generic. We show that these challenges can be overcome with several existing techniques. Based on our findings, we implement a highly optimized row-based sorting approach in the DuckDB open-source in-process analytical database management system, which has a vectorized interpreted query engine. We compare DuckDB with four analytical database systems and find that DuckDB's sort implementation outperforms query engines that sort using a columnar data format.

Additional Metadata
Keywords	Database query processing, Relational databases, Sorting
Persistent URL	doi.org/10.1109/ICDE55515.2023.00159
Conference	2023 IEEE 39th International Conference on Data Engineering (ICDE)
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation APA Style AAA Style APA Style Cell Style Chicago Style Harvard Style IEEE Style MLA Style Nature Style Vancouver Style American-Institute-of-Physics Style Council-of-Science-Editors Style BibTex Format Endnote Format RIS Format CSL Format DOIs only Format	Kuiper, L., & Mühleisen, H. (2023). These rows are made for sorting and that's just what we'll do. In Proceedings of the IEEE International Conference on Data Engineering (ICDE) (pp. 2050–2062). doi:10.1109/ICDE55515.2023.00159

Full Text ( Author Manuscript , 644kb )

These rows are made for sorting and that's just what we'll do

Publication

Publication

Address

CWI researchers

Questions or comments?

These rows are made for sorting and that's just what we'll do

Publication

Publication

Workflow

Workflow

Add Content