DuckPGQ: Efficient property graph queries in an analytical RDBMS
In the past decade, property graph databases have emerged as a growing niche in data management. Many native graph systems and query languages have been created, but their functionality and performance still leave much room for improvement. The upcoming SQL:2023 standard will introduce the Property Graph Queries (SQL/PGQ) sub-language, giving relational systems the opportunity to standardize graph queries and provide mature graph query functionality. We argue that (i) competent graph data systems must build on all technology that makes up a state-of-the-art relational system, (ii) the graph use case additionally requires a many-source/destination path-finding algorithm and a compact graph representation, and (iii) the graph use case incites research in practical worst-case-optimal joins and factorized query processing techniques. We outline our design of DuckPGQ, which follows this recipe by adding efficient SQL/PGQ support to the popular open-source “embeddable analytics” relational database system DuckDB, also originally developed at CWI. Our design aims at minimizing technical debt using an approach that relies on efficient vectorized UDFs. We benchmark DuckPGQ, showing encouraging performance and scalability on large graph data sets, but also reinforcing the need for future research under (iii).
CIDR 2023: Conference on Innovative Data Systems Research
ten Wolde, D.L.J., Singh, T., Szárnyas, G., & Boncz, P.A. (2023). DuckPGQ: Efficient property graph queries in an analytical RDBMS. In Proceedings of the Conference on Innovative Data Systems Research.