Transferring a large amount of data from a database to a client program is a surprisingly expensive operation. The time this requires can easily dominate the query execution time for large result sets. This represents a significant hurdle for external data analysis, for example when using statistical software. In this paper, we explore and analyse the result set serialization design space. We present experimental results from a large chunk of the database market and show the inefficiencies of current approaches. We then propose a columnar serialization method that improves transmission performance by an order of magnitude.

doi.org/10.14778/3115404.3115408

Raasveldt, M., & Mühleisen, H. (2017). Don't hold my data hostage - A case for client protocol redesign. In Proceedings of the VLDB Endowment (pp. 1022–1033). doi:10.14778/3115404.3115408