This paper analyzes the performance of concurrent (index) scan operations in both record (NSM/PAX) and column (DSM) disk storage models and shows that existing scheduling policies do not fully exploit data-sharing opportunities and therefore result in poor disk bandwidth utilization. We propose the Cooperative Scans framework, which enhances performance in such scenarios by improving data sharing between concurrent scans. It performs dynamic scheduling of queries and their data requests, taking into account the current system situation. We first present results on top of an NSM/PAX storage layout, showing that the framework achieves significant performance improvements over traditional policies in terms of the number of I/Os, overall execution time, and the latency of individual queries. We provide benchmarks with varying system parameters, data sizes, and query loads to confirm that the improvement occurs in a wide range of scenarios. We then extend our proposal to the more complicated DSM scenario, discussing numerous problems related to the two-dimensional nature of disk scheduling in column stores.

International Conference on Very Large Data Bases
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Zukowski, M., Héman, S., Nes, N., & Boncz, P. (2007). Cooperative scans: Dynamic bandwidth sharing in a DBMS. In 33rd International Conference on Very Large Data Bases, VLDB 2007 - Conference Proceedings (pp. 723–734).