The data cyclotron query processing scheme
Presented at the International Conference on Extending Database Technology, Lausanne, Switzerland
Distributed database systems exploit static workload characteristics to steer data fragmentation and data allocation schemes. However, the grand challenge of distributed query processing is to come up with a self-organizing architecture that exploits all resources to manage the hot data set, minimize query response time, and maximize throughput without global co-ordination. In this paper, we introduce the Data Cyclotron architecture, which addresses these challenges using turbulent data movement through a storage ring built from distributed main memory, capitalizing on modern remote-DMA facilities. Queries assigned to individual nodes interact with the Data Cyclotron by picking up data fragments that continuously flow around the ring, i.e., the hot set. Each data fragment carries a level of interest (LOI) metric, which represents the cumulative query interest as the fragment passes around the ring multiple times. A fragment with an LOI below a given threshold, inversely proportional to the ring load, is pulled out to free up resources. This threshold is dynamically adjusted in a distributed manner based on ring characteristics and query needs. It optimizes resource utilization while keeping the average data access delay low. The proposed architecture has a modest impact on existing query execution engines. This is illustrated using an extensive validated simulation study of the Data Cyclotron protocols. The results underpin their robustness in turbulent workload scenarios as well as in the TPC-H scenario. Furthermore, we think that using state-of-the-art network technology, e.g., RDMA, could lead to even more promising results. The Data Cyclotron architecture opens a new vista for modern distributed database architectures with a plethora of research challenges barely touched upon.
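The LOI mechanism described in the abstract can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the `Fragment` class, the rule of adding one unit of interest per interested query, the tuning constant `scale`, and the threshold formula `scale / ring_load`, which simply reads "inversely proportional to the ring load" literally. It is not the protocol used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """A data fragment circulating in the ring (hypothetical structure)."""
    name: str
    loi: float = 0.0  # cumulative level of interest


def loi_threshold(ring_load: float, scale: float = 1.0) -> float:
    """Threshold inversely proportional to ring load (literal reading of the abstract).

    `scale` is an assumed tuning constant; `ring_load` must be positive.
    """
    if ring_load <= 0:
        raise ValueError("ring_load must be positive")
    return scale / ring_load


def pass_node(fragment: Fragment, interested_queries: int) -> None:
    """Accumulate interest as the fragment passes one node (assumed: +1 per query)."""
    fragment.loi += interested_queries


def should_pull_out(fragment: Fragment, ring_load: float, scale: float = 1.0) -> bool:
    """Return True if the fragment's LOI is below the current threshold."""
    return fragment.loi < loi_threshold(ring_load, scale)


# Two fragments make one trip around a three-node ring.
hot = Fragment("orders")
cold = Fragment("archive")
for interest in (2, 0, 1):
    pass_node(hot, interest)
for interest in (0, 0, 0):
    pass_node(cold, interest)

ring_load = 0.5  # assumed fraction of ring capacity in use
hot_evicted = should_pull_out(hot, ring_load)
cold_evicted = should_pull_out(cold, ring_load)
```

With these assumed numbers the threshold is 2.0, so the fragment that attracted queries stays in the ring while the one that attracted none becomes a candidate for removal.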
Pereira Goncalves, R.A., & Kersten, M.L. (2010). The data cyclotron query processing scheme. In Proceedings of International Conference on Extending Database Technology 2010 (EDBT 13) (pp. 75–86). MIT Press.