Adaptive query parallelization in multi-core column stores
Presented at the International Conference on Extending Database Technology (March 2016), Bordeaux, France
With the rise of multi-core CPU platforms, utilizing them optimally for in-memory OLAP workloads on column-store databases has become one of the biggest challenges. Some of the inherent limitations on achievable query parallelism stem from the dependency of the degree of parallelism on data skew, the overheads incurred by thread coordination, and hardware resource limits. Finding the right balance between the degree of parallelism and multi-core utilization is even trickier, which makes parallel plan generation with traditional query optimizers a complex task. In this paper we introduce adaptive parallelization, which exploits execution feedback to gradually increase the level of parallelism until a sweet spot is reached. After each query execution, we replace an expensive operator (or a sequence of operators) with a faster parallel version, i.e., the query plan is morphed into a faster one. A convergence algorithm is designed to reach the optimum as quickly as possible. The approach is evaluated against a full-fledged column store using micro-benchmarks and a subset of the TPC-H and TPC-DS queries. The evaluation confirms the feasibility of the design and shows it to be competitive with a statically optimized heuristic plan generator. Adaptively parallelized plans show optimal multi-core utilization and up to a fivefold improvement compared to heuristically parallelized plans on the workload under evaluation.
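The feedback loop described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's convergence algorithm: the `Operator` class, the toy cost model in `op_time` (work divided evenly across threads plus a fixed coordination cost per extra thread), and the rule of raising the degree of parallelism by one per run are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    work: float   # simulated sequential cost, arbitrary units (assumption)
    dop: int = 1  # current degree of parallelism


def op_time(op, overhead):
    # Toy cost model (assumption): work splits evenly across threads,
    # and every additional thread adds a fixed coordination cost.
    return op.work / op.dop + overhead * (op.dop - 1)


def plan_time(plan, overhead):
    # Simulated "execution feedback": total time of one run of the plan.
    return sum(op_time(op, overhead) for op in plan)


def adapt(plan, cores, overhead=1.0):
    """Morph the plan step by step; return the run times of accepted plans."""
    history = [plan_time(plan, overhead)]
    frozen = set()  # operators whose last parallelization step did not help
    while True:
        candidates = [op for op in plan
                      if op.name not in frozen and op.dop < cores]
        if not candidates:
            return history
        # Pick the most expensive operator that can still be parallelized.
        target = max(candidates, key=lambda op: op_time(op, overhead))
        before = plan_time(plan, overhead)
        target.dop += 1
        after = plan_time(plan, overhead)
        if after < before:
            history.append(after)    # keep the faster plan
        else:
            target.dop -= 1          # revert and stop touching this operator
            frozen.add(target.name)
```

Calling `adapt([Operator("scan", 16), Operator("join", 64), Operator("agg", 4)], cores=8)` on this hypothetical three-operator plan raises each operator's degree of parallelism only while the simulated run time keeps dropping, so a cheap operator such as `agg` settles at a much lower degree than `join`. A real system would measure actual execution times rather than evaluate a formula.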
|Commit: Time Trails (P019), The SciLens-II Infrastructure, Big Data at work|
Gawade, M.M., & Kersten, M.L. (2016). Adaptive query parallelization in multi-core column stores. In Proceedings of the International Conference on Extending Database Technology. ACM.