Understanding, Modeling, and Improving Main-Memory Database Performance
Over the last two decades, computer hardware has developed remarkably. CPU clock speed in particular has followed Moore's Law, i.e., doubling every 18 months, and there is no indication that this trend will change in the foreseeable future. Recent research has revealed that database performance, even in main-memory based systems, hardly benefits from this ever-increasing CPU power. The reason is that the performance of other hardware components has not increased at the same rate as CPU speed. While memory bandwidth has been growing steadily, memory latency has hardly changed. Thus, random memory access has become a major bottleneck for database query processing. This thesis analyzes the impact of modern hardware on main-memory database performance and develops new techniques to better exploit the available hardware resources. Using simple benchmarks, we show that, unless special care is taken, database algorithms can spend up to 90% of their time waiting for memory. Exhaustive experiments reveal that memory access is a major bottleneck for database performance on almost any hardware platform, ranging from small off-the-shelf PCs to large high-performance servers. The insight gained allows us to design detailed cost models that predict the performance of database algorithms by estimating the number of performance-relevant events, such as cache misses and TLB misses, and weighting them by their respective cost, i.e., their latency. Focusing on joins, we develop new cache-conscious algorithms. The main idea is to restrict random data access to data that fits into the (smallest) CPU cache. Our cost models allow us to automatically tune our algorithms to achieve optimal performance on various hardware platforms. Further analysis shows that even with minimized memory access costs, database algorithms cannot exploit the full potential of modern super-scalar CPUs. We discuss various implementation techniques to improve their efficiency.
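The cost-model idea described in the abstract (count the performance-relevant events, then weight each by its latency) can be illustrated with a minimal sketch. This is not the thesis's actual model: the names `MissKind`, `EXAMPLE_PLATFORM`, `memory_cost_ns`, and `sequential_scan_misses` are hypothetical, the latencies are placeholder values rather than measurements, and the miss estimate covers only the simple case of a sequential scan that touches each cache line (or page) once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MissKind:
    """One kind of performance-relevant event and its assumed latency."""
    name: str
    latency_ns: float  # illustrative placeholder, not a measured value


# Hypothetical platform description; all numbers are assumptions.
EXAMPLE_PLATFORM = (
    MissKind("L1_miss", 5.0),
    MissKind("L2_miss", 100.0),
    MissKind("TLB_miss", 50.0),
)


def memory_cost_ns(miss_counts, platform=EXAMPLE_PLATFORM):
    """Sum over all event kinds of (estimated count * latency)."""
    return sum(
        miss_counts.get(kind.name, 0) * kind.latency_ns for kind in platform
    )


def sequential_scan_misses(data_bytes, unit_bytes):
    """Assume a sequential scan misses once per cache line or page
    of size unit_bytes (ceiling division)."""
    return -(-data_bytes // unit_bytes)


if __name__ == "__main__":
    # Scan of 1 MB with assumed 32-byte L1 lines, 128-byte L2 lines, 4 KB pages.
    counts = {
        "L1_miss": sequential_scan_misses(1_000_000, 32),
        "L2_miss": sequential_scan_misses(1_000_000, 128),
        "TLB_miss": sequential_scan_misses(1_000_000, 4096),
    }
    print(f"estimated memory cost: {memory_cost_ns(counts):.0f} ns")
```

Keeping the platform description separate from the miss estimates mirrors the tuning idea in the abstract: the same algorithm-level estimates can be re-scored for a different machine by swapping in that machine's latencies.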
|ACM||Systems (acm H.2.4)|
|THEME||Information (theme 2)|
|Promotor||M.L. Kersten (Martin)|
|Degree Grantor||Universiteit van Amsterdam|
|Series||SIKS Dissertation Series ; 2002-17|
Manegold, S. (2002, December 17). Understanding, Modeling, and Improving Main-Memory Database Performance (No. 2002-17). SIKS Dissertation Series.