Proper physical design is critical to the performance of modern database systems and applications. A growing number of applications, e.g., social networks, scientific databases and multimedia databases, require the execution of dynamic and exploratory workloads with unpredictable characteristics that change over time. In addition, as most modern applications enter the big data era, investing time and resources in building the wrong set of indexes over large collections of data can severely affect performance. Offline, online and adaptive indexing are three distinct approaches to the problem of automating physical design choices. Offline indexing is best in static environments with stable workloads. Online indexing is best in relatively dynamic environments where the query workload can be monitored. Adaptive indexing is best in fully dynamic environments where no idle time or workload knowledge may be assumed. We observe that these three approaches are complementary, and that none of them can satisfy the needs of modern applications in isolation. We envision a new index selection approach, holistic indexing, which surpasses its predecessors by combining the best features of offline, online and adaptive indexing while overcoming their weaknesses. The main goal is a database kernel that autonomously creates partial indexes and continuously refines them during query processing, as in adaptive indexing. At the same time, the system continuously looks for opportunities to improve the physical design offline: whenever idle time occurs, it exploits knowledge gathered during query processing to refine existing indexes further or to create new ones. We sketch the research space and the new challenges such a direction brings.
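
To make the envisioned behaviour concrete, the following is a minimal sketch and not the paper's implementation. It assumes a simple crack-in-two partitioning scheme as a stand-in for adaptive indexing, since the abstract does not specify the mechanism. The class `CrackedColumn`, its methods, and the "crack the largest piece" idle-time policy are all hypothetical names and choices made for illustration.

```python
import bisect
import random


class CrackedColumn:
    """Toy partial index over a list of integers (hypothetical, for illustration)."""

    def __init__(self, values):
        self.data = list(values)  # private copy of the column, reorganized in place
        self.pivots = []          # sorted pivot values the column has been split on
        self.positions = []       # positions[i]: first index holding a value >= pivots[i]

    def _crack(self, pivot):
        """Split the piece that may contain `pivot` so values < pivot come first."""
        i = bisect.bisect_left(self.pivots, pivot)
        if i < len(self.pivots) and self.pivots[i] == pivot:
            return self.positions[i]  # already split on this value
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.pivots) else len(self.data)
        piece = self.data[lo:hi]
        smaller = [v for v in piece if v < pivot]
        larger = [v for v in piece if v >= pivot]
        self.data[lo:hi] = smaller + larger
        pos = lo + len(smaller)
        self.pivots.insert(i, pivot)
        self.positions.insert(i, pos)
        return pos

    def range_query(self, low, high):
        """Return values v with low <= v < high; refines the index as a side effect."""
        start = self._crack(low)
        end = self._crack(high)
        return self.data[start:end]

    def refine_when_idle(self, steps=1, rng=random):
        """Assumed idle-time policy: split the largest remaining piece further."""
        for _ in range(steps):
            bounds = [0] + self.positions + [len(self.data)]
            sizes = [bounds[k + 1] - bounds[k] for k in range(len(bounds) - 1)]
            k = max(range(len(sizes)), key=sizes.__getitem__)
            if sizes[k] < 2:
                return  # every piece holds at most one value; nothing to refine
            pivot = rng.choice(self.data[bounds[k]:bounds[k + 1]])
            self._crack(pivot)
```

In this sketch, each call to `range_query` pays only for partitioning the pieces it touches, while `refine_when_idle` would be invoked by a scheduler (not shown) whenever the system detects idle time. A real kernel would presumably choose which piece to refine from workload statistics gathered during query processing, rather than by piece size alone.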

SIGMOD/PhD Symposium
Database Architectures

Petraki, E. (2012). Holistic Indexing: Offline, Online and Adaptive Indexing in the Same Kernel. In Proceedings of SIGMOD/PhD Symposium 2012.