Database cracking is a method to create partial indices as a side-effect of processing queries. Cracking efficiently smears out the cost of creating a full index over a stream of queries, creating an index that is overfitted to queried parts of the data. This core characteristic of cracking leads to unpredictable performance and unreliable convergence towards a full index. These problems are aggravated when considering updates and multidimensional queries. We envision a new indexing technique, Progressive Indexing that improves database cracking by strictly limiting perquery indexing cost to a budget (e.g., a user-defined fraction of scan costs), allowing the first and subsequent queries to complete without heavy penalties. At the same time, all indexing effort is spent towards predictable convergence towards a full index. We discuss different algorithms to deal with multidimensional queries and updates while maintaining a robust and convergent index, we then explore the research space and new challenges that arise from this new technique.

Additional Metadata
Series CEUR Workshop
Conference VLDB PhD Workshop
Citation
Holanda, P.T.T. (2018). Progressive indices: Indexing without prejudice. In Proceedings of the VLDB 2018 PhD Workshop co-located with the 44th International Conference on Very Large Databases (VLDB 2018).