2007
Efficient and Flexible Information Retrieval Using MonetDB/X100
Publication
Today's large-scale IR systems are not implemented using general-purpose database systems, as the latter tend to be significantly less efficient than custom-built IR engines. This paper demonstrates how recent developments in hardware-conscious database architecture may however satisfy IR needs. The advantage is flexibility of experimentation, as implementing a retrieval system on top of a DBMS boils down to relational query formulation, rather than system programming. We demonstrate in the context of the TeraByte TREC efficiency task that our experimental MonetDB/X100 database system provides highly competitive results both regarding precision and speed. We analyze the two innovations in MonetDB/X100 that most contributed to this successful application of DB technology in IR, namely vectorized in-cache processing and the use of two new light-weight compression schemes that work between the RAM and CPU cache memory levels.
Additional Metadata | |
---|---|
Editors | G. Weikum, J. Hellerstein, M. Stonebraker |
Ambient Multimedia Databases | |
Conference | Biennial Conference on Innovative Data Systems Research |
Organisation | Database Architectures |
Héman, S., Zukowski, M., de Vries, A., & Boncz, P. (2007). Efficient and Flexible Information Retrieval Using MonetDB/X100. In G. Weikum, J. Hellerstein, & M. Stonebraker (Eds.), Proceedings of the Biennial Conference on Innovative Data Systems Research (CIDR) (pp. 96–101).