In this short paper we outline the Data Vault, a database-attached external file repository. It provides a true symbiosis between a DBMS and existing file-based repositories. Data is kept in its original format, while scalable processing functionality is provided through the DBMS facilities. In particular, it provides transparent access to all data kept in the repository through an (array-based) query language, using the file-type-specific scientific libraries. The design space for data vaults is characterized by requirements from various fields. We present a reference architecture for their realization in (commercial) DBMSs and a concrete implementation in MonetDB for remote sensing data, geared toward content-based image retrieval.
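The idea summarized in the abstract, files staying in their original format and being read on demand through a format-specific library, can be illustrated with a small sketch. This is not the MonetDB implementation or its API: the `DataVault` class, its methods, and the CSV/JSON readers are hypothetical stand-ins chosen so the example runs with the Python standard library only, and the list-based `select` stands in for a real (array-based) query language.

```python
import csv
import json
import os
import tempfile


class DataVault:
    """Toy vault (hypothetical): files are cataloged up front, loaded on first query."""

    def __init__(self):
        self.readers = {}  # file extension -> format-specific reader function
        self.catalog = {}  # name -> path of an attached, not yet loaded file
        self.cache = {}    # name -> rows loaded on first access
        self.loads = 0     # number of files actually read so far

    def register_reader(self, extension, reader):
        self.readers[extension] = reader

    def attach(self, name, path):
        # Only record where the file lives; the file itself stays untouched.
        self.catalog[name] = path

    def rows(self, name):
        # Load lazily through the reader matching the file type, then cache.
        if name not in self.cache:
            path = self.catalog[name]
            extension = os.path.splitext(path)[1]
            self.cache[name] = self.readers[extension](path)
            self.loads += 1
        return self.cache[name]

    def select(self, name, predicate):
        return [row for row in self.rows(name) if predicate(row)]


def read_csv(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def read_json(path):
    with open(path) as f:
        return json.load(f)


def demo():
    vault = DataVault()
    vault.register_reader(".csv", read_csv)
    vault.register_reader(".json", read_json)
    with tempfile.TemporaryDirectory() as tmp:
        csv_path = os.path.join(tmp, "tiles.csv")
        with open(csv_path, "w", newline="") as f:
            f.write("tile,cloud_cover\nA,10\nB,80\n")
        json_path = os.path.join(tmp, "extra.json")
        with open(json_path, "w") as f:
            json.dump([{"tile": "C", "cloud_cover": "5"}], f)

        vault.attach("tiles", csv_path)
        vault.attach("extra", json_path)
        loads_before_query = vault.loads  # attaching reads nothing

        clear = vault.select("tiles", lambda r: int(r["cloud_cover"]) < 50)
        vault.select("tiles", lambda r: r["tile"] == "B")  # served from cache
        return loads_before_query, vault.loads, clear
```

In this sketch, attaching a file costs nothing beyond a catalog entry, and the file that is never queried (`extra`) is never read. That is the trade-off the abstract points at: the repository remains the source of truth, and loading cost is paid only for data a query actually touches.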

Data Management, Integration and Knowledge Discovery, for Earth Observation Applications, The SciLens Infrastructure for Data Intensive Research
International Conference on Scientific and Statistical Database Management
Database Architectures

Ivanova, M., Kersten, M., & Manegold, S. (2012). Data Vaults: A Symbiosis between Database Technology and Scientific File Repositories. In Proceedings of International Conference on Scientific and Statistical Database Management 2012.