DataCell: Building a Data Stream Engine on top of a Relational Database Kernel
Stream applications have gained significant popularity in recent years, which has led to the development of specialized data stream engines. These engines have often been designed from scratch and tuned towards the specific requirements of their initial target applications, e.g., network monitoring and financial services. However, this also means that they lack the power and sophisticated techniques of a full-fledged database system, accumulated over many years of database research. In this PhD work, we take the opposite route and design a stream engine, the DataCell, directly on top of a modern database kernel. To achieve this objective, we isolated the necessary and sufficient mechanism to support continuous query processing in a relational database environment. This led to a lightweight and orthogonal extension of SQL with a direct hook into the sophisticated algorithms and techniques of the DBMS. A streaming application can use any kind of complex query functionality without the need for us to reinvent a complete software stack, i.e., language parsers, optimizers, and storage structures. In this paper, we chart the roadmap of this thesis, the opportunities and challenges that arise with such a direction, and the significant advantages already achieved.
Keywords: data stream processing
Theme: Information (theme 2)
Project: Cracking a Scientific Database
Conference: International Conference on Very Large Databases
Liarou, E., & Kersten, M.L. (2009). DataCell: Building a Data Stream Engine on top of a Relational Database Kernel. In Proceedings of the VLDB PhD Workshop. PVLDB.