Exploiting the Power of Relational Databases for Efficient Stream Processing

Liarou, Erietta; Pereira Goncalves, Romulo Antonio; Idreos, Stratos


2009


Presented at the International Conference on Extending Database Technology, Saint Petersburg, Russia

Stream applications have gained significant popularity in recent years, which has led to the development of specialized stream engines. These systems are designed from scratch with a different philosophy than today's database engines in order to cope with the requirements of stream applications. However, this means that they lack the power and sophisticated techniques of a full-fledged database system, which exploits techniques and algorithms accumulated over many years of database research. In this paper, we take the opposite route and design a stream engine directly on top of a database kernel. Incoming tuples are stored directly upon arrival in a new kind of system table, called a basket. A continuous query can then be evaluated over its relevant baskets as a typical one-time query, exploiting the power of the relational engine. Once a tuple has been seen by all relevant queries/operators, it is dropped from its basket. A basket can be the input to a single query plan or to multiple similar query plans. Furthermore, a query plan can be split into multiple parts, each with its own input/output baskets, allowing for flexible, load-sharing query scheduling. Unlike traditional stream engines, which process one tuple at a time, this model allows batch processing of tuples, e.g., querying a basket only after $x$ tuples have arrived or after a time threshold has passed. Furthermore, we are not restricted to processing tuples in the order in which they arrive. Instead, we can selectively pick tuples from a basket based on the query requirements, exploiting a novel query component, the basket expression. We investigate the opportunities and challenges that arise with such a direction and show that it carries significant advantages. We propose a complete architecture, the DataCell, which we implemented on top of an open-source column-oriented DBMS. A detailed analysis and experimental evaluation of the core algorithms, using both micro-benchmarks and the standard Linear Road benchmark, demonstrate the potential of this new approach.
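The basket model described in the abstract can be illustrated with a small in-memory sketch. This is not the DataCell implementation: the `Basket` class, its `append` and `consume` methods, the reader names, and the `speed` attribute are all hypothetical, and a plain Python list stands in for a system table inside the database kernel. The sketch assumes that a query fires only once at least `batch_size` tuples are pending for it, that a predicate plays the role of a basket expression, and that a tuple counts as seen by a query as soon as that query has examined it.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

Row = Dict[str, Any]


@dataclass
class Basket:
    """Hypothetical in-memory stand-in for a basket (not the DataCell API)."""
    readers: Set[str]                                  # queries registered on this basket
    rows: List[Row] = field(default_factory=list)      # buffered stream tuples
    seen: List[Set[str]] = field(default_factory=list) # per tuple: readers that examined it

    def append(self, row: Row) -> None:
        """Store an incoming tuple on arrival."""
        self.rows.append(row)
        self.seen.append(set())

    def consume(self, reader: str,
                predicate: Callable[[Row], bool] = lambda row: True,
                batch_size: int = 1) -> List[Row]:
        """Evaluate one query over the basket as a batch.

        Returns [] until at least `batch_size` tuples are pending for `reader`.
        Otherwise marks every pending tuple as seen by `reader` and returns
        the ones that satisfy `predicate` (the assumed basket expression).
        """
        pending = [i for i, s in enumerate(self.seen) if reader not in s]
        if len(pending) < batch_size:
            return []
        batch = []
        for i in pending:
            self.seen[i].add(reader)
            if predicate(self.rows[i]):
                batch.append(self.rows[i])
        # Drop tuples that every registered reader has now seen.
        keep = [i for i, s in enumerate(self.seen) if not self.readers <= s]
        self.rows = [self.rows[i] for i in keep]
        self.seen = [self.seen[i] for i in keep]
        return batch


# Usage: two continuous queries share one basket.
basket = Basket(readers={"q1", "q2"})
for speed in (5, 12, 20):
    basket.append({"speed": speed})

too_early = basket.consume("q1", batch_size=4)                    # threshold not reached
fast = basket.consume("q1", lambda r: r["speed"] > 10, batch_size=3)
everything = basket.consume("q2")                                 # last reader empties the basket
```

In this sketch the tuples stay in the basket after `q1` runs, because `q2` has not yet seen them; they are dropped only once `q2` has consumed them as well.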

Additional Metadata
Keywords	Stream Processing
THEME	Information (theme 2)
Publisher	ACM
Project	Databases for personalised ubiquitous intelligent devices, Cracking a Scientific Database
Conference	International Conference on Extending Database Technology
Organisation	Database Architectures
Citation	Liarou, E., Pereira Goncalves, R. A., & Idreos, S. (2009). Exploiting the Power of Relational Databases for Efficient Stream Processing. In Proceedings of the 11th International Conference on Extending Database Technology (pp. 323–334). ACM.

