On the symbiosis of a data mining environment and a DBMS

Kersten, Martin; Holsheimer, M.

One of the main obstacles in applying data mining techniques to large, real-world databases is the lack of efficient data management. In this paper, we outline a two-level architecture consisting of a mining tool and a database server. Key to its success is a clear separation of concerns: the mining tool organizes and controls the search process, while all data handling is performed by the parallel main-memory DBMS. Data is stored as a set of binary tables. The interaction between the two levels consists of queries for statistical information. Properties of the DBMS and the search algorithm are exploited to optimize the data handling. In particular, results of previous computations are reused, and I/O activity is reduced by keeping a small hot-set of binary tables in main memory. As test results show, this system handles large datasets with competitive performance.
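The division of labour described in the abstract can be illustrated with a minimal sketch. It is not the system from the report: the names `decompose`, `StatServer`, `oids`, and `histogram` are hypothetical, the sample data is invented, and the in-memory dictionary only stands in for the DBMS. The sketch assumes each attribute is stored as a binary table of (object id, value) pairs, that the mining tool asks only for counts, and that a refined selection reuses the cached result of the selection it extends. Parallelism and the hot-set of tables are not modelled.

```python
from collections import Counter, defaultdict


def decompose(rows):
    """Split records into one binary (oid, value) table per attribute."""
    tables = defaultdict(list)
    for oid, row in enumerate(rows):
        for attr, value in row.items():
            tables[attr].append((oid, value))
    return dict(tables)


class StatServer:
    """Toy stand-in for the database server: it answers count queries only."""

    def __init__(self, tables):
        self.tables = tables
        self.cache = {}  # tuple of (attr, value) conditions -> set of oids
        self.scans = 0   # number of binary tables scanned for selections

    def oids(self, conditions):
        """Return the object ids satisfying all conditions, reusing cached prefixes."""
        conditions = tuple(conditions)
        if conditions in self.cache:
            return self.cache[conditions]
        if not conditions:
            # No condition: every object id that occurs in any table qualifies.
            matching = {oid for table in self.tables.values() for oid, _ in table}
        else:
            attr, value = conditions[-1]
            self.scans += 1
            matching = {oid for oid, v in self.tables[attr] if v == value}
            if len(conditions) > 1:
                # Reuse the result of the selection this one refines.
                matching &= self.oids(conditions[:-1])
        self.cache[conditions] = matching
        return matching

    def histogram(self, conditions, target):
        """Count the values of the target attribute within the selection."""
        selected = self.oids(conditions)
        return dict(Counter(v for oid, v in self.tables[target] if oid in selected))


# Invented sample data, for illustration only.
rows = [
    {"age": "young", "car": "sports", "claim": "yes"},
    {"age": "young", "car": "family", "claim": "no"},
    {"age": "old", "car": "sports", "claim": "no"},
    {"age": "young", "car": "sports", "claim": "yes"},
]
server = StatServer(decompose(rows))
coarse = server.histogram([("age", "young")], "claim")
refined = server.histogram([("age", "young"), ("car", "sports")], "claim")
```

In this sketch the mining side never sees the records themselves: it receives only the histograms, and the refined query scans one additional binary table because the selection on `age` is taken from the cache.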

Additional Metadata
ACM	Data Storage Representations (ACM E.2), Systems (ACM H.2.4), Information Search and Retrieval (ACM H.3.3), Learning (ACM I.2.6)
Publisher	CWI
Series	Department of Computer Science [CS]
Organisation	Databases
Citation	Kersten, M., & Holsheimer, M. (1995). On the symbiosis of a data mining environment and a DBMS. In Department of Computer Science [CS] (R 9521). CWI.

Free Full Text (Final Version, 204kb)
