Multi-relational data mining

Knobbe, Arno; Blockeel, H.; Siebes, Arno; van der Wallen, D.M.G.

An important requirement for data mining algorithms and systems is that they scale well to large databases. As a consequence, most data mining tools are based on machine learning algorithms that work on data in attribute-value format. Experience has shown that such 'single-table' mining algorithms do indeed scale well. The downside is that more complex patterns cannot be expressed in this format and thus cannot be discovered. One way to increase expressiveness is to generalize, as in inductive logic programming (ILP), from single-table mining to multi-table mining, i.e., to support mining on full relational databases. The key step in such a generalization is to ensure that the search space does not explode and that efficiency, and thus scalability, is maintained. In this paper we present a framework and an architecture that provide such a generalization. In the framework, the semantic information in the database schema, e.g., foreign keys, is exploited to prune the search space; in the architecture, database primitives are defined to ensure efficiency. Moreover, the framework induces a canonical generalization of algorithms: if the generalized algorithms are run on a single-table database, they give the same results as their single-table counterparts. The framework is illustrated by the Warmr algorithm, which is a multi-relational generalization of the Apriori algorithm.
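To make the idea of multi-table patterns concrete, the following is a minimal sketch and not the framework or the Warmr algorithm from the report. It assumes a hypothetical toy database in which `customers` is the table of individuals being counted and `orders` refers to it through a foreign key `customer_id`; the table names, column names, and the `support` function are all illustrative assumptions. A pattern is extended only along that declared foreign key, and its support is the number of customers for which at least one matching related row exists. Without a condition on the related table, the count reduces to an ordinary single-table support count.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Condition = Callable[[Row], bool]

# Hypothetical toy database (assumption, not taken from the report):
# `customers` holds the individuals being counted, and `orders`
# refers to it through the foreign key `customer_id`.
customers: List[Row] = [
    {"id": 1, "city": "Amsterdam"},
    {"id": 2, "city": "Utrecht"},
    {"id": 3, "city": "Amsterdam"},
]
orders: List[Row] = [
    {"id": 10, "customer_id": 1, "item": "book"},
    {"id": 11, "customer_id": 1, "item": "pen"},
    {"id": 12, "customer_id": 2, "item": "book"},
]


def support(target_cond: Condition,
            related_cond: Optional[Condition] = None) -> int:
    """Count customers that satisfy target_cond and, if related_cond
    is given, have at least one order (linked via the foreign key)
    that satisfies related_cond."""
    count = 0
    for customer in customers:
        if not target_cond(customer):
            continue
        if related_cond is None:
            # Single-table case: plain attribute-value support count.
            count += 1
        elif any(order["customer_id"] == customer["id"] and related_cond(order)
                 for order in orders):
            # Multi-table case: existential match along the foreign key.
            count += 1
    return count


# Single-table pattern: customers living in Amsterdam.
single_table = support(lambda c: c["city"] == "Amsterdam")

# Multi-table pattern: customers in Amsterdam who ordered a book.
multi_table = support(lambda c: c["city"] == "Amsterdam",
                      lambda o: o["item"] == "book")

print(single_table, multi_table)
```

Restricting the join to the declared foreign key is what keeps this sketch from considering arbitrary column pairings between the two tables, which is one simple way to read the pruning role the abstract assigns to schema information.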

Additional Metadata
ACM	Learning (acm I.2.6), Database Applications (acm H.2.8), Logical Design (acm H.2.1)
MSC	Learning and adaptive systems (msc 68T05), Database theory (msc 68P15)
THEME	Information (theme 2)
Publisher	CWI
Series	Information Systems [INS]
Organisation	Database Architectures
Citation	Knobbe, A., Blockeel, H., Siebes, A., & van der Wallen, D. M. G. (1999). Multi-relational data mining. In Information Systems [INS] (R 9908). CWI.

