Rule discovery: tough, not meaningless

Struzik, Z.R.

'Model free' rule discovery from data has recently been subject to considerable criticism, which has cast a shadow over the emerging discipline of time series data mining. Outside data mining, however, rule discovery has long been a subject of research in the statistical physics of complex phenomena. Drawing on the expertise acquired there, we suggest explanations for the two mechanisms behind the apparent 'meaninglessness' of rule recovery in the reference data mining approach. One reflects the universal property of self-affinity of signals from real-life complex phenomena. It further expands on the issue of scaling invariance and fractal geometry, explaining that for ideal scale-invariant (fractal) signals, rule discovery requires more than just comparing two parts of the signal. Authentic rule discovery is likely to look for the possible 'structure' pertinent to the failure mechanism of the (position and/or resolution-wise) invariance of the time series analysed. The other reflects the redundancy of the 'trivial' matches, which effectively smooths out the rule that could potentially be discovered. Orthogonal scale space representations and appropriate redundancy suppression measures over autocorrelation operations performed during the matches are suggested as the methods of choice for rule discovery.
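
The second mechanism, the redundancy of 'trivial' matches, can be illustrated with a toy example. The sketch below is not taken from the report: the sawtooth signal, the helper `window_mean`, and the choice of window width are all assumptions made for illustration. It averages sliding windows of a periodic signal twice, once with heavily overlapping windows (stride 1, so neighbouring windows are trivial matches of each other) and once with disjoint windows.

```python
import numpy as np

def window_mean(signal, width, stride):
    """Average all windows of `width` samples, taken every `stride` samples."""
    starts = range(0, len(signal) - width + 1, stride)
    return np.mean([signal[s:s + width] for s in starts], axis=0)

# Hypothetical test signal: a zero-mean sawtooth "rule" of period 16, repeated 64 times.
period = 16
pattern = np.linspace(-1.0, 1.0, period)
signal = np.tile(pattern, 64)

# Stride 1: every window overlaps its neighbours (trivial matches included).
overlapping = window_mean(signal, period, stride=1)
# Stride equal to the period: one window per repetition, no overlap.
disjoint = window_mean(signal, period, stride=period)

# Peak-to-peak spread of each averaged window shape.
spread_overlapping = overlapping.max() - overlapping.min()
spread_disjoint = disjoint.max() - disjoint.min()

print(f"overlapping windows: spread = {spread_overlapping:.4f}")
print(f"disjoint windows:    spread = {spread_disjoint:.4f}")
```

Under these assumptions, the average over overlapping windows is almost flat, while the average over disjoint windows retains the sawtooth shape. This is only a simplified analogue of the smoothing effect described in the abstract, not a reproduction of the reference data mining approach.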

Additional Metadata
ACM	MODELS AND PRINCIPLES (acm H.1), PATTERN RECOGNITION (acm I.5), MISCELLANEOUS (acm J.m), PHYSICAL SCIENCES AND ENGINEERING (acm J.2), DATA STORAGE REPRESENTATIONS (acm E.2)
MSC	Fractals (msc 28A80), Pattern recognition, speech recognition (msc 68T10), Searching and sorting (msc 68P10)
THEME	Information (theme 2)
Publisher	CWI
Series	Information Systems [INS]
Organisation	Database Architectures
Citation	Struzik, Z. R. (2003). Rule discovery: tough, not meaningless. In Information Systems [INS] (R 0304). CWI.
