Rule discovery: tough, not meaningless
'Model-free' rule discovery from data has recently been subject to considerable criticism, which has cast a shadow over the emerging discipline of time series data mining. Outside data mining, however, rule discovery has long been a subject of research in the statistical physics of complex phenomena. Drawing on the expertise acquired there, we suggest explanations for two mechanisms behind the apparent 'meaninglessness' of rule recovery in the reference data mining approach. The first reflects the universal property of self-affinity of signals from real-life complex phenomena. We expand on the issue of scale invariance and fractal geometry, explaining that for ideal scale-invariant (fractal) signals, rule discovery requires more than just comparing two parts of the signal. Authentic rule discovery is likely to look for the possible 'structure' pertinent to the mechanism by which the (position-wise and/or resolution-wise) invariance of the analysed time series fails. The second reflects the redundancy of the 'trivial' matches, which effectively smooths out the rule that could otherwise be discovered. Orthogonal scale space representations, together with appropriate measures for suppressing redundancy in the autocorrelation operations performed during the matches, are suggested as the methods of choice for rule discovery.
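The second mechanism, the smoothing effect of redundant 'trivial' matches, can be illustrated with a toy example. The sketch below is not the method of the report: the sawtooth signal, the window width, and the helper names `mean_window` and `spread` are all assumptions chosen for illustration. It averages windows cut from a periodic signal, once with heavily overlapping windows (step 1) and once with non-overlapping windows (step equal to the period).

```python
def mean_window(signal, width, step):
    """Average all windows of `width` samples taken every `step` samples."""
    starts = range(0, len(signal) - width + 1, step)
    windows = [signal[s:s + width] for s in starts]
    # Column-wise mean across the extracted windows.
    return [sum(col) / len(windows) for col in zip(*windows)]


def spread(values):
    """Peak-to-peak range, used here as a crude measure of remaining structure."""
    return max(values) - min(values)


# Hypothetical test signal: a sawtooth with period 8, 64 samples long.
period = 8
signal = [t % period for t in range(64)]

# Step 1: every window overlaps its neighbours (many trivial matches).
overlapping = mean_window(signal, period, 1)
# Step = period: windows are disjoint and phase-aligned.
disjoint = mean_window(signal, period, period)

print("spread with overlapping windows:", spread(overlapping))
print("spread with disjoint windows:   ", spread(disjoint))
```

In this toy setting the mean of the overlapping windows is almost flat, because every phase of the pattern contributes to every position, while the mean of the disjoint windows retains the full sawtooth. Suppressing such redundancy is one plausible reading of the abstract's recommendation, under the assumptions stated above.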
|MODELS AND PRINCIPLES (acm H.1), PATTERN RECOGNITION (acm I.5), MISCELLANEOUS (acm J.m), PHYSICAL SCIENCES AND ENGINEERING (acm J.2), DATA STORAGE REPRESENTATIONS (acm E.2)|
|Fractals (msc 28A80), Pattern recognition, speech recognition (msc 68T10), Searching and sorting (msc 68P10)|
|Information (theme 2)|
|Information Systems [INS]|
Struzik, Z.R. (2003). Rule discovery: tough, not meaningless. Information Systems [INS]. CWI.