A statistically principled approach to histogram segmentation
This paper outlines a statistically principled approach to clustering one-dimensional data. Given a dataset, the idea is to fit a density function that is as simple as possible while remaining compatible with the data. Simplicity is measured by a standard smoothness functional. Compatibility with the data is given a precise meaning through distribution-free statistics based on the empirical distribution function. The main advantages of this approach are that (i) it involves a single decision parameter with a clear statistical interpretation, and (ii) it requires no a priori assumptions about the number or shape of the clusters.
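To make the notion of data-compatibility concrete, the following minimal sketch checks whether a candidate distribution is compatible with a sample. It assumes the Kolmogorov-Smirnov distance as the distribution-free statistic and the usual asymptotic critical value as the threshold; the paper may rely on a different statistic. The function names (`ks_distance`, `is_compatible`, `normal_cdf`), the level `alpha`, and the synthetic data are illustrative assumptions. The smoothness-minimization step is not shown.

```python
import math
import numpy as np

def ks_distance(sample, cdf):
    """Sup-distance between the empirical CDF of `sample` and a candidate `cdf`."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    f = cdf(x)
    upper = np.arange(1, n + 1) / n - f  # ECDF value at each point minus candidate
    lower = f - np.arange(0, n) / n      # candidate minus ECDF value just before each point
    return float(max(upper.max(), lower.max()))

def is_compatible(sample, cdf, alpha=0.05):
    """Accept `cdf` if the KS distance is within the asymptotic level-alpha threshold.

    Assumption: alpha stands in for the single decision parameter of the abstract.
    """
    n = len(sample)
    threshold = math.sqrt(-0.5 * math.log(alpha / 2.0) / n)
    return ks_distance(sample, cdf) <= threshold

def normal_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian CDF built from math.erf, to keep the sketch free of SciPy."""
    x = np.asarray(x, dtype=float)
    z = (x - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + np.vectorize(math.erf)(z))

# Synthetic two-cluster data (illustrative only).
rng = np.random.default_rng(0)
bimodal = np.concatenate([rng.normal(-3, 0.5, 200), rng.normal(3, 0.5, 200)])

# A single Gaussian is the "simplest" candidate here, but it should be rejected.
single_fit = lambda x: normal_cdf(x, bimodal.mean(), bimodal.std())
print("single Gaussian compatible:", is_compatible(bimodal, single_fit))
```

In this sketch a unimodal candidate fails the compatibility check on clearly bimodal data, so a fitting procedure would be forced toward a density with two modes, and the modes of the accepted density would then define the clusters.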