This paper is partly an introductory survey of clustering and classification problems, with particular emphasis on classifying lists of key words and phrases from a given scientific domain such as mathematics. In addition, the paper contains a number of new concepts and results, a number of open questions, and some as yet untried embryonic clustering ideas. New are the idea of Urysohn distance (section 3), the idea of using Lipschitz distance (section 4), the universal lower bound in terms of Lipschitz distance for any fixed-depth hierarchical classification scheme (section 8), and the optimality of single-link clustering with respect to Lipschitz distance (section 8). In addition, there are new results on what I have started to call the Buneman tree of a metric space (section 9). Also new are the ideas of third-party support (section 11) and power set metrics (section 10).

Department of Analysis, Algebra and Geometry [AM]

Hazewinkel, M. (1995). Classification in mathematics, discrete metric spaces, and approximation by trees. Department of Analysis, Algebra and Geometry [AM]. CWI.