Querying XML Documents Made Easy: Nearest Concept Queries

Schmidt, A.R.; Kersten, Martin; Windhouwer, Menzo

A.R. Schmidt, M.L. Kersten (Martin) and M.A. Windhouwer (Menzo)

2001

Presented at the IEEE International Conference on Data Engineering, Heidelberg, Germany

Due to the ubiquity and popularity of XML, users are often in the following situation: they want to query XML documents that contain potentially interesting information, but they are unaware of the mark-up structure that is used. For example, it is easy to guess the contents of an XML bibliography file, whereas the mark-up depends on the methodological, cultural and personal background of the author(s). Nonetheless, it is this hierarchical structure that forms the basis of XML query languages. In this paper we exploit the tree structure of XML documents to equip users with a powerful tool, the meet operator, that lets them query databases with whose content they are familiar, but without requiring knowledge of tags and hierarchies. Our approach is based on computing the lowest common ancestor of nodes in the XML syntax tree: e.g., given two strings, we are looking for nodes whose offspring contain these two strings. The novelty of this approach is that the result type is unknown at query formulation time and dependent on the database instance. If the two strings are an author's name and a year, mainly publications of the author in this year are returned. If the two strings are numbers, the result mostly consists of publications that have the numbers as year or page numbers. Because the result type of a query is not specified by the user, we refer to the lowest common ancestor as the nearest concept. We also present a running example taken from the bibliography domain, and demonstrate that the operator can be implemented efficiently.
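
To make the lowest-common-ancestor idea concrete, the following is a minimal Python sketch, not the authors' implementation of the meet operator. It assumes a naive strategy: for each node whose text contains the first string, it picks the deepest ancestor shared with any node containing the second string. The function names (`ancestors`, `nearest_concepts`) and the sample bibliography are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def ancestors(node, parent):
    """Return the path from node up to the root, node first."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def nearest_concepts(root, term_a, term_b):
    """Naive sketch: deepest common ancestors of nodes matching two strings."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}

    def hits(term):
        return [n for n in root.iter() if n.text and term in n.text]

    hits_b = hits(term_b)
    results = []
    for a in hits(term_a):
        path_a = ancestors(a, parent)
        best = None  # index into path_a; a smaller index is a deeper node
        for b in hits_b:
            on_b = set(ancestors(b, parent))
            # The first node on a's upward path that is also above b is their LCA.
            i = next(k for k, n in enumerate(path_a) if n in on_b)
            if best is None or i < best:
                best = i
        if best is not None and path_a[best] not in results:
            results.append(path_a[best])
    return results


# Hypothetical sample data; the tag names are deliberately inconsistent.
DOC = """
<bib>
  <article>
    <author>Schmidt</author>
    <title>Nearest Concept Queries</title>
    <year>2001</year>
  </article>
  <book>
    <editor>Schmidt</editor>
    <published>2001</published>
  </book>
  <article>
    <author>Kersten</author>
    <year>1999</year>
  </article>
</bib>
"""

root = ET.fromstring(DOC)
tags = [n.tag for n in nearest_concepts(root, "Schmidt", "2001")]
print(tags)  # ['article', 'book']
```

The query names no tags at all, yet it returns both an `article` and a `book` element, which illustrates why the result type depends on the database instance rather than on the query. The sketch compares every pair of matches and is therefore quadratic; it says nothing about how the operator is implemented efficiently in the paper.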

Additional Metadata
Theme	Information (theme 2)
Publisher	IEEE
Conference	IEEE International Conference on Data Engineering
Organisation	Database Architectures
Citation	Schmidt, A. R., Kersten, M., & Windhouwer, M. (2001). Querying XML Documents Made Easy: Nearest Concept Queries. In Proceedings of IEEE International Conference on Data Engineering 2001 (pp. 321–329). IEEE.
