06472 Executive Summary - XQuery Implementation Paradigms
SQL was developed as a query language specifically tailored to the relational data model, in which all data instances take a regular, tabular shape. Despite this regularity, the efficient translation, optimization, and execution of SQL has proved to be a challenging and complex task, one that has kept the database research community and industry busy from the very first days of SQL (the late 1970s) until today. The now pervasive XML data model, on the other hand, describes irregular, tree-shaped data. XQuery has been designed as a query language in which such data trees, among other supported data types, are first-class citizens: XQuery features tree (or node) constructors and embeds XPath as an elaborate tree traversal language. As you read this, XQuery has entered the final stages of becoming an official Recommendation of the World Wide Web Consortium (W3C), and odds are that XQuery will play the role in high-volume XML data management that SQL assumed for the relational data model.
From a computing science perspective, XQuery is a language with many interesting facets. Among these are a fully orthogonal syntax; support for nested iteration and variable bindings (the XQuery FLWOR blocks) and for recursive user-defined functions; trees (and nodes with identity) as first-class citizens; construction of arbitrary XML fragments; the embedded XPath sub-language; order awareness (document and sequence order); adoption of the expressive XML Schema type system, in particular support for a variant of regular expression types; and, lastly, the ability to process well-formed as well as valid XML documents. These language characteristics pose significant challenges for fundamental research as well as for any endeavor to implement XQuery efficiently and completely. Current research prototypes and industrial implementations quickly reach their limits, especially when huge XML data volumes are to be processed.
It is remarkable that the community has found a number of promising (although, regrettably, basically disjoint) avenues to approach these challenges. A number of formal representations and associated compilation techniques for XQuery have been devised, all stressing different aspects of the very same language. The major goals of this Dagstuhl seminar are: to assemble a lively group of researchers and professionals following the so-called "native", "relational", and "streaming" implementation paradigms, willing to teach each other in a constructive fashion; to create a situation and atmosphere in which the formerly separate camps work collaboratively on the synthesis of proven XQuery compilation and evaluation techniques; to make progress on new implementation techniques, with a close eye on the "real" XQuery semantics (all too often, research tends to ignore "rough edges" for the sake of simplicity and elegance); and to reformulate and extend the important XMark benchmark, the de facto XQuery benchmark suite used today to measure and compare the efficiency and completeness of XQuery processors (XMark needs a major overhaul in the face of the rapid development of the XQuery language specification; XML updates will be considered as well). We firmly believe that both the research and the industrial communities can benefit significantly from such a bundling of XQuery implementation expertise.