Efficient Distribution of Full-Fledged XQuery
We investigate techniques to automatically decompose any XQuery query into subqueries, that can be executed near their data sources; i.e., function-shipping. In this scenario, the subqueries being executed remotely may have XML node-valued parameters or results, that must be shipped in some way. The main challenge addressed here is to ensure that the decomposed queries properly respect XML node identity and preserve structural properties, when (parts of) XML nodes are sent over the network, effectively copying them. We start by precisely characterising the conditions, under which pass-by-value parameter passing causes semantic differences between remote execution of an XQuery expression and its local execution. We then formulate a conservative strategy that effectively avoids decomposition in such cases. To broaden the possibilities of query distribution, we extend the pass-by-value semantics to a pass-by-fragment semantics, which keeps better track of node identities and structural properties. The pass-by-fragment semantics is subsequently refined to a pass-by-projection semantics by means of a novel runtime XML projection technique, which safely eliminates most semantic differences between the local and remote execution of an XQuery expression, and strongly reduces message sizes. The proposed techniques are implemented in XRPC, a simple yet efficient XQuery extension that enables function-shipping by adding a Remote Procedure Call mechanism to XQuery. Experiments on MonetDB/XQuery establish the performance potential of our XQuery decomposition techniques.