Query optimizers that exhaustively explore a search space using transformation rules typically apply all applicable rules to each alternative and stop when no new alternatives are produced. A memoizing structure was proposed in [McKenna93] to improve the re-use of common subexpressions, which considerably improves the efficiency of the search. However, an open question remained: what is the complexity of the transformation-based enumeration process? In particular, with n the number of relations, does it achieve the O(3^n) lower bound established by [OL90]? In this paper we examine the problem of duplicates in transformation-based enumeration. In general, different sequences of transformation rules may derive the same element, and the optimizer must detect and discard these duplicates generated along multiple paths. We show that the usual commutativity/associativity rules for joins generate O(4^n) duplicate operators. We then propose a scheme, within the generic transformation-based framework, that avoids the generation of duplicates and does achieve the O(3^n) lower bound on join enumeration. Our experiments show an improvement of up to a factor of 5 in the optimization of an 8-table query when duplicates are avoided rather than detected and discarded.
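To make the duplicate problem concrete, here is a minimal Python sketch of exhaustive transformation-based enumeration over join trees (an illustration only, not the paper's algorithm or memo structure; all names are invented). It applies commutativity and associativity to closure from a left-deep starting plan, using a set of seen expressions as a stand-in memo, and counts how often a rule application re-derives an expression already known, i.e., a duplicate reached along a second rule path.

# Sketch only: illustrates duplicate generation in rule-based join
# enumeration; it does not reproduce the paper's memo-based counts.

def apply_rules(t):
    """Yield all trees reachable from t by one rule application,
    at the root or inside a subtree. Leaves are relation names (str);
    internal nodes are ('join', left, right) tuples."""
    if isinstance(t, str):
        return
    _, l, r = t
    # Commutativity: l JOIN r  ->  r JOIN l
    yield ('join', r, l)
    # Associativity: (a JOIN b) JOIN r  ->  a JOIN (b JOIN r)
    if not isinstance(l, str):
        _, a, b = l
        yield ('join', a, ('join', b, r))
    # Recurse: rewrite one child while keeping the other fixed.
    for l2 in apply_rules(l):
        yield ('join', l2, r)
    for r2 in apply_rules(r):
        yield ('join', l, r2)

def enumerate_joins(seed):
    """Closure under the rules; returns all distinct trees and the
    number of duplicate derivations that had to be discarded."""
    seen = {seed}              # stand-in for the memo of alternatives
    frontier = [seed]
    duplicates = 0
    while frontier:
        nxt = []
        for t in frontier:
            for t2 in apply_rules(t):
                if t2 in seen:
                    duplicates += 1   # re-derived via another rule path
                else:
                    seen.add(t2)
                    nxt.append(t2)
        frontier = nxt
    return seen, duplicates

if __name__ == '__main__':
    for n in range(2, 7):
        rels = [chr(ord('A') + i) for i in range(n)]
        seed = rels[0]
        for rel in rels[1:]:
            seed = ('join', seed, rel)   # left-deep starting plan
        alts, dups = enumerate_joins(seed)
        print(f"n={n}: {len(alts)} distinct trees, {dups} duplicate derivations")

Note that the paper's O(3^n) and O(4^n) counts refer to operators in a memo structure with shared subexpressions, whereas this simplified sketch works on whole trees, so the absolute numbers differ; it only shows how duplicate derivations come to outnumber the distinct alternatives by a growing factor as n increases.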

Pellenkoft, J., Galindo-Legaria, C., & Kersten, M. (1997). The Complexity of Transformation-Based Join Enumeration. In Proceedings of the 23rd International Conference on Very Large Data Bases (VLDB '97). Very Large Data Base Endowment.