Complexity of transformation-based optimizers and duplicate-free generation of alternatives
Transformation-based optimizers that explore a search space exhaustively usually apply all possible transformation rules on each alternative, and stop when no new information is produced. In general, different sequences of transformations may end up deriving the same element. The optimizer must detect and discard these duplicate elements. In this paper we consider two questions: How many duplicates are generated? And then, how can it be avoided? For the first question, our analysis shows that as queries get larger, the number of duplicates encountered is several times that of the new elements. And even for small queries, duplicates are generated more often than new elements. For the second question, we describe a technique to avoid generating duplicates, based on keeping track of (a summary of) the derivation history of each element in the search space.