We address the problem of deriving meaningful semantic index information for a multimedia database using a semi-structured document model. We show how our framework, called feature grammars, can be used to (1) exploit third-party interpretation modules for real-world unstructured components, and (2) use context-free grammars to convert poorly structured or unstructured input into semi-structured output. The basic idea is to enrich context-free grammars with special symbols called detectors, which supply the necessary structure just in time to satisfy the parser's look-ahead. A prototype implementation has been constructed in the Acoi project to demonstrate the feasibility of this approach for indexing both images and audio documents.
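
The detector idea can be illustrated with a minimal sketch. It is not the Acoi implementation: the grammar, the detector functions, the 256-color threshold, and all names (`GRAMMAR`, `DETECTORS`, `Parser`, `to_xml`) are hypothetical and chosen only for illustration. A small backtracking parser runs a detector the first time it reaches the corresponding symbol, so the tokens the parser needs appear only when its look-ahead requires them, and the resulting parse tree is written out as a semi-structured document.

```python
# Hypothetical grammar: each nonterminal maps to a list of alternatives.
GRAMMAR = {
    "image": [["header", "kind"]],
    "header": [["width", "height"]],
    "kind": [["photo"], ["graphic"]],
    "photo": [["many_colors"]],
    "graphic": [["few_colors"]],
}

# Hypothetical detectors: stand-ins for third-party interpretation modules.
# Each one inspects the raw object and returns (terminal, value) tokens.
def detect_header(raw):
    return [("width", raw["width"]), ("height", raw["height"])]

def detect_kind(raw):
    n = raw["n_colors"]
    # The threshold of 256 is an arbitrary assumption for this sketch.
    return [("many_colors", n)] if n > 256 else [("few_colors", n)]

DETECTORS = {"header": detect_header, "kind": detect_kind}

class Parser:
    def __init__(self, grammar, detectors, raw):
        self.grammar = grammar
        self.detectors = detectors
        self.raw = raw
        self.tokens = []    # token stream, filled lazily by detectors
        self.pos = 0        # index of the next token to match
        self.fired = set()  # detectors that have already run

    def parse(self, symbol):
        # Detector symbol: run it once, just in time, and splice its
        # tokens into the stream at the current look-ahead position.
        if symbol in self.detectors and symbol not in self.fired:
            self.fired.add(symbol)
            self.tokens[self.pos:self.pos] = self.detectors[symbol](self.raw)
        # Nonterminal: try each alternative, backtracking on failure.
        if symbol in self.grammar:
            for alternative in self.grammar[symbol]:
                saved = self.pos
                children = []
                for part in alternative:
                    child = self.parse(part)
                    if child is None:
                        break
                    children.append(child)
                else:
                    return (symbol, children)
                self.pos = saved
            return None
        # Terminal: it must match the next token in the stream.
        if self.pos < len(self.tokens) and self.tokens[self.pos][0] == symbol:
            token = self.tokens[self.pos]
            self.pos += 1
            return token
        return None

def to_xml(tree):
    """Serialize a parse tree as a semi-structured (XML-like) string."""
    symbol, body = tree
    if isinstance(body, list):
        inner = "".join(to_xml(child) for child in body)
        return f"<{symbol}>{inner}</{symbol}>"
    return f"<{symbol}>{body}</{symbol}>"

raw_image = {"width": 640, "height": 480, "n_colors": 12}
print(to_xml(Parser(GRAMMAR, DETECTORS, raw_image).parse("image")))
```

In this sketch the `kind` detector runs only after `header` has been parsed successfully, which mirrors the just-in-time behaviour described above: potentially expensive interpretation is deferred until the grammar actually asks for it.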

Information Systems [INS]
Database Architectures

Schmidt, A.R., Windhouwer, M.A., & Kersten, M.L. (1999). Indexing real-world data using semi-structured documents. Information Systems [INS]. CWI.