The language modeling approach to retrieval is based on the philosophy that the language in a relevant document follows the same distribution as that in the query. This same philosophy can also be applied to content-based image and video retrieval, where the only difference lies in the definition of 'language'. Previous results on the TRECVID 2003 corpus have demonstrated that the visual content can be captured successfully by a continuous Gaussian Mixture Model. This paper investigates whether modeling the visual content by a discrete multinomial model (as used in full-text retrieval) is also viable. We compare the retrieval effectiveness obtained on the TRECVID 2003 corpus when using continuous vs. discrete keyframe models.
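As an illustration of the discrete variant, the following minimal sketch ranks keyframes by the log-likelihood of a query under a multinomial model per keyframe. It is not taken from the paper: the representation of a keyframe as a list of integer "visual term" identifiers, the linear interpolation with a collection-wide background model, the weight `lam`, and all function and variable names are assumptions made for this example. It also assumes every query term occurs somewhere in the collection, so that the smoothed probability is never zero.

```python
import math
from collections import Counter


def query_log_likelihood(query_terms, doc_terms, bg_counts, bg_size, lam=0.5):
    """Log-likelihood of the query under a smoothed multinomial document model.

    Assumption for this sketch: the document model is linearly interpolated
    with a background (collection) model using the hypothetical weight `lam`.
    """
    doc_counts = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in query_terms:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_bg = bg_counts[term] / bg_size
        score += math.log(lam * p_doc + (1.0 - lam) * p_bg)
    return score


def rank_keyframes(query_terms, keyframes, lam=0.5):
    """Return (keyframe id, score) pairs, best-scoring keyframe first."""
    bg_counts = Counter()
    for terms in keyframes.values():
        bg_counts.update(terms)
    bg_size = sum(bg_counts.values())
    scored = [
        (kf_id, query_log_likelihood(query_terms, terms, bg_counts, bg_size, lam))
        for kf_id, terms in keyframes.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: each keyframe is a bag of made-up discrete visual term ids.
keyframes = {
    "shot_a": [3, 3, 7, 1],
    "shot_b": [5, 5, 2, 2],
}
query = [3, 7]
ranking = rank_keyframes(query, keyframes)
```

In this toy collection, `shot_a` ranks above `shot_b` because it contains both query terms. A continuous model would keep the same ranking principle but replace the per-term multinomial probability with a Gaussian mixture density evaluated on feature vectors.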

Piscataway, N.J.
IEEE International Conference on Image Processing
Database Architectures

de Vries, A., & Westerveld, T. (2004). A comparison of continuous vs. discrete image models for probabilistic image and video retrieval. In Proceedings of IEEE International Conference on Image Processing 2004 (ICIP 04). Piscataway, N.J.