Reliability and effectiveness of clickthrough data for automatic image annotation

Multimedia Tools and Applications

Abstract

Automatic image annotation using supervised learning is performed by concept classifiers trained on labelled example images. This work proposes using clickthrough data collected from search logs as a source for automatically generating concept training data, thus avoiding expensive manual annotation. We investigate and evaluate this approach on a collection of 97,628 photographic images. The results indicate that search-log-based training data make a positive contribution despite their inherent noise; in particular, combining manual and automatically generated training data outperforms using manual data alone. It is therefore possible to use clickthrough data to perform large-scale image annotation with little manual annotation effort or, depending on performance, with only the automatically generated training data. An extensive presentation of the experimental results and the accompanying data can be accessed at http://olympus.ee.auth.gr/~diou/civr2009/.
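As an illustration of the general idea only, the following minimal sketch shows one way clicked images could be turned into positive training examples for a concept. It is not the selection method evaluated in the article: the exact term match between query and concept name, the `min_clicks` threshold, the function name `positives_from_clicks` and the toy log entries are all assumptions made for this example.

```python
from collections import defaultdict


def positives_from_clicks(click_log, concept_terms, min_clicks=1):
    """Select candidate positive images for one concept (hypothetical helper).

    click_log     : iterable of (query, image_id) pairs, one per click
    concept_terms : words naming the concept, e.g. ["fire"]
    min_clicks    : assumed noise filter; keep images clicked at least this often
    """
    terms = {t.lower() for t in concept_terms}
    counts = defaultdict(int)
    for query, image_id in click_log:
        # Assumption: a query "mentions" the concept if it shares a token with it.
        query_tokens = set(query.lower().split())
        if terms & query_tokens:
            counts[image_id] += 1
    return sorted(img for img, n in counts.items() if n >= min_clicks)


# Toy search log, invented for illustration.
click_log = [
    ("fire brussels", "img_001"),
    ("forest fire", "img_002"),
    ("football match", "img_003"),
    ("Fire brigade", "img_001"),
]

positives = positives_from_clicks(click_log, ["fire"], min_clicks=2)
print(positives)
```

Raising `min_clicks` trades the number of training examples for lower label noise. The selected images could then be used as positive examples for a concept classifier, either alone or together with manually annotated data.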

Acknowledgements

The authors are grateful to the Belga press agency for providing the images and search logs used in this work and to Marco Palomino from the University of Sunderland for the extraction of the text features used. This work was supported by the EU-funded VITALAS project (FP6-045389). Christos Diou is supported by the Greek State Scholarships Foundation (http://www.iky.gr).

Author information

Corresponding author

Correspondence to Theodora Tsikrika.

About this article

Cite this article

Tsikrika, T., Diou, C., de Vries, A.P. et al. Reliability and effectiveness of clickthrough data for automatic image annotation. Multimed Tools Appl 55, 27–52 (2011). https://doi.org/10.1007/s11042-010-0584-1

  • DOI: https://doi.org/10.1007/s11042-010-0584-1
