DMIR on Microblog Track 2011

Li, W.; de Vries, Arjen; Eickhoff, Carsten

In this paper we present our work on the Microblog Track of TREC 2011. We tried two methods for tweet retrieval, EMAX and RTB. The first method, EMAX, is based on the intuition that retrieved tweets should not only contain the keywords of the given query but also provide additional information. This results in a ranking method based on self-information. Our second method, RTB, tries to incorporate the importance of recency along with relevance in microblog retrieval tasks. To this end, we adapt portfolio theory to balance the relevance dimension and the recency dimension. However, the evaluation results suggest no significant improvement from either method, because of the short length of the documents, noisy and spam tweets, and the re-ordering by recency. We also present some ideas that arose during the course of participation. By closely examining the judgments, we find that most relevant documents contain a link to an external resource and have a length of around 17 words, which differs from the collection statistics.

Additional Metadata
THEME	Information (theme 2)
Conference	Text REtrieval Conference
Organisation	Human-Centered Data Analytics
Citation	Li, W., de Vries, A., & Eickhoff, C. (2011, November). DMIR on Microblog Track 2011. Proceedings of Text REtrieval Conference 2011 (20).
