2010-10-01
Entity Ranking using Wikipedia as a Pivot
Publication
Presented at the ACM Conference on Information and Knowledge Management, Toronto, Ontario, Canada
In this paper we investigate the task of Entity Ranking on the Web.
Searchers looking for entities are arguably better served by presenting
a ranked list of entities directly, rather than a list of web pages
with relevant but also potentially redundant information about these
entities. Since entities are represented by their web homepages, a
naive approach to entity ranking is to use standard text retrieval.
Our experimental results clearly demonstrate that text retrieval is
effective at finding relevant pages, but performs poorly at finding
entities. Our proposal is to use Wikipedia as a pivot for finding entities
on the Web, allowing us to reduce the hard web entity ranking
problem to the easier problem of Wikipedia entity ranking. Wikipedia
allows us to properly identify entities and some of their characteristics,
and Wikipedia’s elaborate category structure allows us to get
a handle on the entity’s type.
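
The pivot idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method of the paper: the `WikiPage` class, the precomputed `text_score`, the linear mix of text score and category overlap, the `weight` parameter, the choice of the first external link as the homepage, and the toy pages and URLs.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class WikiPage:
    """Hypothetical stand-in for a Wikipedia entity page."""
    title: str
    text_score: float  # assumed to come from some text retrieval run
    categories: Set[str] = field(default_factory=set)
    external_links: List[str] = field(default_factory=list)


def category_overlap(page: WikiPage, target_categories: Set[str]) -> float:
    """Fraction of the target categories that the page carries."""
    if not target_categories:
        return 0.0
    return len(page.categories & target_categories) / len(target_categories)


def rank_entities(pages: List[WikiPage], target_categories: Set[str],
                  weight: float = 0.5) -> List[WikiPage]:
    """Rank pages by a linear mix of text score and type match (an assumption)."""
    def score(page: WikiPage) -> float:
        type_score = category_overlap(page, target_categories)
        return (1 - weight) * page.text_score + weight * type_score
    return sorted(pages, key=score, reverse=True)


def pivot_to_homepages(ranked: List[WikiPage]) -> List[Tuple[str, str]]:
    """Map ranked Wikipedia entities to web pages via their external links.

    Pages without external links are skipped; taking the first link is a
    simplification chosen for this sketch.
    """
    return [(p.title, p.external_links[0]) for p in ranked if p.external_links]


# Toy data, entirely made up.
pages = [
    WikiPage("Airline A", 0.6, {"Airlines"}, ["https://airline-a.example.org"]),
    WikiPage("Airport B", 0.7, {"Airports"}, ["https://airport-b.example.org"]),
    WikiPage("Airline C", 0.4, {"Airlines"}, []),
]

ranked = rank_entities(pages, target_categories={"Airlines"})
homepages = pivot_to_homepages(ranked)
```

In this toy run the type match lifts both airline pages above the airport page even though the airport page has the highest text score, and the entity without an external link yields no web result.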
Our main findings are the following. Our first finding is that, in
principle, the problem of web entity ranking can be reduced to Wikipedia
entity ranking. We found that the majority of entity ranking
topics can be answered using Wikipedia, and that with high precision
relevant web entities corresponding to the Wikipedia entities
can be found using Wikipedia’s “external links.” Our second finding
is that we can exploit the structure of Wikipedia to improve entity
ranking effectiveness. Entity types are valuable retrieval cues in
Wikipedia. Automatically assigned entity types are effective, and
almost as good as manually assigned types. Our third finding is that
web entity retrieval can be significantly improved by using Wikipedia
as a pivot. Both Wikipedia’s external links and the enriched
Wikipedia entities with additional links to homepages are significantly
better at finding primary web homepages than anchor text
retrieval, which in turn significantly improves over standard text retrieval.
Additional Metadata | |
---|---|
Publisher | ACM |
Conference | ACM Conference on Information and Knowledge Management |
Organisation | Human-Centered Data Analytics |
Citation | Kaptein, R., Serdyukov, P., de Vries, A., & Kamps, J. (2010). Entity Ranking using Wikipedia as a Pivot. In Proceedings of the 19th ACM international conference on Information and knowledge management (pp. 69–78). ACM. |