Combining distributional semantics and structured data to study lexical change
Statistical Natural Language Processing (NLP) techniques make it possible to quantify lexical semantic change using large text corpora. However, the word-level results of these methods can be hard to analyse in the context of sets of semantically or linguistically related words. Structured knowledge sources, on the other hand, represent semantic relationships explicitly but ignore the problem of semantic change. We aim to address these limitations by combining the statistical and symbolic approaches: we enrich WordNet, a structured lexical database, with quantitative lexical change scores provided by HistWords, a dataset produced by distributional NLP methods. We publish the result as Linked Open Data and demonstrate how queries on the combined dataset can provide new insights.
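As a rough illustration of the idea of reading word-level change scores in the context of related words, the following minimal Python sketch averages per-word scores over groups of words and ranks the groups. It is not the authors' method or data model: the score values are made-up placeholders (not taken from HistWords), the group identifiers are invented (not real WordNet synset IDs), and the helper names `synset_change` and `rank_synsets` are hypothetical.

```python
from statistics import mean

# Placeholder word-level change scores; illustrative values only,
# NOT taken from HistWords.
change_scores = {"car": 0.12, "automobile": 0.30, "machine": 0.45}

# Invented groups of related words standing in for synsets;
# the identifiers are NOT real WordNet synset IDs.
synsets = {
    "synset-car": ["car", "automobile", "machine"],
    "synset-gondola": ["car", "gondola"],
}


def synset_change(synset_words, scores):
    """Average the change scores of the words in a group that have a score.

    Returns None when no word in the group has a score.
    """
    known = [scores[w] for w in synset_words if w in scores]
    return mean(known) if known else None


def rank_synsets(groups, scores):
    """Return (group id, average score) pairs, highest average first.

    Groups without any scored word are left out.
    """
    pairs = [(sid, synset_change(words, scores)) for sid, words in groups.items()]
    scored = [(sid, s) for sid, s in pairs if s is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for synset_id, score in rank_synsets(synsets, change_scores):
        print(synset_id, round(score, 3))
```

Averaging is only one possible way to aggregate; the published dataset is queried as Linked Open Data, so an equivalent aggregation there would be written as a query over the combined graph rather than over in-memory dictionaries.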
|Keywords||Knowledge bases, Lexical semantics, Linked open data, NLP|
|Series||Lecture Notes in Computer Science|
|Project||A Europe-wide Interoperable Virtual Research Environment to Empower Multidisciplinary Research Communities and Accelerate Innovation and Collaboration|
|Conference||International Conference on Knowledge Engineering and Knowledge Management|
|Grant||This work was funded by the European Commission 7th Framework Programme; grant id h2020/676247 - A Europe-wide Interoperable Virtual Research Environment to Empower Multidisciplinary Research Communities and Accelerate Innovation and Collaboration (VRE4EIC)|
van Aggelen, A.E., Hollink, L., & van Ossenbruggen, J.R. (2016). Combining distributional semantics and structured data to study lexical change. In Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence (pp. 40–49). doi:10.1007/978-3-319-58694-6_4