Tab2Know: Building a Knowledge Base from Tables in Scientific Papers

Kruit, Benno; He, Hongyu; Urbani, Jacopo

doi:10.1007/978-3-030-62419-4_20

B.B. Kruit (Benno), H. He (Hongyu) and J. Urbani (Jacopo)

2020

Presented at the 19th International Semantic Web Conference, Athens, Greece, November 2–6, 2020 (June 2020)

Tables in scientific papers contain a wealth of valuable knowledge for the scientific enterprise. To help the many of us who frequently consult this type of knowledge, we present Tab2Know, a new end-to-end system to build a Knowledge Base (KB) from tables in scientific papers. Tab2Know addresses the challenge of automatically interpreting the tables in papers and of disambiguating the entities that they contain. To solve these problems, we propose a pipeline that employs both statistics-based classifiers and logic-based reasoning. First, our pipeline applies weakly supervised classifiers to recognize the types of tables and columns, with the help of a data labeling system and an ontology specifically designed for our purpose. Then, logic-based reasoning is used to link equivalent entities (via sameAs links) in different tables. An empirical evaluation of our approach on a corpus of papers in the Computer Science domain showed satisfactory performance. This suggests that our approach is a promising step toward creating a large-scale KB of scientific knowledge.
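The entity-linking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the matching rule used here (two cells denote the same entity when they share a column type and a normalized label) and all names (`normalize`, `UnionFind`, `same_as_clusters`) are assumptions made for illustration. The only property taken as given is that sameAs is an equivalence relation, so linked entities can be grouped with a union-find structure.

```python
from collections import defaultdict


def normalize(label: str) -> str:
    """Lower-case, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(label.lower().replace("-", " ").split())


class UnionFind:
    """Disjoint-set structure used to close sameAs links transitively."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # Path halving keeps later lookups short.
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def same_as_clusters(cells):
    """Group cells into sameAs clusters.

    cells: iterable of (table_id, column_type, label) tuples.
    Hypothetical rule: same column type and same normalized label.
    Returns only clusters with more than one member.
    """
    uf = UnionFind()
    first_seen = {}
    for table_id, column_type, label in cells:
        entity = (table_id, label)
        uf.find(entity)  # register the entity
        key = (column_type, normalize(label))
        if key in first_seen:
            uf.union(first_seen[key], entity)
        else:
            first_seen[key] = entity

    clusters = defaultdict(set)
    for entity in list(uf.parent):
        clusters[uf.find(entity)].add(entity)
    return [members for members in clusters.values() if len(members) > 1]


cells = [
    ("t1", "Method", "Random-Forest"),
    ("t2", "Method", "random forest"),
    ("t3", "Dataset", "Random Forest"),  # different column type, not linked
    ("t3", "Method", "SVM"),
]
clusters = same_as_clusters(cells)
```

In this sketch the column type acts as a guard: a string that appears under a differently typed column is not merged, which is one simple way that type predictions could feed into the linking step.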

Additional Metadata
Persistent URL	doi.org/10.1007/978-3-030-62419-4_20
Series	Lecture Notes in Computer Science/Lecture Notes in Artificial Intelligence
Conference	19th International Semantic Web Conference, Athens, Greece, November 2–6, 2020
Organisation	Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands
Citation	Kruit, B., He, H., & Urbani, J. (2020). Tab2Know: Building a Knowledge Base from Tables in Scientific Papers. International Semantic Web Conference, 12506. https://doi.org/10.1007/978-3-030-62419-4_20

See Also
dataset Tab2Know evaluation data B.B. Kruit (Benno), H. He (Hongyu) and J. Urbani (Jacopo)
software Tab2Know B.B. Kruit (Benno), H. He (Hongyu) and J. Urbani (Jacopo)