Tab2Know evaluation data
Description
Evaluation data for the paper "Tab2Know: Building a Knowledge Base from Tables in Scientific Papers" published at ISWC2020.
For code, see https://github.com/karmaresearch/tab2know .
This resource contains the following files:
- `venues.txt`: The venues that were used to select PDFs published in the last 5 years from the [Semantic Scholar Open Research Corpus](http://s2-public-api-prod.us-west-2.elasticbeanstalk.com/corpus/).
- `extracted-tables.tar.gz`: All tables that we extracted using [Tabula](https://github.com/tabulapdf/tabula) from these PDFs.
- `sample-400.tar.gz`: A sample of these tables which we used for annotation.
- `ontology.ttl`: The annotation ontology in Turtle format.
- `all_metadata.jsonl`: Annotations for this sample in the JSON format described below.
- `labelqueries.csv`: The label queries used for weak annotation, created using the annotation interface. This CSV file contains 6 columns: a numeric ID, the label query template name (`template`), the template slots (`slots`), the label type (`label`), the annotation value (`value`), and a toggle for the interface (`enabled`).
- `labelqueries-sparql-templates.zip`: The label query templates. These are SPARQL queries with slots of the form `{{slot}}`. The templates in `labelqueries.csv` refer to these files.
- `rules.txt`: Datalog rules that we used for entity resolution.
- `tab2know-graph.nt.gz`: The final RDF graph that contains all extracted table structures, predicted table and column classes, and resolved entity links.
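The label query templates are described above as SPARQL queries with slots of the form `{{slot}}`. A minimal sketch of how such a template could be filled is shown here. The function name `fill_template`, the example template text, and the slot names `property` and `regex` are hypothetical and only illustrate the substitution; the encoding of the `slots` column in `labelqueries.csv` is not specified here, so the slot values are passed as a plain dictionary.

```python
import re

# Matches placeholders of the form {{name}}, allowing optional inner whitespace.
SLOT_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_template(template: str, slots: dict) -> str:
    """Replace every {{name}} placeholder in `template` with slots[name]."""

    def substitute(match):
        name = match.group(1)
        if name not in slots:
            raise KeyError(f"no value for slot {name!r}")
        return str(slots[name])

    return SLOT_PATTERN.sub(substitute, template)


# Hypothetical template text; the real templates are in
# labelqueries-sparql-templates.zip and may look different.
example_template = (
    "SELECT ?table WHERE { ?table <{{property}}> ?v . "
    'FILTER regex(?v, "{{regex}}", "i") }'
)

query = fill_template(
    example_template,
    {"property": "http://example.org/caption", "regex": "accuracy"},
)
print(query)
```

Single braces, which SPARQL uses for graph patterns, are left untouched because the pattern only matches doubled braces. A missing slot raises an error rather than leaving a placeholder in the query.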
Files
Total size: 7.6 GB

MD5 checksum | Size |
---|---|
md5:7111ce13f51f048390024884ed0a1dff | 389.0 kB |
md5:0577aeefde307462b09ad3ab62efe01d | 7.1 GB |
md5:b691c5e80793c552ce8284d6ce16c971 | 9.6 kB |
md5:23cac8f13fc601893885aa55a966aa26 | 8.8 kB |
md5:c878e0ea0c625093a4850e13f11dcf0b | 17.2 kB |
md5:9e49b0272524b1fc7b077d871197ccf9 | 7.2 kB |
md5:36864076ab5fae44101beba90656b9a1 | 3.5 kB |
md5:5e9cca142ef7217bf627ec50a5f4559b | 13.4 MB |
md5:5937454b8bb5df1ccf636fc10d17e245 | 469.3 MB |
md5:fe8348c0633f2299a1948a8b485dfe4f | 145 Bytes |
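The MD5 checksums listed for the files can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the file is read in chunks so that the multi-gigabyte archive does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:0123...'."""
    # Drop the optional 'md5:' prefix used in the file listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("venues.txt", "md5:...")` would be called with the checksum that the listing shows for that file; which checksum belongs to which file has to be read from the original listing.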