Towards a Large Corpus of Richly Annotated Web Tables for Knowledge Base Population

Titelaufnahme

Zugriffsbeschränkung

Links

Dateien

Klassifikation

Abstract

Web Table Understanding in the context of Knowledge Base Population and the Semantic Web is the task of i) linking the content of tables retrieved from the Web to an RDF knowledge base, ii) of building hypotheses about the tables' structures and contents, iii) of extracting novel information from these tables, and iv) of adding this new information to a knowledge base. Knowledge Base Population has gained more and more interest in the last years due to the increased demand in large knowledge graphs which became relevant for Artificial Intelligence applications such as Question Answering and Semantic Search.
In this paper we describe a set of basic tasks which are relevant for Web Table Understanding in the mentioned context.
These tasks incrementally enrich a table with hypotheses about the table's content. In doing so, in the case of multiple interpretations, selecting one interpretation and thus deciding against other interpretations is avoided as much as possible. By postponing these decision, we enable table understanding approaches to decide by themselves, thus increasing the usability of the annotated table data.
We present statistics from analyzing and annotating 1.000.000 tables from the Web Table Corpus 2015 and make this dataset as well as our code available online.

Statistik