Stefan Bischof · Semantic Technologies & Industrial AI

Complementary Methods for the Enrichment of Linked Data

Stefan Bischof

Published in: Vienna University of Technology, Austria

2017

PhD thesis

Abstract

Data published in accordance with Semantic Web standards and Linked Data principles constitutes a prime source of openly available data, ready for analysis in a unified format. Even though the main use case of Semantic Web technologies is data integration, obtaining comparable data is not trivial in practice: heterogeneity problems and challenges arising from incomplete data prevail despite syntactic homogeneity. The use case we focus on in this thesis is the comparison of statistical data about cities, found on the Web in (semi-)structured form and integrated as Linked Data. First, we evaluate different data sources and integrate suitable datasets using Semantic Web technologies and RDF. Here, the work specifically addresses the challenges of getting complete data from SPARQL endpoints, for instance with respect to OWL entailment regimes. However, we conclude that OWL inference alone is insufficient for resolving incompleteness and heterogeneity problems, especially for numerical data. We therefore develop methods that infer missing numerical data by exploiting statistical methods and equational knowledge. Lastly, we discuss combinations of these methods, i.e., we develop a combined approach that integrates rule-based and statistical methods for Linked Data enrichment.
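
The following minimal Python sketch illustrates how equational knowledge and a statistical estimate might be combined to fill in missing numerical values for a city. It is not the thesis implementation. The equation density = population / area, the field names (`population`, `area_km2`, `density`), the function `enrich_city`, and the use of a peer mean as the statistical estimate are all assumptions chosen for illustration.

```python
from statistics import mean
from typing import Dict, List, Optional

# Hypothetical indicator names; not taken from the thesis.
FIELDS = ("population", "area_km2", "density")

City = Dict[str, Optional[float]]


def enrich_city(city: City, peers: List[City]) -> City:
    """Return a copy of `city` with missing indicators filled in where possible."""
    out = dict(city)
    pop = out.get("population")
    area = out.get("area_km2")
    dens = out.get("density")

    # Equational step (assumed equation): density = population / area,
    # solved for whichever single value is missing.
    if dens is None and pop is not None and area:
        out["density"] = pop / area
    elif pop is None and dens is not None and area is not None:
        out["population"] = dens * area
    elif area is None and pop is not None and dens:
        out["area_km2"] = pop / dens

    # Statistical fallback: the mean over peer cities is a crude stand-in
    # for a real statistical prediction model.
    for key in FIELDS:
        if out.get(key) is None:
            known = [p[key] for p in peers if p.get(key) is not None]
            if known:
                out[key] = mean(known)
    return out


example = enrich_city(
    {"population": 1_000_000, "area_km2": 400.0, "density": None}, peers=[]
)
print(example["density"])  # 2500.0
```

In this sketch the equation is applied first because it yields an exact value when two of the three indicators are known. The statistical estimate is used only for values that the equation cannot derive.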