Journal Articles

T. Iggena, E. Bin Ilyas, M. Fischer, R. Tönjes, T. Elsaleh, R. Rezvani, N. Pourshahrokhi, S. Bischof, A. Fernbach, J. Xavier Parreira, P. Schneider, P. Smirnov, M. Strohbach, H. Truong, A. González-Vidal, A. F. Skarmeta, P. Singh, M. J. Beliatis, M. Presser, J. A. Martinez, P. Gonzalez-Gil, M. Krogbæk, S. Holmgård Christophersen. IoTCrawler: Challenges and Solutions for Searching the Internet of Things. Sensors. 2021. [ DOI | Details ]
Abstract
Due to the rapid development of the Internet of Things (IoT) and, consequently, the availability of more and more IoT data sources, mechanisms for searching and integrating IoT data sources become essential to leverage all relevant data for improving processes and services. This paper presents the IoT search framework IoTCrawler. The IoTCrawler framework is not just another IoT framework, but a system of systems which connects existing solutions to offer interoperability and to overcome data fragmentation. In addition to its domain-independent design, IoTCrawler features a layered approach, offering solutions for crawling, indexing and searching IoT data sources, while ensuring privacy and security, adaptivity and reliability. The concept is proven by addressing a list of requirements defined for searching the IoT and by an extensive evaluation. In addition, real world use cases showcase the applicability of the framework and provide examples of how it can be instantiated for new scenarios.
Stefan Bischof, Andreas Harth, Benedikt Kämpgen, Axel Polleres, Patrik Schneider. Enriching Integrated Statistical Open City Data by Combining Equational Knowledge and Missing Value Imputation. Journal of Web Semantics: Special Issue on Semantic Statistics. 2017. [ DOI | Preprint | Details ]
Abstract
Several institutions collect statistical data about cities, regions, and countries for various purposes. Yet, while access to high-quality and recent data of this kind is both crucial for decision makers and a means of achieving transparency to the public, all too often such collections of data remain isolated and not re-usable, let alone comparable or properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and re-publish the resulting dataset in a re-usable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular, extensible, and always up-to-date fashion; (ii) we use both Machine Learning techniques and reasoning over equational background knowledge to enrich the data by imputing missing values; (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data, including links to external data sources, such as DBpedia, available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV. Apart from providing a contribution to the growing collection of data available as Linked Data, our enrichment process for missing values also contributes a novel methodology for combining rule-based inference about equational knowledge with inferences obtained from statistical Machine Learning approaches. While most existing works about inference in Linked Data have focused on ontological reasoning in RDFS and OWL, we believe that these complementary methods, and particularly their combination, could be fruitfully applied in many other domains for integrating Statistical Linked Data, independent of our concrete use case of integrating city data.
Stefan Bischof, Stefan Decker, Thomas Krennwallner, Nuno Lopes, Axel Polleres. Mapping between RDF and XML with XSPARQL. Journal on Data Semantics. 2012. [ DOI | PDF | Details ]
Abstract
One promise of Semantic Web applications is to seamlessly deal with heterogeneous data. The Extensible Markup Language (XML) has become widely adopted as an almost ubiquitous interchange format for data, along with transformation languages like XSLT and XQuery to translate data from one XML format into another. However, the more recent Resource Description Framework (RDF) has become another popular standard for data representation and exchange, supported by its own query language, SPARQL, which enables extraction and transformation of RDF data. Being able to work with XML and RDF using a common framework eliminates several unnecessary steps that are currently required when handling both formats side by side. In this paper we present the XSPARQL language which, by combining XQuery and SPARQL, allows querying XML and RDF data using the same framework and transforming data from one format into the other. We focus on the semantics of this combined language and present an implementation, including a discussion of query optimisations along with a benchmark evaluation.

Conference Articles

Stefan Bischof, Erwin Filtz, Josiane Xavier Parreira. Semantic Smart Readiness Indicator Framework. SEMANTiCS. 2024. [ DOI | Ontology | Zenodo | Details ]
Abstract
The Smart Readiness Indicator (SRI) is an energy rating scheme targeted at buildings to evaluate their capacity to integrate and benefit from smart technologies for enhanced energy efficiency and overall performance. Existing tools for SRI assessment and rating do not provide a standard format for data exchange. However, there are several scenarios in which a FAIR, standardised data format is beneficial, such as data exchange between building tools, comparison of different assessments, or computing statistics about buildings.
Stefan Bischof, Erwin Filtz, Josiane Xavier Parreira, Simon Steyskal. LLM-based Guided Generation of Ontology Term Definitions. ESWC 2024 Industry Track. 2024. [ DOI | PDF | Details ]
Abstract
This paper describes our approach for leveraging LLMs to generate definitions and descriptions for ontology terms. Our approach is grounded in the need for detailed and accurate representation of (domain-specific) Knowledge Graphs, and it aims at speeding up the process of generating such text. We outline our approach, including the problems that we encountered, and the solution we propose to overcome them. Our approach is currently in use in an industrial setting.
Stefan Bischof, Gottfried Schenner. Rail Topology Ontology: A Rail Infrastructure Base Ontology. ISWC 2021 Resource Track. 2021. [ arXiv preprint | Video | DOI | Ontology | Details ]
Abstract
Engineering projects for railway infrastructure typically involve many subsystems which need consistent views of the planned and built infrastructure and its underlying topology. Consistency is typically ensured by exchanging and verifying data between tools using XML-based data formats and UML-based object-oriented models. A tighter alignment of these data representations via a common topology model could decrease the development effort of railway infrastructure engineering tools. A common semantic model is also a prerequisite for the successful adoption of railway knowledge graphs. Based on the RailTopoModel standard, we developed the Rail Topology Ontology as a model to represent core features of railway infrastructures in a standard-compliant manner. This paper describes the ontology and its development method, and discusses its suitability for integrating data of railway engineering systems and other sources in a knowledge graph. With the Rail Topology Ontology, software engineers and knowledge scientists have a standard-based ontology for representing railway topologies to integrate disconnected data sources. We use the Rail Topology Ontology for our rail knowledge graph and plan to extend it by rail infrastructure ontologies derived from existing data exchange standards, since many such standards use the same base model as the presented ontology, viz., RailTopoModel.
Stefan Bischof, Gottfried Schenner. Challenges of Constructing a Railway Knowledge Graph. ESWC 2019 Industry Track. 2019. [ DOI | Video | PDF | Details ]
Abstract
International railway networks are an important means for passenger and freight transport. Railway software systems often must be supported and maintained for decades. Different systems cover different aspects such as train protection, signalling, infrastructure hardware and software. Usage of different standards and regulations, both international (e.g., European Train Control System) and national, for these aspects leads to a large number of incompatible systems.
Dan Puiu, Stefan Bischof, Bogdan Serbanescu, Septimiu Nechifor, Josiane Parreira, Herwig Schreiner. A public transportation journey planner enabled by IoT data analytics. Conference on Innovations in Clouds, Internet and Networks. 2017. [ DOI | PDF | Details ]
Abstract
The scope of the paper is to present the application developed for the Brașov public transportation company. The application provides route recommendations and incident notifications for citizens who travel by bus. This is achieved by processing, in real time, the data streams about bus arrivals at stations and the incidents reported by citizens. The application was developed on top of the CityPulse framework.
Stefan Bischof, Christoph Martin, Axel Polleres, Patrik Schneider. Collecting, Integrating, Enriching and Republishing Open City Data as Linked Data. ISWC 2015. 2015. [ DOI | Video | PDF | Details ]
Abstract
Access to high quality and recent data is crucial both for decision makers in cities as well as for the public. Likewise, infrastructure providers could offer more tailored solutions to cities based on such data. However, even though there are many data sets containing relevant indicators about cities available as open data, it is cumbersome to integrate and analyze them, since the collection is still a manual process and the sources are not connected to each other upfront. Further, disjoint indicators and cities across the available data sources lead to a large proportion of missing values when integrating these sources. In this paper we present a platform for collecting, integrating, and enriching open data about cities in a reusable and comparable manner: we have integrated various open data sources and present approaches for predicting missing values, where we use standard regression methods in combination with principal component analysis (PCA) to improve quality and amount of predicted values. Since indicators and cities only have partial overlaps across data sets, we particularly focus on predicting indicator values across data sets, where we extend, adapt, and evaluate our prediction model for this particular purpose: as a "side product" we learn ontology mappings (simple equations and sub-properties) for pairs of indicators from different data sets. Finally, we republish the integrated and predicted values as linked open data.
Stefan Bischof, Markus Krötzsch, Axel Polleres, Sebastian Rudolph. Schema-Agnostic Query Rewriting for OWL QL. Workshop on Description Logics. 2015. [ CEUR | Details ]
Abstract
SPARQL 1.1 is powerful enough to "implement" a full-fledged OWL QL reasoner in a single query. This paper summarises our results on schema-agnostic query rewriting in SPARQL 1.1, which supports frequent updates of both data and schema. The rewriting system does not need any information on the content of the database under query, while the SPARQL processor that executes the query does not need any support for OWL. This is particularly interesting if a database can only be accessed through a restricted SPARQL query interface that does not support reasoning.
Stefan Bischof, Markus Krötzsch, Axel Polleres, Sebastian Rudolph. Schema-Agnostic Query Rewriting in SPARQL 1.1. ISWC 2014. 2014. [ DOI | PDF | Details ]
Abstract
SPARQL 1.1 supports the use of ontologies to enrich query results with logical entailments, and OWL 2 provides a dedicated fragment OWL QL for this purpose. Typical implementations use the OWL QL schema to rewrite a conjunctive query into an equivalent set of queries, to be answered against the non-schema part of the data. With the adoption of the recent SPARQL 1.1 standard, however, RDF databases are capable of answering much more expressive queries directly, and we ask how this can be exploited in query rewriting. We find that SPARQL 1.1 is powerful enough to "implement" a full-fledged OWL QL reasoner in a single query. Using additional SPARQL 1.1 features, we develop a new method of schema-agnostic query rewriting, where arbitrary conjunctive queries over OWL QL are rewritten into equivalent SPARQL 1.1 queries in a way that is fully independent of the actual schema. This allows us to query RDF data under OWL QL entailment without extracting or preprocessing OWL axioms.
Axel Polleres, Stefan Bischof, Herwig Schreiner. City Data Pipeline - A report about experiences from using Open Data to gather indicators of city performance. European Data Forum. 2014. [ Video | Slides | Details ]
Stefan Bischof, Axel Polleres. RDFS with Attribute Equations via SPARQL Rewriting. ESWC 2013. 2013. [ DOI | Video | Slides | PDF | Details ]
Abstract
In addition to taxonomic knowledge about concepts and properties typically expressible in languages such as RDFS and OWL, implicit information in an RDF graph may be likewise determined by arithmetic equations. The main use case here is exploiting knowledge about functional dependencies among numerical attributes expressible by means of such equations. While some of this knowledge can be encoded in rule extensions to ontology languages, we provide an arguably more flexible framework that treats attribute equations as first class citizens in the ontology language. The combination of ontological reasoning and attribute equations is realized by extending query rewriting techniques already successfully applied for ontology languages such as (the DL-Lite-fragment of) RDFS or OWL, respectively. We deploy this technique for rewriting SPARQL queries and discuss the feasibility of alternative implementations, such as rule-based approaches.
Aidan Boran, Ivan Bedini, Christopher J. Matheus, Peter F. Patel-Schneider, Stefan Bischof. An Empirical Analysis of Semantic Techniques Applied to a Network Management Classification Problem. IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology. 2012. [ DOI | Details ]
Abstract
Semantic technologies are increasingly being employed to integrate, relate and classify heterogeneous data from various problem domains. To date, however, little empirical analysis has been carried out to help identify the benefits and limitations of different semantic approaches on specific data integration and classification problems. This paper evaluates three alternative semantic techniques for performing classification over data derived from the telecommunications domain. The problem of interest involves inferring the "health" status of network nodes (femtocells) from synthesized performance management (PM) instance data based on the operational PM schema. The semantic approaches used in the comparison include OWL2 axioms, SPARQL queries and SWRL rules. Empirical tests were performed across a range of data set sizes, using Pellet for axioms and rules and ARQ for queries. The experimental results provide (mostly) quantitative and (some) qualitative indication of the relative merits of each approach. Key among these findings is confirmation of the clear superiority of queries over rules and axioms in terms of raw performance and scalability.
Stefan Bischof. Optimising XML-RDF Data Integration. Extended Semantic Web Conference. 2012. [ DOI | PDF | Details ]
Abstract
The Semantic Web provides a wealth of open data in RDF format. XML remains a widespread format for data exchange. When combining data of these two formats, several problems arise due to representational incompatibilities. The query language XSPARQL, which is built by combining XQuery and SPARQL, addresses some of these problems. However, the evaluation of complex XSPARQL queries by a naive implementation shows slow response times. Establishing an integrated formal model for a core fragment of XSPARQL will allow us to improve the performance of query answering by defining query equivalences.
Stefan Bischof, Nuno Lopes, Axel Polleres. Improve Efficiency of Mapping Data between XML and RDF with XSPARQL. Web Reasoning and Rule Systems. 2011. [ DOI | PDF | Details ]
Abstract
XSPARQL is a language to transform data between the tree-based XML format and the graph-based RDF format. XML is a widely adopted data exchange format which brings along its own query language, XQuery. RDF is the standard data format of the Semantic Web, with SPARQL being the corresponding query language. XSPARQL combines XQuery and SPARQL into a unified query language which provides a more intuitive and maintainable way to translate data between the two data formats. A naive implementation of XSPARQL can be inefficient when evaluating nested queries. However, such queries occur often in practice when dealing with XML data. We present and compare several approaches to optimise nested queries. By implementing these optimisations we improve efficiency by up to two orders of magnitude in a practical evaluation.
Nuno Lopes, Stefan Bischof, Stefan Decker, Axel Polleres. On the Semantics of Heterogeneous Querying of Relational, XML and RDF Data with XSPARQL. Portuguese Conference on Artificial Intelligence. 2011. [ http | Details ]
Abstract
XSPARQL is a transformation and query language that caters for heterogeneous sources: in its present state it is possible to transform data between XML and RDF formats due to the integration of the XQuery and SPARQL query languages. In this paper we propose an extension of the XSPARQL language to incorporate data contained in relational databases by integrating a subset of SQL in the syntax of XSPARQL. Exposing data contained in relational databases as RDF is a necessary step towards the realisation of the Semantic Web and Web of Data. We present the syntax of an extension of the XSPARQL language catering for the inclusion of the SQL query language, along with the semantics based on the XQuery formal semantics, and sketch how this extended XSPARQL language can be used to expose RDB2RDF mappings, as currently being discussed in the W3C RDB2RDF Working Group.

Workshop and Poster Articles

Stefan Bischof, Erwin Filtz, Josiane Xavier Parreira, Florian Rötzer, Simon Steyskal, Stephan Strommer. On A Semantic Model and Knowledge Graph Based Approach to Enable Transparency, Explainability, and Auditability for High-Pressure Die-Casting. SENTIS Workshop. 2025. [ CEUR | Details ]
Abstract
This paper addresses the critical challenge of fragmented data and knowledge in high-pressure die-casting environments, where the lack of integrated information hampers effective troubleshooting and compliance with emerging transparency requirements. We developed a comprehensive semantic model that integrates distributed data sources and expert knowledge into a unified knowledge graph framework, explicitly connecting manufacturing processes, failures, metrics, and countermeasures through formalized semantic relationships. Our implementation shows how the resulting architecture successfully transforms traditionally siloed industrial data into an interconnected knowledge representation that distinguishes between specified expert knowledge and actual operational data, enabling systematic reasoning about cause-effect relationships throughout the manufacturing process. The approach provides significant value by enhancing manufacturing transparency and decision support while aligning with Industry 5.0 principles and emerging regulatory frameworks for explainable industrial systems, ultimately supporting more sustainable and efficient manufacturing processes.
Robert David, Stefan Bischof, Konrad Diwold, Josiane Xavier Parreira. Symbolic-AI driven Data Repairs for Large Scale Energy Co-Simulations: Combining SHACL repairs and Datalog rules to detect, explain, and correct errors in large scale energy co-simulation setups. Posters and Demos Track of SEMANTiCS. 2025. [ CEUR | Details ]
Abstract
The transformation of energy distribution systems is fostering new models, like renewable energy communities, which require complex, simulation-based feasibility assessments. Preparing these simulations is often labor-intensive and error-prone due to heterogeneous actors and location-specific grid topologies. This paper proposes a symbolic AI approach that combines SHACL (repairs) and Datalog (imputation) to semi-automatically detect, explain, and correct inconsistencies for grid and sensor data so it can serve as input for co-simulations. Applied within the DataBri-X project and tested using Siemens BIFROST, the approach demonstrates promising improvements in data quality and preprocessing efficiency.
Stefan Bischof, Andreas Falkner, Erwin Filtz, Patrik Schneider, Simon Steyskal, Mihaela Topa. CONTO: An Ontology-based Approach for Interoperable Configuration Knowledge. Posters and Demos Track of SEMANTiCS. 2025. [ CEUR | Details ]
Abstract
CONTO (CONfiguration ONTOlogy and TOols) addresses the challenges of vendor lock-in and costly evaluations in product configuration systems through an ontology-based semantic framework that establishes interoperability without reinventing existing solver technologies. Our approach provides dual representations (an instance-based Configuration Vocabulary and an OWL DL-based formalism) with tooling to transform product models into programs for various platforms. By decoupling modelling from configuration processing, CONTO creates an abstraction layer that preserves existing investments while enabling integration with emerging AI tools. Our implementation shows practical applications supporting both commercial and open-source configurators.
Stefan Bischof, Gottfried Schenner. On Linking Heterogeneous Railway Knowledge Graphs: Challenges in Integrating ERA and OpenStreetMap Rail Infrastructure Representations. Sem4Tra Workshop. 2025. [ CEUR | PDF | Details ]
Abstract
In recent years, railway infrastructure data has become available via public SPARQL endpoints. The ontologies/vocabularies used to represent the topology part of the railway data are mainly based either on the OpenStreetMap (OSM) data model or on the UML-based RailTopoModel. In this paper we discuss some of the challenges of integrating railway infrastructure data, especially topological data. As an example, we show how to link the data available for the Austrian railway network using OpenStreetMap data via the QLever SPARQL endpoint and data from the ERA Knowledge Graph.
Stefan Bischof, Erwin Filtz, Josiane Xavier Parreira, Simon Steyskal, Michael Baumgart, David Gruber, Maximilian Liebetreu, Florian Rötzer, Stephan Strommer. Towards SHACL-based Knowledge Graph Transformation of Visual Domain Knowledge. Posters and Demos Track of SEMANTiCS. 2024. [ CEUR | Details ]
Abstract
Effective knowledge representation plays a pivotal role in harnessing the full potential of domain-specific information. Through tools like Infinity Maps, domain knowledge can be easily captured in a visual manner. However, translating these visually intuitive representations to formal, machine-processable formats often necessitates expert knowledge, thereby creating a significant barrier between domain experts and knowledge engineers. While domain experts possess deep understanding of their respective domains, they often lack the formalisation skills required to transform this knowledge into machine-readable formats. Conversely, knowledge engineers can design and implement sophisticated knowledge graphs, but may not have access to the domain-specific expertise necessary for effective knowledge representation. To address this challenge, we propose a novel approach that leverages SHACL (Shapes Constraint Language) rules to transform visual domain knowledge expressed as Infinity Maps into knowledge graphs. Our method enables domain experts to define their knowledge structures using familiar Infinity Map representations, which are then transformed into standardised knowledge graphs compliant with the SHACL standard.
Stefan Bischof, Gottfried Schenner. Towards a Railway Topology Ontology to Integrate and Query Rail Data Silos. Posters and Demos Track of ISWC. 2020. [ CEUR | Video | Details ]
Abstract
Engineering projects in the railway domain typically involve a large number of subsystems. Therefore a common understanding of the domain is essential. In the past this has been provided by XML-based data exchange standards and UML-based object-oriented models. With the increasing adoption of semantic technologies for engineering projects the demand to provide these standard data models in the form of ontologies has grown. We describe requirements and challenges to define an open standard ontology for railway topologies based on existing standards. The purpose of the finished ontology will be to enable topological queries and reasoning for railway networks in a standardized and reusable manner.
Stefan Bischof, Gottfried Schenner, Simon Steyskal, Richard Taupe. Integrating Semantic Web Technologies and ASP for Product Configuration. Config Workshop. 2018. [ CEUR | Details ]
Abstract
Currently there is no single dominating technology for building product configurator systems. While research often focuses on a single technology/paradigm, building an industrial-scale product configurator system will almost always require the combination of various technologies for different aspects of the system (knowledge representation, reasoning, solving, user interface, etc.). This paper demonstrates how to build such a hybrid system and how to integrate various technologies and leverage their respective strengths. Due to the increasing popularity of the industrial knowledge graph, we utilize Semantic Web technologies (RDF, OWL and the Shapes Constraint Language (SHACL)) for knowledge representation and integrate them with Answer Set Programming (ASP), a well-established solving paradigm for product configuration.
Mónica Posada-Sánchez, Stefan Bischof, Axel Polleres. Extracting Geo-Semantics About Cities From OpenStreetMap. Posters and Demos Track of SEMANTiCS. 2016. [ CEUR | Details ]
Abstract
Access to high-quality and updated data is crucial to assess and contextualize a city's state of affairs. The City Data Pipeline uses diverse Open Data sources to integrate statistical information about cities. The resulting incomplete dataset is not directly usable for data analysis. We exploit data from a geographic information system, namely OpenStreetMap, to obtain new indicators for cities with better coverage. We show that OpenStreetMap is a promising data source for statistical data about cities.
Stefan Bischof. Improving Practical Reasoning on top of SPARQL. Doctoral Consortium of the International Conference on Web Reasoning and Rule Systems. 2015. [ PDF | Details ]
Abstract
Reasoning techniques are not well received by the developer community. One reason is the cost of providing SPARQL endpoints with reasoning enabled. Another reason is the missing support for reasoning on numbers, which is needed for tasks such as data analytics. An important aspect of the second problem is the high number of missing values that inherently occur when integrating data on a global scale. In this work we propose two approaches to improve the situation for both problems by developing the necessary techniques as well as prototypical applications based on query rewriting, together with a practical evaluation.
Stefan Bischof, Christoph Martin, Axel Polleres, Patrik Schneider. Open City Data Pipeline: Collecting, Integrating, and Predicting Open City Data. Workshop on Knowledge Discovery and Data Mining Meets Linked Open Data co-located with Extended Semantic Web Conference. 2015. [ CEUR | Details ]
Abstract
Having access to high-quality and recent data is crucial both for decision makers in cities as well as for informing the public; likewise, infrastructure providers could offer more tailored solutions to cities based on such data. However, even though there are many data sets containing relevant indicators about cities available as open data, it is cumbersome to integrate and analyze them, since the collection is still a manual process and the sources are not connected to each other upfront. Further, disjoint indicators and cities across the available data sources lead to a large proportion of missing values when integrating these sources. In this paper we present a platform for collecting, integrating, and enriching open data about cities in a re-usable and comparable manner: we have integrated various open data sources and present approaches for predicting missing values, where we use standard regression methods in combination with principal component analysis to improve the quality and amount of predicted values. Further, we re-publish the integrated and predicted values as linked open data.
Stefan Bischof, Athanasios Karapantelakis, Cosmin-Septimiu Nechifor, Amit Sheth, Alessandra Mileo, Payam Barnaghi. Semantic Modelling of Smart City Data. W3C Workshop on the Web of Things. 2014. [ PDF | Details ]
Daniele Dell'Aglio, Axel Polleres, Nuno Lopes, Stefan Bischof. Querying the Web of Data with XSPARQL 1.1. ISWC 2014 Developers Workshop. 2014. [ CEUR | Details ]
Abstract
On the Web and in corporate environments there exist large amounts of data in various formats. XQuery and XSLT serve as query and transformation languages for XML. But as RDF is also becoming a mainstream format for the Web of Data, transformation languages between these formats are required. XSPARQL is a hybrid language that provides an integration framework for XML and RDF, but also for JSON and relational data, by partially combining several languages such as XQuery, SPARQL and SQL. In this paper, we present the latest open source release of the XSPARQL engine, which is based on standard software components (Jena and Saxon), and outline possible applications of XSPARQL 1.1 to address Web data integration use cases.
Gottfried Schenner, Stefan Bischof, Axel Polleres, Simon Steyskal. Integrating Distributed Configurations with RDFS and SPARQL. Config Workshop. 2014. [ CEUR | Details ]
Abstract
Large interconnected technical systems (e.g. railway networks, power grids, computer networks) are typically configured with the help of multiple configurators, which store their configurations in separate databases based on heterogeneous domain models (ontologies). In practice users often want to ask queries over several distributed configurations. In order to reason over these distributed configurations in a uniform manner, a mechanism for ontology alignment and data integration is required. In this paper we describe our experience with using standard Semantic Web technologies (RDFS and SPARQL) for data integration and reasoning.
Stefan Bischof, Axel Polleres, Simon Sperl. City Data Pipeline - A System for Making Open Data Useful for Cities. I-SEMANTICS Posters & Demonstrations Track. 2013. [ CEUR | Details ]
Abstract
Some cities publish data in an open form, but even more cities could profit from the data that is already available as open or linked data. Unfortunately, open data from different sources usually comes in different, heterogeneous data formats. With the City Data Pipeline we aim to integrate data about cities in a common data model by using Semantic Web technologies. Eventually we want to support city officials in their decisions by providing automated analytics support.
Nuno Lopes, Stefan Bischof, Orri Erling, Axel Polleres, Alexandre Passant, Diego Berrueta, Antonio Campos, Jérôme Euzenat, Kingsley Idehen, Stefan Decker, Stéphane Corlosquet, Jacek Kopecky, Janne Saarela, Thomas Krennwallner, Davide Palmisano, Michal Zaremba. RDF and XML: Towards a Unified Query Layer. W3C Workshop on RDF Next Steps. 2010. [ PDF | Details ]
Abstract
One of the requirements of current Semantic Web applications is to deal with heterogeneous data. The Resource Description Framework (RDF) is the W3C recommended standard for data representation, yet data represented and stored using the Extensible Markup Language (XML) is almost ubiquitous and remains the standard for data exchange. While RDF has a standard XML representation, XML Query languages are of limited use for transformations between natively stored RDF data and XML. Being able to work with both XML and RDF data using a common framework would be a great advantage and eliminate unnecessary intermediate steps that are currently used when handling both formats.

Patents

Lukas Krammer, Stefan Bischof, Andreas Fernbach, Josiane Xavier Parreira Computer-implemented Method for Applying New Software-controlled Services to a Building Control System. 2023. Patent: EP4446825A1. [ Espacenet | EPO Register | Details ]
Josiane Xavier Parreira, Stefan Bischof, Andreas Fernbach, Lukas Krammer Method for Automatically Computing the Automation Capabilities of a Building. 2023. Patent: EP4439201A1. [ Espacenet | Espacenet (US) | EPO Register | Details ]
Stefan Bischof, Lukas Krammer, Daniel Lechner, Josef Wechselauer Building Automation Device and Method. 2019. Patent: EP3637204A1. [ Espacenet | Details ]
Axel Polleres, Stefan Bischof Computer implemented method for integrating data from the Web from different sources by using SPARQL. 2013. Patent: EP2787453A1. [ Espacenet | EPO Register | Details ]

Theses

Stefan Bischof Complementary Methods for the Enrichment of Linked Data. Vienna University of Technology, Austria. 2017. (PhD thesis). [ DOI | Rigorosum slides | Details ]
Abstract
Data published in accordance with Semantic Web standards and Linked Data principles constitutes a prime source of openly available data ready for analysis in a unified format. Even though the main use case of Semantic Web technologies is data integration, in practice getting comparable data is not trivial; that is, heterogeneity problems and challenges arising from incomplete data prevail despite syntactic homogeneity. The use case we focus on in this thesis revolves around comparing statistical data about cities found on the Web in (semi-)structured form and integrated as Linked Data. Firstly, we evaluate different data sources and eventually integrate suitable datasets using Semantic Web technologies and RDF. Here, the work specifically addresses the challenges of getting complete data from SPARQL endpoints, for instance with respect to OWL entailment regimes. However, we come to the conclusion that OWL inference alone is insufficient for resolving incompleteness and heterogeneity problems, especially for numerical data. To this end, we develop methods to infer missing numerical data exploiting statistical methods and equational knowledge. Lastly, we discuss combinations of these methods, i.e. we develop a combined approach for integrating rule-based and statistical methods for Linked Data enrichment.
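The combined enrichment idea from the thesis abstract can be illustrated with a minimal sketch (the city records and numbers below are made up for illustration): first try to derive a missing indicator from equational knowledge, here the relation density = population / area solved for whichever value is absent, and only fall back to a statistical estimate (here a simple mean over the known values, standing in for the machine learning step) when no equation applies.

```python
from statistics import mean

# Hypothetical city records with missing values (None).
cities = {
    "A": {"population": 1_800_000, "area_km2": 415, "density": None},
    "B": {"population": None, "area_km2": 310, "density": 4200.0},
    "C": {"population": 500_000, "area_km2": None, "density": None},
}

def impute_equational(city):
    """Solve density = population / area for the one missing value, if possible."""
    pop, area, dens = city["population"], city["area_km2"], city["density"]
    if dens is None and pop is not None and area is not None:
        city["density"] = pop / area
    elif pop is None and dens is not None and area is not None:
        city["population"] = dens * area
    elif area is None and dens is not None and pop is not None:
        city["area_km2"] = pop / dens

for c in cities.values():
    impute_equational(c)

# Statistical fallback for values no equation could derive
# (a stand-in for the regression models used in the thesis).
known = [c["density"] for c in cities.values() if c["density"] is not None]
for c in cities.values():
    if c["density"] is None:
        c["density"] = mean(known)
```

City A's density and city B's population come out of the equational step exactly; city C, where two of the three values are missing, is the case only the statistical step can cover — which is precisely why the thesis argues for combining the two.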
Stefan Bischof Implementation and Optimisation of Queries in XSPARQL. Vienna University of Technology, Austria. 2010. (Master's thesis). [ metadata | PDF | Details ]
Abstract
XSPARQL is a language for transforming data between XML and RDF. XML is a widely used format for data exchange. RDF is a data format based on directed graphs, primarily used to represent Semantic Web data. XSPARQL is built by combining the strengths of the two corresponding query languages: XQuery for XML, and SPARQL for RDF. In this thesis we present two XSPARQL enhancements called Constructed Dataset and Dataset Scoping, the XDEP dependent join optimisation, and a new XSPARQL implementation. Constructed Dataset allows creating and querying intermediary RDF graphs. The Dataset Scoping enhancement provides an optional fix for unintended results which may occur when evaluating complex XSPARQL queries containing nested SPARQL query parts. The XSPARQL implementation works by first rewriting an XSPARQL query to XQuery expressions containing interleaved calls to a SPARQL engine for processing RDF data. The resulting query is then evaluated by standard XQuery and SPARQL engines. The dependent join optimisation XDEP is designed to reduce query evaluation time for queries demanding repeated evaluation of embedded SPARQL query parts. XDEP minimises the number of interactions between the XQuery and SPARQL engines by bundling similar queries and letting the XQuery engine select relevant data on its own. We performed an experimental evaluation of our approach using an adapted version of the XQuery benchmark suite XMark. We show that the XDEP optimisation reduces the evaluation time of all compatible benchmark queries. Using this optimisation we could evaluate certain XSPARQL queries up to two orders of magnitude faster than with unoptimised XSPARQL.
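The core of the dependent-join idea behind XDEP can be shown with a toy sketch (the engine stand-in, data, and function names are hypothetical, not the thesis implementation): instead of issuing one SPARQL call per item of the outer XQuery iteration, all outer bindings are bundled into a single call, and the join is completed on the client side.

```python
CALLS = {"count": 0}

# Stand-in for a SPARQL engine: maps authors to their papers.
STORE = {"alice": ["p1", "p2"], "bob": ["p3"]}

def sparql_engine(authors):
    """One round trip to the engine, regardless of how many bindings are passed."""
    CALLS["count"] += 1
    return {a: STORE.get(a, []) for a in authors}

def naive_join(outer):
    # One engine call per outer item (what the unoptimised rewriting does).
    return {a: sparql_engine([a])[a] for a in outer}

def bundled_join(outer):
    # XDEP-style: a single bundled call, then a client-side dependent join.
    results = sparql_engine(outer)
    return {a: results[a] for a in outer}

outer = ["alice", "bob"]

CALLS["count"] = 0
naive_result = naive_join(outer)
naive_calls = CALLS["count"]      # 2 round trips

CALLS["count"] = 0
bundled_result = bundled_join(outer)
bundled_calls = CALLS["count"]    # 1 round trip

print(naive_calls, bundled_calls)  # -> 2 1
```

Both variants produce the same join result; the optimisation only changes how many times the boundary between the two engines is crossed, which is what dominates evaluation time for queries with embedded SPARQL parts.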

Other

Stefan Bischof, Benedikt Kämpgen, Andreas Harth, Axel Polleres, Patrik Schneider Open City Data Pipeline. Working Papers on Information Systems, Information Business and Operations 01/2017, Department für Informationsverarbeitung und Prozessmanagement, WU Vienna University of Economics and Business, Vienna, February 2017. 2017. [ http | Details  ]
Abstract
Statistical data about cities, regions, and countries is collected for various purposes and by various institutions. Yet, while access to high quality and recent such data is crucial both for decision makers and for the public, all too often such collections of data remain isolated and not re-usable, let alone properly integrated. In this paper we present the Open City Data Pipeline, a focused attempt to collect, integrate, and enrich statistical data collected at city level worldwide, and republish this data in a reusable manner as Linked Data. The main features of the Open City Data Pipeline are: (i) we integrate and cleanse data from several sources in a modular and extensible, always up-to-date fashion; (ii) we use both Machine Learning techniques as well as ontological reasoning over equational background knowledge to enrich the data by imputing missing values; (iii) we assess the estimated accuracy of such imputations per indicator. Additionally, (iv) we make the integrated and enriched data available both in a web browser interface and as machine-readable Linked Data, using standard vocabularies such as QB and PROV, and linking to e.g. DBpedia. Lastly, in an exhaustive evaluation of our approach, we compare our enrichment and cleansing techniques to a preliminary version of the Open City Data Pipeline presented at ISWC 2015: firstly, we demonstrate that the combination of equational knowledge and standard machine learning techniques significantly helps to improve the quality of our missing value imputations; secondly, we arguably show that the more data we integrate, the more reliable our predictions become. Hence, over time, the Open City Data Pipeline shall provide a sustainable effort to serve Linked Data about cities in increasing quality.
Stefan Bischof, Markus Krötzsch, Axel Polleres, Sebastian Rudolph Schema-Agnostic Query Rewriting in SPARQL 1.1: Technical report. 2014. [ PDF | Details ]

About this list

This page lists all publications by Stefan Bischof, grouped by type. For more information, see the homepage.