Master theses

SPARQL Update Propagation Through Views

Keywords: Linked Data, Querying, RDF, SPARQL, Web development, decentralization, update, views

Promotors: Ruben Verborgh, Ruben Taelman

Students: max 1

Problem

The SPARQL query language is the de facto standard for querying RDF data, typically using either SELECT queries (which return variable bindings) or CONSTRUCT queries (which generate new RDF data). CONSTRUCT queries can be used to create views, where a view is defined as a CONSTRUCT query over a set of materialized sources.
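To make the notion of a view concrete, the following is a minimal sketch in plain Python rather than a real SPARQL engine. Triples are modeled as tuples of strings, and the CONSTRUCT query is evaluated by hand. The `ex:` terms, the source names, and the function `materialize_view` are hypothetical and only serve as illustration.

```python
# Triples are modeled as (subject, predicate, object) tuples of plain strings.
# All IRIs and source names in this sketch are hypothetical.
Triple = tuple[str, str, str]

# The view that this sketch imitates, written as a SPARQL CONSTRUCT query.
VIEW_QUERY = """
PREFIX ex: <http://example.org/>
CONSTRUCT { ?p ex:label ?name . ?p ex:price ?price . }
WHERE     { ?p ex:name ?name . ?p ex:listPrice ?price . }
"""


def materialize_view(sources: dict[str, set[Triple]]) -> set[Triple]:
    """Hand-written evaluation of VIEW_QUERY over the union of all sources."""
    merged: set[Triple] = set().union(*sources.values())
    names = [(s, o) for s, p, o in merged if p == "ex:name"]
    prices = [(s, o) for s, p, o in merged if p == "ex:listPrice"]
    view: set[Triple] = set()
    # Join the two triple patterns on the shared variable ?p.
    for product, name in names:
        for priced_product, price in prices:
            if product == priced_product:
                view.add((product, "ex:label", name))
                view.add((product, "ex:price", price))
    return view


example_sources = {
    "catalog": {("ex:p1", "ex:name", "Lamp")},
    "pricing": {("ex:p1", "ex:listPrice", "20"), ("ex:p2", "ex:listPrice", "5")},
}
print(sorted(materialize_view(example_sources)))
```

In this sketch, `ex:p2` does not appear in the view because it has a price but no name, so the join in the WHERE clause has no solution for it.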

When considering RDF web interfaces, a single interface can be defined either as a canonical interface or as a view over other views and canonical interfaces. In many scenarios, users interact with these views and may want to update the view’s state by modifying the underlying data. For example, in an e-commerce platform, product catalogs are often displayed as views aggregating data from multiple sources, including product descriptions, pricing, and availability. A vendor or administrator may want to update a view by adjusting the promotional price or modifying the inventory levels of products. These changes need to be reflected in the underlying sources such as the manufacturer’s database or the inventory system.

The challenge lies in ensuring that updates made on the view (e.g., modifying product details in the catalog) are correctly translated into updates on the underlying data sources. This translation must be efficient, preserve the integrity of the data, and respect the restrictions that the sources impose: a source may reject an update because of access control or data ownership policies. A key part of this research will therefore focus on developing algorithms that allow the query engine to propagate updates from views to underlying sources while ensuring both consistency and correctness.
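As an illustration of what such a translation could look like in the simplest case, the sketch below assumes a view in which every view predicate is a renamed copy of one predicate in one source. This is a strong simplification that is not taken from the thesis description, and `VIEW_MAPPING`, `SourceUpdate`, and `translate_update` are hypothetical names. The translation builds a complete plan before anything is applied, so an update that touches a non-writable source is rejected as a whole.

```python
from dataclasses import dataclass, field

Triple = tuple[str, str, str]

# Hypothetical view definition: each view predicate is a renamed copy of
# exactly one predicate in exactly one named source.
VIEW_MAPPING = {
    "ex:label": ("catalog", "ex:name"),
    "ex:price": ("pricing", "ex:listPrice"),
}


@dataclass
class SourceUpdate:
    """Triples to delete from and insert into a single source."""
    delete: set[Triple] = field(default_factory=set)
    insert: set[Triple] = field(default_factory=set)


def translate_update(
    view_delete: set[Triple],
    view_insert: set[Triple],
    writable: set[str],
) -> dict[str, SourceUpdate]:
    """Translate a view-level delete/insert into one update per source.

    Raises ValueError for triples the view cannot produce, and
    PermissionError if a required source is not in `writable`.
    """
    plan: dict[str, SourceUpdate] = {}
    for triples, kind in ((view_delete, "delete"), (view_insert, "insert")):
        for s, p, o in triples:
            if p not in VIEW_MAPPING:
                raise ValueError(f"predicate {p} is not produced by the view")
            source, source_predicate = VIEW_MAPPING[p]
            if source not in writable:
                raise PermissionError(f"source {source} does not accept updates")
            update = plan.setdefault(source, SourceUpdate())
            getattr(update, kind).add((s, source_predicate, o))
    return plan


# Changing a price in the view becomes a delete/insert pair on "pricing".
plan = translate_update(
    view_delete={("ex:p1", "ex:price", "20")},
    view_insert={("ex:p1", "ex:price", "15")},
    writable={"pricing"},
)
print(plan)
```

Views with joins, filters, or aggregation generally do not admit such a direct mapping, which is where the actual research problem lies.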

Additionally, the research could explore situations where the sources may require extra context to process updates. For instance, when a product’s price is updated in the view, the query engine may need to provide additional context (e.g., promotional discounts or stock data) to the underlying data source to process the update correctly. The goal is to ensure that necessary data is available when required and that updates are handled efficiently across distributed systems.
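One possible way to picture this context handling is sketched below. It assumes, purely for illustration, that each source declares which predicates it wants to see about the subject of an update, and that the query engine looks these up among the triples it already knows. The declaration format `REQUIRED_CONTEXT` and the function `collect_context` are hypothetical.

```python
Triple = tuple[str, str, str]

# Hypothetical declaration: for each source, the predicates it wants to see
# about the subject of an update before it can process that update.
REQUIRED_CONTEXT = {
    "pricing": {"ex:discount", "ex:stock"},
}


def collect_context(
    source: str,
    update: set[Triple],
    known: set[Triple],
) -> tuple[set[Triple], set[tuple[str, str]]]:
    """Gather the context a source asked for.

    Returns the context triples found in `known`, plus the
    (subject, predicate) pairs that were requested but could not be found.
    """
    needed = REQUIRED_CONTEXT.get(source, set())
    subjects = {s for s, _, _ in update}
    context = {t for t in known if t[0] in subjects and t[1] in needed}
    found = {(s, p) for s, p, _ in context}
    missing = {(s, p) for s in subjects for p in needed} - found
    return context, missing


known_triples = {
    ("ex:p1", "ex:discount", "0.1"),
    ("ex:p1", "ex:label", "Lamp"),
    ("ex:p2", "ex:stock", "3"),
}
context, missing = collect_context(
    "pricing", {("ex:p1", "ex:listPrice", "15")}, known_triples
)
print(context, missing)
```

Reporting the missing pairs separately lets the engine decide whether to fetch more data, send the update anyway, or reject it. Which of these is appropriate is one of the open questions of the thesis.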

Goal

This thesis will focus on designing algorithms to translate updates from RDF views to their underlying sources, ensuring accurate and efficient update propagation. Specifically, the student will:

  1. Develop algorithms that translate updates on RDF views into corresponding updates on the underlying sources.
  2. Investigate how the query engine can handle context data that the underlying sources may need to process updates.
  3. Implement the translation algorithms.
  4. Evaluate the performance and correctness of the algorithms in real-world scenarios to ensure that updates are handled efficiently and as expected.
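For the correctness part of the evaluation, one candidate criterion (an assumption here, borrowed from the general view-update setting) is a round-trip check: after the translated updates are applied to the sources, re-materializing the view should give the same result as applying the update directly to the old view. The sketch below shows this check for a hypothetical renaming-only view. The names `VIEW_MAPPING`, `materialize`, `propagate`, and `round_trip_ok` are illustrative.

```python
Triple = tuple[str, str, str]

# Hypothetical renaming-only view: view predicate -> (source, source predicate).
VIEW_MAPPING = {
    "ex:label": ("catalog", "ex:name"),
    "ex:price": ("pricing", "ex:listPrice"),
}


def materialize(sources: dict[str, set[Triple]]) -> set[Triple]:
    """Compute the view by renaming the mapped source predicates."""
    inverse = {(src, sp): vp for vp, (src, sp) in VIEW_MAPPING.items()}
    return {
        (s, inverse[(name, p)], o)
        for name, triples in sources.items()
        for s, p, o in triples
        if (name, p) in inverse
    }


def propagate(
    sources: dict[str, set[Triple]],
    view_delete: set[Triple],
    view_insert: set[Triple],
) -> dict[str, set[Triple]]:
    """Return updated copies of the sources; the input is left untouched."""
    updated = {name: set(triples) for name, triples in sources.items()}
    for s, p, o in view_delete:
        source, source_predicate = VIEW_MAPPING[p]
        updated.setdefault(source, set()).discard((s, source_predicate, o))
    for s, p, o in view_insert:
        source, source_predicate = VIEW_MAPPING[p]
        updated.setdefault(source, set()).add((s, source_predicate, o))
    return updated


def round_trip_ok(
    sources: dict[str, set[Triple]],
    view_delete: set[Triple],
    view_insert: set[Triple],
) -> bool:
    """Check that propagating and re-materializing matches the intended view."""
    expected = (materialize(sources) - view_delete) | view_insert
    actual = materialize(propagate(sources, view_delete, view_insert))
    return expected == actual


sources = {
    "catalog": {("ex:p1", "ex:name", "Lamp")},
    "pricing": {("ex:p1", "ex:listPrice", "20")},
}
print(round_trip_ok(
    sources,
    view_delete={("ex:p1", "ex:price", "20")},
    view_insert={("ex:p1", "ex:price", "15")},
))
```

For richer views the check may fail for some updates, and the failing cases are useful test inputs for the translation algorithms.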

By enabling seamless updates from RDF views to their underlying sources, this research will make it easier to manage complex views, ultimately enhancing the usability and performance of query engines. Additionally, this work is crucial to ongoing research describing the relation between writes and reads in RDF data interfaces.