Master theses

Optimization of Daisy-Chained SPARQL Queries

Promotor: Ruben Taelman

Main contact: Ruben Taelman

Problem

SPARQL is the de facto standard for querying RDF data, with SELECT and CONSTRUCT queries being commonly used. While SELECT queries return a table of variable bindings, CONSTRUCT queries generate new RDF data, effectively acting as views over the original dataset.
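The difference between the two query forms can be illustrated with a toy model. The sketch below is not a SPARQL engine: it assumes a graph is a plain Python set of string triples, that a query has a single triple pattern, and that variables are strings starting with "?". The function names (match, select, construct) and the ex: terms are hypothetical and chosen only for illustration.

```python
# Toy model (an assumption for illustration, not real SPARQL):
# a graph is a set of (subject, predicate, object) string triples,
# and a variable is any term that starts with "?".

def match(pattern, triple):
    """Return the variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable that occurs twice must bind to the same value.
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return bindings

def select(graph, pattern):
    """SELECT-like: return one row of variable bindings per matching triple."""
    rows = []
    for triple in sorted(graph):  # sorted only to make the row order deterministic
        bindings = match(pattern, triple)
        if bindings is not None:
            rows.append(bindings)
    return rows

def construct(graph, pattern, template):
    """CONSTRUCT-like: instantiate the template once per solution, giving a new graph."""
    return {tuple(row.get(term, term) for term in template)
            for row in select(graph, pattern)}

graph = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
}

# A table of bindings for ?s and ?o.
rows = select(graph, ("?s", "ex:knows", "?o"))

# A new graph that acts as a view over the original one.
view = construct(graph, ("?s", "ex:knows", "?o"), ("?o", "ex:knownBy", "?s"))
```

Here rows is a list of dictionaries (the table), whereas view is again a set of triples and can therefore serve as input to another query.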

When multiple CONSTRUCT queries are chained together, each query produces an intermediate dataset that the next query consumes, which can introduce inefficiencies. A naïve execution of such a chain (A -- construct 1 -> B -- construct 2 -> C) leads to redundant computation and unnecessary data materialization, reducing performance. Despite the importance of optimizing non-materialized views in SPARQL, there is currently no systematic approach for algebraically rewriting such a query chain into a single, more efficient CONSTRUCT query.
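To make the chain A -> B -> C concrete, the following sketch compares a naïve execution that materializes B with a single fused query that goes from A to C directly. It is a deliberately simplified illustration under strong assumptions: graphs are Python sets of string triples, each query has exactly one triple pattern and one template triple, and the fuse helper only handles the trivial case of substituting variables. All names (match, construct, fuse) and the ex:/foaf: terms are hypothetical; this is not the approach the thesis is expected to deliver, only an indication of what a rewriting could look like in the simplest case.

```python
# Toy model (an assumption for illustration, not real SPARQL):
# a query is a (pattern, template) pair of triples; variables start with "?".

def match(pattern, triple):
    """Return the variable bindings if the triple matches the pattern, else None."""
    bindings = {}
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            if bindings.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return bindings

def construct(graph, pattern, template):
    """CONSTRUCT-like: instantiate the template for every matching triple."""
    result = set()
    for triple in graph:
        bindings = match(pattern, triple)
        if bindings is not None:
            result.add(tuple(bindings.get(term, term) for term in template))
    return result

def fuse(q1, q2):
    """Rewrite the chain q1 -> q2 into one query by substituting q1's template
    terms for q2's pattern variables. Cases beyond that are rejected."""
    (pattern1, template1), (pattern2, template2) = q1, q2
    subst = {}
    for p2, t1 in zip(pattern2, template1):
        if p2.startswith("?"):
            if subst.setdefault(p2, t1) != t1:
                raise ValueError("repeated variable: outside this sketch")
        elif p2 != t1:
            raise ValueError("constant mismatch: outside this sketch")
    return pattern1, tuple(subst.get(term, term) for term in template2)

A = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
}

# construct 1: map ex:knows to foaf:knows.
q1 = (("?s", "ex:knows", "?o"), ("?s", "foaf:knows", "?o"))
# construct 2: invert foaf:knows into ex:knownBy.
q2 = (("?x", "foaf:knows", "?y"), ("?y", "ex:knownBy", "?x"))

# Naive execution: B is fully materialized before C is computed.
B = construct(A, *q1)
C = construct(B, *q2)

# Fused execution: one query over A, no intermediate dataset.
fused = fuse(q1, q2)
C_fused = construct(A, *fused)
```

In this example, fused is the single query with pattern (?s, ex:knows, ?o) and template (?o, ex:knownBy, ?s), and C_fused equals C without B ever being built. Real CONSTRUCT queries involve multiple patterns, filters, optionals, and blank nodes, which is where a systematic algebraic treatment becomes necessary.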

This inefficiency also impacts RDF-based data interfaces, where read interfaces can be described as views over enriched datasets derived from multiple write interfaces. Without optimization, the description and execution of these interfaces remain suboptimal.