Master theses

Building a W3C specification for querying time series at Web scale

Promotor: Pieter Colpaert

Main contact: Pieter Colpaert

Problem

Time-series data is central to mobility, environmental sensing, and IoT applications. Representing each observation as a separate RDF subgraph, however, repeats the same contextual triples (sensor, observed property, feature of interest, etc.) for every data point. The graph grows rapidly and performance degrades for common workloads such as temporal window queries. In SPARQL engines, this shows up as excessive I/O and expensive joins when scanning large spans of observations.

RDF Time Series Snippets (RDF TSS) proposes a lossless compaction approach: group contiguous observations into snippets and encode the per-point time/value pairs inside a single JSON-typed RDF literal, while stating the shared context only once per snippet. An optional JSON-LD context enables expansion back to standard RDF observations. This drastically reduces triple counts (reported ~97–99% fewer triples) and makes window retrieval faster and much more stable across multiple triple stores. It also introduces a trade-off: fine-grained point-level querying is no longer natively expressible in pure SPARQL without extra processing or extensions.
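The compaction idea can be illustrated with a minimal Python sketch. It is not the RDF TSS vocabulary or reference implementation: triples are modelled as plain tuples, the observation side uses SOSA terms, and every `ex:` term (`ex:Snippet`, `ex:from`, `ex:to`, `ex:points`) as well as the function names are hypothetical placeholders. Daily grouping is assumed, and timestamps are assumed to be ISO 8601 strings in UTC.

```python
import json
from collections import defaultdict

def to_observation_triples(obs):
    """One subgraph per observation: the context triples repeat for every point."""
    triples = []
    for i, (sensor, prop, ts, value) in enumerate(obs):
        node = f"ex:obs{i}"
        triples += [
            (node, "rdf:type", "sosa:Observation"),
            (node, "sosa:madeBySensor", sensor),
            (node, "sosa:observedProperty", prop),
            (node, "sosa:resultTime", ts),
            (node, "sosa:hasSimpleResult", value),
        ]
    return triples

def to_snippet_triples(obs):
    """One subgraph per (sensor, property, day): context once, points in one literal."""
    groups = defaultdict(list)
    for sensor, prop, ts, value in obs:
        day = ts[:10]  # "YYYY-MM-DD" prefix of the ISO timestamp
        groups[(sensor, prop, day)].append((ts, value))

    triples = []
    for i, ((sensor, prop, _day), points) in enumerate(sorted(groups.items())):
        node = f"ex:snippet{i}"
        points.sort()
        # All time/value pairs of the snippet are serialised into one JSON literal.
        literal = json.dumps([{"time": t, "value": v} for t, v in points])
        triples += [
            (node, "rdf:type", "ex:Snippet"),        # hypothetical class
            (node, "sosa:madeBySensor", sensor),
            (node, "sosa:observedProperty", prop),
            (node, "ex:from", points[0][0]),         # hypothetical bounds
            (node, "ex:to", points[-1][0]),
            (node, "ex:points", literal),            # hypothetical property
        ]
    return triples

if __name__ == "__main__":
    # Two days of hourly readings from one hypothetical sensor.
    obs = [
        ("ex:sensor1", "ex:temperature", f"2024-01-{d:02d}T{h:02d}:00:00Z", 20.0 + h)
        for d in (1, 2)
        for h in range(24)
    ]
    print(len(to_observation_triples(obs)), "observation triples")
    print(len(to_snippet_triples(obs)), "snippet triples")
```

In this toy layout the per-observation form costs five triples per point, while the snippet form costs six triples per day regardless of how many points the day contains. The exact ratio in a real dataset depends on the amount of shared context and on the chosen granularity.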

Goal

You will turn RDF TSS from a paper concept into a reusable, interoperable building block by:

  1. Evolving the RDF TSS specification
  2. Building a reference implementation (e.g., library + CLI) that can:
     • compact standard observation-style RDF into RDF TSS at configurable temporal granularities (daily…yearly),
     • optionally expand snippets back to observation RDF using the JSON-LD context,
     • answer complex time series questions.
  3. Creating a compliance and benchmarking framework so others can prove interoperability and quantify performance (graph size, query success rate, end-to-end window retrieval time including client-side extraction). The paper’s evaluation pattern (vary dataset size, temporal range, snippet granularity; run across multiple triple stores) can be used as a starting point and strengthened into a reproducible benchmark suite.
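As a rough illustration of what the reference implementation and the compliance checks might cover, the following Python sketch shows compaction at a configurable granularity, expansion back to individual points, and a window query with client-side extraction. All names (`compact`, `expand`, `window`, `GRANULARITY_PREFIX`) and the dictionary layout of a snippet are assumptions made for this example, not part of RDF TSS. Timestamps are assumed to be ISO 8601 strings in UTC with a trailing `Z`, so that string prefixes identify the day, month, or year.

```python
import json
from datetime import datetime

# Hypothetical mapping from granularity to the ISO timestamp prefix used as bucket key.
GRANULARITY_PREFIX = {"daily": 10, "monthly": 7, "yearly": 4}

def parse(ts):
    """Parse an ISO 8601 UTC timestamp such as '2024-01-02T06:00:00Z'."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def compact(points, granularity="daily"):
    """Group (timestamp, value) pairs into snippets, one per time bucket."""
    n = GRANULARITY_PREFIX[granularity]
    buckets = {}
    for ts, value in sorted(points):
        buckets.setdefault(ts[:n], []).append([ts, value])
    return [
        {"from": pts[0][0], "to": pts[-1][0], "points": json.dumps(pts)}
        for _, pts in sorted(buckets.items())
    ]

def expand(snippets):
    """Recover the individual (timestamp, value) pairs from the JSON literals."""
    out = []
    for s in snippets:
        out.extend((ts, v) for ts, v in json.loads(s["points"]))
    return out

def window(snippets, start, end):
    """Return the points with start <= time < end."""
    lo, hi = parse(start), parse(end)
    hits = []
    for s in snippets:
        # Snippet-level pruning: the part a triple store could do on the bounds.
        if parse(s["to"]) < lo or parse(s["from"]) >= hi:
            continue
        # Point-level filtering: the client-side extraction step.
        for ts, v in json.loads(s["points"]):
            if lo <= parse(ts) < hi:
                hits.append((ts, v))
    return hits

if __name__ == "__main__":
    points = [
        (f"2024-01-{d:02d}T{h:02d}:00:00Z", float(d * 100 + h))
        for d in (1, 2, 3)
        for h in range(24)
    ]
    daily = compact(points, "daily")
    # A compliance check could assert that compaction followed by expansion is lossless.
    assert expand(daily) == sorted(points)
    print(len(daily), "daily snippets,", len(compact(points, "monthly")), "monthly snippet")
    print(window(daily, "2024-01-02T00:00:00Z", "2024-01-02T06:00:00Z"))
```

The round-trip assertion is the kind of property a compliance suite could verify for any implementation, and timing the `window` function separately from the store query is one way to account for client-side extraction in the end-to-end measurements.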