Master theses

Can Query Engines Collaborate in Handling Distributed Linked Data Social Networks?

Promotors: Ruben Taelman

Main contact: Ruben Taelman

Problem

In distributed social networks, data may be stored in multiple locations that are not controlled by a single service provider. Additionally, this data may link to external sources. Because of this decentralized structure, and because maintaining a global index would be both a privacy risk and a maintenance challenge, the preferred query paradigm is Link Traversal Query Processing (LTQP). LTQP executes queries across distributed data sources by recursively expanding an internal dataset: it dereferences the links discovered in the data retrieved so far. However, this process can be slow and cannot guarantee complete results when querying the open web. Several approaches have been proposed to address these limitations, including:

- Restricting the search domain
- Optimizing source selection to improve traversal efficiency
- Applying heuristics for query planning
- Introducing structured environments to facilitate querying

These approaches typically assume that each query is executed by a single engine on behalf of a single user. However, in social networks, multiple users may issue similar or even identical queries. If query engines could collaborate, they could:

- Explore a larger portion of the search domain in less time
- Share relevant information to optimize query execution

This thesis investigates how query engines can collaborate to improve query execution time and result completeness in the context of LTQP.
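
To make the traversal loop and the idea of collaboration concrete, the following is a minimal illustrative sketch, not a proposed design for the thesis. It assumes a hypothetical in-memory `WEB` dictionary in place of real HTTP dereferencing, made-up `example` URLs, and a hypothetical `shared_cache` dictionary as the simplest possible form of information sharing between engines. A real LTQP engine would also evaluate the query incrementally while the dataset grows, which is omitted here.

```python
from collections import deque

# Hypothetical in-memory "web": document URL -> triples found in that document.
WEB = {
    "https://alice.example/profile": [
        ("https://alice.example/profile", "knows", "https://bob.example/profile"),
    ],
    "https://bob.example/profile": [
        ("https://bob.example/profile", "knows", "https://carol.example/profile"),
        ("https://bob.example/profile", "name", "Bob"),
    ],
    "https://carol.example/profile": [
        ("https://carol.example/profile", "name", "Carol"),
    ],
}


def traverse(seeds, shared_cache=None, max_documents=100):
    """Collect triples by following links, starting from the seed documents.

    Returns (triples, fetches), where fetches counts the documents that had
    to be dereferenced because they were not yet in shared_cache.
    """
    cache = shared_cache if shared_cache is not None else {}
    queue = deque(seeds)
    visited = set()
    dataset = set()
    fetches = 0
    while queue and len(visited) < max_documents:
        url = queue.popleft()
        if url in visited:
            continue
        visited.add(url)
        if url not in cache:
            cache[url] = WEB.get(url, [])  # stand-in for an HTTP dereference
            fetches += 1
        for triple in cache[url]:
            dataset.add(triple)
            # Every URL mentioned in a triple is a candidate link to follow.
            for term in triple:
                if term.startswith("https://") and term not in visited:
                    queue.append(term)
    return dataset, fetches


# Two engines that share one cache: the second engine reuses what the first fetched.
shared = {}
triples_a, fetches_a = traverse(["https://alice.example/profile"], shared)
triples_b, fetches_b = traverse(["https://bob.example/profile"], shared)
```

In this toy setup the second engine dereferences no documents itself, because every document it needs was already retrieved by the first. Which information engines should actually exchange, and how to do so without reintroducing the privacy risks of a global index, is left open by the problem statement above.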