Master theses

Context-aware data processing for Digital Heritage

Keywords: RDF, context-based data processing, Linked Open Data, data technology, digital heritage, policies, reasoning, web technology

Promotors: Pieter Colpaert, Ruben Dedecker

Students: max 1

Problem

In the heritage world, digitization creates a wealth of information that is shared with the world, so that current and future generations can discover the past. These digitization efforts produce vast amounts of data, published by institutions all around the world.

As digital heritage information from different institutions is published, inconsistencies are bound to appear in the available data. These range from differences in granularity (born in Belgium versus born in Ghent; born in 1845 versus born on September 7th, 1845) to outright contradictions between sources.

Goal

In digital heritage, as in many other use cases, contextual prioritisation of information is a difficult challenge. With your work, we want to assess the feasibility of using policies to process data from different sources while prioritising specific contexts.

In the first part of the thesis work, an analysis will be performed of the different ways RDF can be used to represent contextual information. This will start with a literature study, followed by a comparative analysis of the different methods of adding context: how they differ in semantic meaning, and what impact this has on how easily and how quickly the resulting data can be processed.
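To make the idea of contextual information concrete, the sketch below models one possible approach, named graphs, in plain Python: each statement is a quad whose fourth element identifies the source it came from. This is only an illustration and not the method the thesis will settle on; the `ex:` identifiers, the `Quad` type and the `values_by_context` helper are all hypothetical, and other approaches to adding context exist that the literature study would need to cover.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Quad(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str  # named graph identifier, used here as the context (the source)


# Hypothetical identifiers, for illustration only.
quads = [
    Quad("ex:person1", "ex:birthPlace", "ex:Belgium", "ex:graph/institutionA"),
    Quad("ex:person1", "ex:birthPlace", "ex:Ghent", "ex:graph/institutionB"),
    Quad("ex:person1", "ex:birthDate", "1845", "ex:graph/institutionA"),
    Quad("ex:person1", "ex:birthDate", "1845-09-07", "ex:graph/institutionB"),
]


def values_by_context(
    quads: List[Quad], subject: str, predicate: str
) -> Dict[str, List[str]]:
    """Group the objects stated for one subject/predicate pair by named graph."""
    result = defaultdict(list)
    for q in quads:
        if q.subject == subject and q.predicate == predicate:
            result[q.graph].append(q.obj)
    return dict(result)


by_context = values_by_context(quads, "ex:person1", "ex:birthDate")
```

Keeping the source as part of every statement is what later allows a policy to decide which of the two birth dates to prefer, instead of treating them as an unexplained conflict.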

In the second part of the thesis work, a proof-of-concept implementation will be developed. For this, test data will be made available from digital heritage institutions in Flanders as well as from the Linked.Art project at Yale. Based on the findings from part one, the test data will be converted to the RDF context representation that performed best in the comparative analysis. Using this data, a policy framework will be designed that enables the context-aware processing of RDF data from different sources. In a first iteration, each data source will be assigned a weight that defines its priority when data from different sources is processed. In a second iteration, these weights will be generated dynamically, based on aspects of the data and of the available data sources.
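As a rough illustration of the first iteration, the sketch below picks, among competing claims for the same property, the value that comes from the source with the highest weight. It is a minimal sketch under stated assumptions, not the policy framework itself: the source names, the weight values, and the `Claim` and `select_value` names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional


class Claim(NamedTuple):
    source: str  # identifier of the data source making the claim
    value: str   # the claimed value, e.g. a birth date


def select_value(
    claims: List[Claim],
    weights: Dict[str, float],
    default_weight: float = 0.0,
) -> Optional[str]:
    """Return the value claimed by the highest-weighted source.

    Sources without an explicit weight get default_weight. On a tie,
    the claim that appears first in the list wins.
    """
    if not claims:
        return None
    best = max(claims, key=lambda c: weights.get(c.source, default_weight))
    return best.value


# Hypothetical static weights per source (first iteration).
weights = {"institutionA": 0.4, "institutionB": 0.9}
claims = [
    Claim("institutionA", "1845"),
    Claim("institutionB", "1845-09-07"),
]
chosen = select_value(claims, weights)
```

In the second iteration, the static `weights` dictionary could be replaced by a function that computes a weight per claim, for instance from the granularity of the value or the context in which the data is requested; how to define such a function is part of the research question.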

Insights obtained from this thesis are valuable both for digital heritage institutions and for broader research on context-aware source selection and data processing.