Master theses

Indexing personal data pods using automated data shape generation

Keywords: automated generation, data pod, indexing, shapes

Supervision: Pieter Colpaert

Students: max 1

To counter the centralization of personal data in large hubs such as Facebook and Google, data pods let users store their personal data in a self-governed environment. However, this results in a decentralized network of data pods, which makes querying across large numbers of pods more difficult. To speed up this querying process, we can make use of indexing metadata present in the data pod, and use the data shapes available in the pod to predict the locations of relevant data.

In this thesis, we want to investigate whether automated shape generation technologies can be used to add and index shape information for newly added data files, in order to speed up the querying process. Concretely, you start by researching current technologies for automated shape generation. After completing this literature study, the goal is to create an implementation that automatically tries to generate data shapes for new data being added to the pod. Finally, you try to incorporate these generated data shapes into the existing indexing structures present in the data pod, and create a small demonstration of the work done.
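To give a rough idea of what automated shape generation could look like, the sketch below infers a very simple shape per class from a handful of triples: for each rdf:type it records which predicates occur and how often per subject. This is only an illustration under stated assumptions, not the approach the thesis prescribes. The function name infer_shapes, the plain tuple representation of triples, and the example data are all hypothetical; a real implementation would likely parse RDF with a library and emit SHACL or ShEx.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def infer_shapes(triples):
    """Infer a naive shape per class: predicate -> min/max count per subject.

    Hypothetical helper for illustration; triples are (subject, predicate,
    object) string tuples rather than parsed RDF terms.
    """
    types = defaultdict(set)                       # subject -> classes
    counts = defaultdict(lambda: defaultdict(int)) # subject -> predicate -> count
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
        else:
            counts[s][p] += 1

    # Group subjects by the classes they are typed with.
    subjects_by_class = defaultdict(list)
    for s, classes in types.items():
        for cls in classes:
            subjects_by_class[cls].append(s)

    shapes = {}
    for cls, subjects in subjects_by_class.items():
        predicates = set()
        for s in subjects:
            predicates.update(counts[s])
        # A predicate missing on some subject yields a minimum count of 0.
        shapes[cls] = {
            p: {
                "min": min(counts[s].get(p, 0) for s in subjects),
                "max": max(counts[s].get(p, 0) for s in subjects),
            }
            for p in sorted(predicates)
        }
    return shapes


# Made-up example data standing in for a file newly added to a pod.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:bob", "foaf:name", "Bob"),
]
shapes = infer_shapes(triples)
```

In this example, foaf:name would come out as required exactly once and foaf:knows as optional. Real shape generation tools handle datatypes, value ranges, and noisy data, which is part of what the literature study should cover.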

Your task is to:

  • Research technologies enabling automated shape generation.
  • Automatically generate data shapes for files added to a personal data pod.
  • Add generated shape information to existing indexing structures.
  • Possibly: Generate an overview of the data pod using the available shape information.
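As a minimal sketch of the last two tasks, the example below keeps an in-memory index from shape (here identified by class name) to the files that contain matching data, and derives a small overview from it. It assumes shapes have already been generated for each file. The ShapeIndex class, its methods, and the file paths are hypothetical and do not correspond to any existing pod API or index format; an actual pod would persist such an index as a resource inside the pod.

```python
from collections import defaultdict


class ShapeIndex:
    """Hypothetical in-memory index: shape class -> files containing such data."""

    def __init__(self):
        self._files_by_class = defaultdict(set)

    def add_file(self, path, shape_classes):
        # Register a newly added file under every shape class found in it.
        for cls in shape_classes:
            self._files_by_class[cls].add(path)

    def lookup(self, cls):
        # Candidate files for a query targeting this class, in sorted order.
        return sorted(self._files_by_class.get(cls, ()))

    def overview(self):
        # Number of files per shape class, as a simple summary of the pod.
        return {cls: len(paths) for cls, paths in sorted(self._files_by_class.items())}


# Made-up paths and classes, for illustration only.
index = ShapeIndex()
index.add_file("/profile/card.ttl", ["foaf:Person"])
index.add_file("/contacts/bob.ttl", ["foaf:Person"])
index.add_file("/photos/trip.ttl", ["schema:Photograph"])

person_files = index.lookup("foaf:Person")
summary = index.overview()
```

A query engine could consult lookup to fetch only the files likely to contain relevant data instead of traversing the whole pod, which is the speed-up this thesis aims to evaluate.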