Julián's trip report for ISWC 2024
You have probably heard that ISWC was not a very good conference, and I am afraid that most of these comments and opinions are well founded. Unfortunately, the conference’s organization was not the best, with very basic mistakes such as not making sure that the venue’s rooms were reserved for the whole duration of the conference (in their defense, organizing a conference is not easy). More importantly, the content of the conference was heavily dominated by LLM-related papers that are not necessarily all that interesting for our research. Most concerning of all, some of these papers may actually be of low quality and contain basic mistakes in their machine learning aspects, as pointed out by some of our colleagues from IDLAB PREDiCT, who are actual experts in the machine learning domain. One explanation might be that the typical ISWC reviewer base lacks the expertise to properly judge the technical aspects of these papers, which allows low-quality work to slip through.
Despite all of this, I want to share with you a few interesting resources I found at the conference:
LLMs and KGs
I know, I just said that many of these may not be very good. However, LLMs are here to stay and we need to know how to integrate our research with or next to them. So here are a few papers that I still found interesting:
- CyKG-RAG: Towards knowledge-graph enhanced retrieval augmented generation for cybersecurity – This paper presents an approach for querying and extracting information from a KG through an LLM. Not sure this is the best setup, but it is interesting to check nonetheless.
- Knowledge Graph-Enhanced Retrieval Augmented Generation for E-Commerce – A paper by eBay researchers in which they use an LLM to extract product attributes from natural-language descriptions and then perform entity matching against a KG hosted in their graph DB, called NuGraph. An interesting aspect of this paper is that it shows how this type of system could be evaluated.
- Leveraging Large Language Models to Identify Event-Driven Changes in Wikidata Entities – In this paper they present EventKG, a KG containing event information captured from Wikidata, and use an LLM to provide natural-language explanations of the cause of an event and of its potential causal relations to previous events.
KG querying
This was definitely the high point of the conference for me: the work presented by Prof. Domagoj Vrgoč from Chile’s Pontificia Universidad Católica, both through a tutorial on “Recent advances in Graph Data Management” and through the paper titled “PathFinder: Returning Paths in Graph Queries”, which won the Best (Student) Paper award of the research track.
The tutorial started with an introduction to graph data models and graph databases (RDF and Property Graphs) and the differences between them. It then gave a very interesting explanation of the theory behind traditional graph join algorithms and worst-case optimal join algorithms. In particular, the Leapfrog Triejoin algorithm could be very interesting to explore as an alternative in Comunica. See the slides for more details.
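To give a feel for the idea, here is a minimal Python sketch of the "leapfrog" intersection at the core of Leapfrog Triejoin, applied variable by variable to a triangle query over a single edge relation. This is my own simplified illustration, not code from the tutorial or from Comunica: the function names (`leapfrog_intersect`, `triangles`) are made up, and plain sorted lists stand in for the trie iterators a real implementation would use.

```python
from bisect import bisect_left
from collections import defaultdict


def leapfrog_intersect(lists):
    """Intersect sorted, duplicate-free lists by repeatedly seeking
    each list forward to the largest value seen so far."""
    if not lists or any(len(lst) == 0 for lst in lists):
        return []
    pos = [0] * len(lists)
    out = []
    target = max(lst[0] for lst in lists)  # no smaller value can match
    agree = 0  # how many lists in a row are positioned on target
    i = 0
    while True:
        lst = lists[i]
        # Seek: binary search for the first value >= target.
        p = bisect_left(lst, target, pos[i])
        if p == len(lst):
            return out  # one list is exhausted, no more matches
        pos[i] = p
        if lst[p] == target:
            agree += 1
            if agree >= len(lists):
                out.append(target)
                pos[i] += 1  # move past the match
                if pos[i] == len(lst):
                    return out
                target = lst[pos[i]]
                agree = 1
        else:
            target = lst[p]  # overshoot: this becomes the new target
            agree = 1
        i = (i + 1) % len(lists)


def triangles(edges):
    """Answer Q(a, b, c) :- E(a, b), E(b, c), E(a, c) one variable at a time."""
    succ = defaultdict(set)
    for a, b in edges:
        succ[a].add(b)
    succ = {a: sorted(bs) for a, bs in succ.items()}
    sources = sorted(succ)
    result = []
    for a in sources:
        # b must be a successor of a and have outgoing edges itself.
        for b in leapfrog_intersect([succ[a], sources]):
            # c must be a successor of both b and a.
            for c in leapfrog_intersect([succ[b], succ[a]]):
                result.append((a, b, c))
    return result


example_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4)]
found = triangles(example_edges)
```

The point of the sketch is that each variable is bound by intersecting all the relations that constrain it, instead of joining two relations at a time and materializing a potentially large intermediate result.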
The paper presents an efficient approach for evaluating Regular Path Queries (as in SPARQL property path queries) that also returns the actual matching paths in the graph, a feature that has long been missing from SPARQL. The approach is implemented in their own graph database, MillenniumDB, which supports both SPARQL and GQL.
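To make "returning paths" a bit more concrete, here is a small Python sketch that evaluates a regular path query by breadth-first search over pairs of (graph node, automaton state) and keeps back-pointers so that one shortest matching path per reachable node can be rebuilt. This is only my illustration of the general idea, not the PathFinder implementation: the function names, the dictionary encoding of the automaton, and the toy graph are all assumptions.

```python
from collections import deque


def rebuild_path(parent, key):
    """Follow back-pointers from a (node, state) pair to the start."""
    path = []
    while parent[key] is not None:
        prev, label = parent[key]
        path.append((prev[0], label, key[0]))
        key = prev
    path.reverse()
    return path


def shortest_matching_paths(edges, transitions, start_state, accepting, source):
    """Return {node: path} with one shortest path from source to each node
    whose sequence of edge labels is accepted by the automaton.

    edges: iterable of (subject, label, object) triples
    transitions: dict mapping (state, label) to a set of next states
    """
    out_edges = {}
    for s, label, t in edges:
        out_edges.setdefault(s, []).append((label, t))

    start = (source, start_state)
    parent = {start: None}  # doubles as the visited set
    queue = deque([start])
    answers = {}
    while queue:
        node, state = queue.popleft()
        # BFS order guarantees the first accepting visit is a shortest path.
        if state in accepting and node not in answers:
            answers[node] = rebuild_path(parent, (node, state))
        for label, nxt in out_edges.get(node, []):
            for next_state in transitions.get((state, label), ()):
                key = (nxt, next_state)
                if key not in parent:
                    parent[key] = ((node, state), label)
                    queue.append(key)
    return answers


# Toy graph and an automaton for the property path "knows+".
graph = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "knows", "a"),
    ("a", "likes", "d"),
]
knows_plus = {(0, "knows"): {1}, (1, "knows"): {1}}
paths = shortest_matching_paths(graph, knows_plus, 0, {1}, "a")
```

In this toy example `paths["c"]` holds the two `knows` edges leading from `a` to `c`, which is exactly the information a plain SPARQL property path query would not give back.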
Others
An interesting resource was KGHeartBeat, a tool that monitors online KGs (through their SPARQL endpoints) and provides analyses of different quality metrics. Perhaps it is a resource we can extend and use for LDES and Solid resources too.
Concluding remarks
Although the conference was not as good as expected, networking and meeting with other researchers was still very valuable. I believe that our research will continue to be appreciated and welcomed, especially now with the overload of LLM-related work.