Revisiting Inferential Benchmarks for Knowledge Graph Completion

Shuwen Liu; Bernardo Cuenca Grau; Ian Horrocks; Egor V. Kostylev

doi:10.24963/kr.2023/45

KR2023

Proceedings of the 20th International Conference on Principles of Knowledge Representation and Reasoning

Rhodes, Greece. September 2-8, 2023.

ISSN: 2334-1033
ISBN: 978-1-956792-02-7

Shuwen Liu (University of Oxford)
Bernardo Cuenca Grau (University of Oxford)
Ian Horrocks (University of Oxford)
Egor V. Kostylev (University of Oslo)

https://doi.org/10.24963/kr.2023/45

Keywords

Benchmarks for KR systems
Development, deployment, and evaluation of KR systems to solve real-world problems
Applications that combine KR with machine learning
Applications of KR in semantic web, knowledge graphs

Abstract

Knowledge Graph (KG) completion is the problem of extending an incomplete KG with missing facts. A key feature of machine learning approaches to KG completion is their ability to learn inference patterns, so that the predicted facts are the results of applying these patterns to the KG. Standard completion benchmarks, however, are not well suited for evaluating models' abilities to learn patterns, because the training and test sets of these benchmarks are a random split of a given KG and hence do not capture the causality of inference patterns. We propose a novel approach for designing KG completion benchmarks based on the following principles: there is a set of logical rules such that the missing facts are the results of the rules' application; the training set includes both premises matching rule antecedents and the corresponding conclusions; the test set consists of the results of applying the rules to the training set; the negative examples are designed to discourage the models from learning rules not entailed by the rule set. We use our methodology to generate several benchmarks and evaluate a wide range of existing KG completion systems. Our results provide novel insights into the ability of existing models to induce inference patterns from incomplete KGs.
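
To make the four design principles concrete, the following is a minimal toy sketch in Python, not the authors' benchmark generator. Everything in it is an assumption made for illustration: the single rule, the relation and entity names, the helper functions `match`, `apply_rules`, `closure`, and `tail_corruptions`, the choice to apply rules to a fixpoint, and the tail-corruption scheme used to produce negative examples. The abstract does not specify any of these details.

```python
# Toy sketch of a rule-based KG completion benchmark (illustrative assumptions only).
# A fact is a triple (head, relation, tail). A rule is (body, head_atom), where each
# atom is (variable, relation, variable) and head variables also occur in the body.

def match(body, facts, subst=None):
    """Yield every variable assignment under which all body atoms occur in facts."""
    subst = subst or {}
    if not body:
        yield subst
        return
    (x, rel, y), rest = body[0], body[1:]
    for h, r, t in facts:
        if r != rel:
            continue
        new = dict(subst)
        # Bind x and y, rejecting the fact if it conflicts with earlier bindings.
        if new.setdefault(x, h) != h or new.setdefault(y, t) != t:
            continue
        yield from match(rest, facts, new)

def apply_rules(rules, facts):
    """Return facts derivable in one round of rule application and not yet present."""
    derived = set()
    for body, (x, rel, y) in rules:
        for s in match(body, facts):
            derived.add((s[x], rel, s[y]))
    return derived - set(facts)

def closure(rules, facts):
    """Apply the rules until no new fact is produced (assumed fixpoint semantics)."""
    facts = set(facts)
    while True:
        new = apply_rules(rules, facts)
        if not new:
            return facts
        facts |= new

def tail_corruptions(positives, entailed):
    """Hypothetical negative sampler: replace tails, keep only non-entailed facts."""
    entities = sorted({e for h, _, t in entailed for e in (h, t)})
    return {(h, r, e) for h, r, _ in positives for e in entities
            if (h, r, e) not in entailed}

# Principle 1: a rule set. Here: parent_of(x, y) and parent_of(y, z) -> grandparent_of(x, z).
rules = [([("x", "parent_of", "y"), ("y", "parent_of", "z")],
          ("x", "grandparent_of", "z"))]

# Principle 2: training data holds premises and, for some of them, the conclusion.
train = {
    ("a", "parent_of", "b"), ("b", "parent_of", "c"),
    ("a", "grandparent_of", "c"),                      # conclusion shown in training
    ("d", "parent_of", "e"), ("e", "parent_of", "f"),  # premises only
}

# Principle 3: test positives are the rule consequences missing from training.
entailed = closure(rules, train)
test_positives = entailed - train

# Principle 4 (simplified): negatives are facts that the rule set does not entail.
test_negatives = tail_corruptions(test_positives, entailed)

print(sorted(test_positives))
print(sorted(test_negatives))
```

In this toy setting, a model that has picked up the grandparent pattern from the first family should rank `("d", "grandparent_of", "f")` above the corrupted candidates, whereas a random split of the same KG would give no guarantee that the premises of a test fact appear in training.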