KR 2025: Proceedings of the 22nd International Conference on Principles of Knowledge Representation and Reasoning

Melbourne, Australia. November 11-17, 2025.

ISSN: 2334-1033
ISBN: 978-1-956792-08-9

Copyright © 2025 International Joint Conferences on Artificial Intelligence Organization

Advances in Logic-Based Entity Resolution: Enhancing ASPEN with Local Merges and Optimality Criteria

  1. Zhiliang Xiang (Cardiff University)
  2. Meghyn Bienvenu (LaBRI - CNRS & University of Bordeaux)
  3. Gianluca Cima (Sapienza University of Rome)
  4. Víctor Gutiérrez-Basulto (Cardiff University)
  5. Yazmín Ibáñez-García (Cardiff University)

Keywords

  1. Reasoning System Implementation
  2. Entity Resolution
  3. Data Quality
  4. Answer Set Programming
  5. Databases

Abstract

In this paper, we present ASPEN+, which extends an existing ASP-based system, ASPEN, for collective entity resolution with two important functionalities: support for local merges and new optimality criteria for preferred solutions. Indeed, ASPEN only supports so-called global merges of entity-referring constants (e.g. author ids), in which all occurrences of matched constants are treated as equivalent and merged accordingly. However, it has been argued that when resolving data values, local merges are often more appropriate, as, for example, some instances of ‘J. Lee’ may refer to ‘Joy Lee’, while others should be matched with ‘Jake Lee’. In addition to allowing such local merges, ASPEN+ offers new optimality criteria for selecting solutions, such as minimising rule violations or maximising the number of rules supporting a merge. Our main contributions are thus (1) the formalisation and computational analysis of various notions of optimal solution, and (2) an extensive experimental evaluation on real-world datasets, demonstrating the effect of local merges and the new optimality criteria on both accuracy and runtime.
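
The distinction between global and local merges drawn in the abstract can be illustrated with a small Python sketch. This is not ASPEN+ code (the system is ASP-based), and it is not taken from the paper. The toy records, the function names `global_merge` and `local_merge`, and the choice to identify an occurrence by a (paper id, value) pair are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Toy data invented for illustration: (paper_id, author_name) pairs.
Record = Tuple[str, str]

records: List[Record] = [
    ("p1", "J. Lee"),
    ("p2", "J. Lee"),
    ("p3", "Joy Lee"),
    ("p4", "Jake Lee"),
]


def global_merge(rows: List[Record], matches: Dict[str, str]) -> List[Record]:
    """Rewrite every occurrence of a matched value, wherever it appears."""
    return [(pid, matches.get(name, name)) for pid, name in rows]


def local_merge(
    rows: List[Record], cell_matches: Dict[Tuple[str, str], str]
) -> List[Record]:
    """Rewrite a value only in the specific occurrences that were matched.

    An occurrence is identified here by (paper_id, value); this keying is a
    hypothetical simplification, not the paper's formal definition.
    """
    return [(pid, cell_matches.get((pid, name), name)) for pid, name in rows]


# Global: all occurrences of 'J. Lee' collapse into the same value.
global_result = global_merge(records, {"J. Lee": "Joy Lee"})

# Local: each occurrence of 'J. Lee' can be resolved differently.
local_result = local_merge(
    records,
    {("p1", "J. Lee"): "Joy Lee", ("p2", "J. Lee"): "Jake Lee"},
)

print(global_result)
print(local_result)
```

Under the global merge, both occurrences of ‘J. Lee’ are forced to the same value, whereas the local merge lets the occurrence in `p1` resolve to ‘Joy Lee’ and the one in `p2` to ‘Jake Lee’.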