Melbourne, Australia. November 11-17, 2025.
ISSN: 2334-1033
ISBN: 978-1-956792-08-9
Copyright © 2025 International Joint Conferences on Artificial Intelligence Organization
Safe Reinforcement Learning focuses on developing optimal
policies while ensuring safety. A popular method to address
such task is shielding, in which a correct-by-construction
safety component is synthetised from logical specifications.
Recently, shield synthesis has been extended to
infinite-state
domains, such as continuous environments. This makes
shielding more applicable to realistic scenarios. However,
often shields might be unrealizable because the specification
is inconsistent. In order to address this gap, we present a
method to obtain simple unconditional and conditional explanations
that witness unrealizability, which goes by temporal
formula unrolling: bounded strategy search. In this paper,
we
show different variants of the technique as well as its
applicability