KR2021Proceedings of the 18th International Conference on Principles of Knowledge Representation and ReasoningProceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning

Online event. November 3-12, 2021.

Edited by

ISSN: 2334-1033
ISBN: 978-1-956792-99-7

Sponsored by
Published by

Copyright © 2021 International Joint Conferences on Artificial Intelligence Organization

Boolean Network Learning in Vector Spaces for Genome-wide Network Analysis

  1. Taisuke Sato(National Institute of Informatics (NII))
  2. Ryosuke Kojima(Kyoto University)

Keywords

  1. Interplay between logic & neural and other learning paradigms
  2. KR and machine learning, inductive logic programming, knowledge acquisition
  3. Machine learning for efficient knowledge inference
  4. Applications of KR in bioinformatics

Abstract

Boolean networks (BNs) are one of the standard tools for modeling gene

regulatory networks in biology but their learning has been limited to

small networks due to computational difficulty. Aiming at

unprecedented scalability, we focus on a subclass of BNs called AND/OR

Boolean networks where Boolean formulas are restricted to a

conjunction or a disjunction of literals. We represent an AND/OR BN

with N nodes by an N x 2N binary matrix Q paired with

an N dimensional integer vector theta called a threshold vector,

a state of the BN by an N dimensional binary state vector s

and a state transition by matrix operations on Q, theta and s.

Given a list of state transitions S = s_0...s_L,

we learn Q and theta in a continuous space by

minimizing a cost function J(Q*,theta,S) w.r.t.

a real number matrix Q* and theta while thresholding

Q* into a binary matrix Q using theta so that

Q represents an AND/OR BN realizing the target state

transitions S.

We conducted experiments with artificial and real data sets to check

scalability and accuracy of our learning algorithm. First we randomly

generated AND/OR BNs up to N=5,000 nodes and empirically confirmed

O(N^2) learning time behavior using them. We also observed 99.8%

bit-by-bit prediction accuracy (prediction accuracy = 1 - test error)

with state transition data generated by AND/OR BNs. For real data,

we learned genome-wide AND/OR BNs with 10,928 nodes for budding

yeast from transcription profiling data sets, each containing 10,928

mRNAs and 40 transitions and achieved for instance 84.3%

prediction accuracy and successfully extracted more than 6,000 small

AND/ORs whose average prediction accuracy reaches much higher 94.9%.