Online event. November 3-12, 2021.
Copyright © 2021 International Joint Conferences on Artificial Intelligence Organization
Boolean networks (BNs) are one of the standard tools for modeling gene
regulatory networks in biology but their learning has been limited to
small networks due to computational difficulty. Aiming at
unprecedented scalability, we focus on a subclass of BNs called AND/OR
Boolean networks where Boolean formulas are restricted to a
conjunction or a disjunction of literals. We represent an AND/OR BN
with N nodes by an N x 2N binary matrix Q paired with
an N dimensional integer vector theta called a threshold vector,
a state of the BN by an N dimensional binary state vector s
and a state transition by matrix operations on Q, theta and s.
Given a list of state transitions S = s_0...s_L,
we learn Q and theta in a continuous space by
minimizing a cost function J(Q*,theta,S) w.r.t.
a real number matrix Q* and theta while thresholding
Q* into a binary matrix Q using theta so that
Q represents an AND/OR BN realizing the target state
We conducted experiments with artificial and real data sets to check
scalability and accuracy of our learning algorithm. First we randomly
generated AND/OR BNs up to N=5,000 nodes and empirically confirmed
O(N^2) learning time behavior using them. We also observed 99.8%
bit-by-bit prediction accuracy (prediction accuracy = 1 - test error)
with state transition data generated by AND/OR BNs. For real data,
we learned genome-wide AND/OR BNs with 10,928 nodes for budding
yeast from transcription profiling data sets, each containing 10,928
mRNAs and 40 transitions and achieved for instance 84.3%
prediction accuracy and successfully extracted more than 6,000 small
AND/ORs whose average prediction accuracy reaches much higher 94.9%.