Scalable algorithms for systems biology
Optimization models have been used to predict cellular behavior, for over 25 years in the case of cell metabolism.
The constraints in these models are reconstructed from genome annotations,
measured macromolecular composition, and phenotypes measured under different conditions.
The cellular goal, which serves as the model's objective function, can be challenging to define for many organisms, including
human tissue, microbial pathogens, and cancer cells.
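
As a minimal sketch of the kind of optimization model described above, the example below solves a toy flux balance problem as a linear program. The three-reaction network, the flux bounds, and the choice of biomass flux as the cellular goal are all hypothetical and chosen only for illustration; they are not taken from any real reconstruction.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network: uptake -> A, A -> B, B -> biomass.
# Rows are metabolites (A, B); columns are reactions
# (uptake, conversion, biomass).
S = np.array([
    [1.0, -1.0,  0.0],  # A: produced by uptake, consumed by conversion
    [0.0,  1.0, -1.0],  # B: produced by conversion, consumed by biomass
])

# Flux bounds (arbitrary units): uptake is capped at 10,
# the other two reactions are only loosely bounded.
bounds = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0)]

# Assumed cellular goal: maximize biomass flux.
# linprog minimizes, so the objective coefficient is negated.
c = np.array([0.0, 0.0, -1.0])

# Steady-state mass balance S v = 0 is the equality constraint.
result = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                 bounds=bounds, method="highs")

fluxes = result.x
biomass_flux = -result.fun
print("fluxes:", fluxes, "biomass flux:", biomass_flux)
```

In this toy case the uptake cap limits the biomass flux to 10. Changing the vector `c` changes the assumed goal, which is the quantity that is hard to specify for the organisms listed above.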
A promising approach is to estimate these goals directly from
omics measurements, given a starting metabolic reconstruction.
A particularly flexible method is to estimate new linear constraints
that model unknown biochemical reactions constraining the cell's operation.
However, this approach requires solving a nonconvex optimization problem,
which may not scale to large models.
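
One way to write such an estimation problem, offered as an illustrative sketch rather than the exact formulation used in this work, is as a bilevel program. All symbols here are assumptions introduced for the example: $S$ is the stoichiometric matrix, $c$ the objective vector, $l^k$ and $u^k$ the flux bounds in condition $k$, $\hat{v}^k$ the measured fluxes in that condition, and $(a, b)$ the coefficients of one new linear constraint to be estimated.

```latex
\begin{aligned}
\min_{a,\, b,\, \{v^k\}} \quad
  & \sum_{k=1}^{K} \left\lVert v^k - \hat{v}^k \right\rVert_2^2 \\
\text{s.t.} \quad
  & v^k \in \arg\max_{v} \left\{ c^\top v \;:\; S v = 0,\ \
    l^k \le v \le u^k,\ \ a^\top v \le b \right\},
  \quad k = 1, \dots, K.
\end{aligned}
```

In a formulation of this kind, replacing the inner problem with its optimality conditions produces products between the unknown coefficients $a$ and the fluxes and dual variables. These bilinear terms are one source of nonconvexity.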
To tackle this challenge, we develop scalable algorithms using distributed computing
on CPUs and GPUs.
Our algorithms thus learn new models from high-throughput data sets,
leading to increasingly accurate prediction of cellular behavior
under conditions that were previously difficult to model.
Researchers will have ample opportunity to deploy machine learning
and distributed algorithms on big biological data sets.
These algorithms can improve the accuracy of model predictions
or help explain biological mechanisms by constructing explainable models from data.
Our lab aims to:
- develop scalable algorithms to estimate model parameters from multi-omics data
  (i.e., data sets that combine measurements from multiple omics technologies)
- learn models of metabolism and protein expression from multi-omics data,
  including microbial community models and host-pathogen models
Related manuscripts: