New In Silico Approaches for Metabolic Engineering

 

(Chapter 2, 3, ... refer to the chapters of my thesis.)

This project focuses on in silico methods for metabolic engineering. Metabolic engineering
studies how to manipulate metabolism toward a given goal, e.g. increasing the
cells' production of a certain substance by altering genetic and regulatory processes. It is
widely applied in drug discovery, the food industry and cosmetics.

One metabolic engineering method for steering the metabolism toward a desired state, e.g.
increasing the production of a certain compound, is to use chemical compounds (i.e.
drugs) to inhibit a set of enzymes. When an enzyme is inhibited, it cannot catalyze the reactions
it is responsible for. As a result, the production of some compounds may increase while that of
others may decrease. The enzymatic target identification problem is to identify the set of enzymes
whose knockout brings the system as close as possible to the goal. If we consider the steady state,
there are three models for this problem. The Boolean model is the simplest one. We develop a
scalable iterative method for the problem under the Boolean model. The linear model is more
realistic than the Boolean model and is well suited to flux distribution analysis. We prove that
finding an enzyme knockout strategy with the OptKnock framework is NP-hard and present methods
that take into account multiple enzymes associated with the same reaction. Non-linear models can
simulate the whole cell system and describe the biological system more realistically. We design
algorithms for enzymatic target identification that apply to the non-linear model. If we instead
consider the transient state of the system, we present a pattern distance to compare two transient
states. We will design algorithms for this setting in the next few months before the defense.

  • Enzymatic target identification by Boolean network models
    • Optimal approaches, Related paper: Padmavati Sridhar, Bin Song, Tamer Kahveci and Sanjay Ranka, Mining metabolic networks for optimal drug targets, Pacific Symposium on Biocomputing (PSB), 13: 291-302, 2008 (abstract) (pdf)
    • Heuristic approaches, Related paper: Bin Song, Padmavati Sridhar, Tamer Kahveci and Sanjay Ranka, Double Iterative Optimization for Metabolic Network-Based Drug Target Identification, International Journal of Data Mining and Bioinformatics, 3(2):145-159, 2009 (abstract).
  • Enzymatic target identification using a linear model when multiple enzymes catalyze the same reaction
    • Related paper: Bin Song, I. Esra Buyuktahtakin, Sanjay Ranka, Tamer Kahveci, A linear programming framework to identify enzyme knockout strategies for multiple enzymes catalyze the same reaction, ready to submit.
  • Enzymatic target identification using a non-linear model by manipulating the steady state
    • Web page
    • Related paper: Bin Song, I. Esra Buyuktahtakin, Sanjay Ranka, Tamer Kahveci, Manipulating the steady state of metabolic pathways, IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE TCBB), accepted for publication.
  • Enzymatic target identification by dynamic state similarity analysis
    • This is an ongoing project. Please see the progress on the web page.
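
To make the Boolean formulation of enzymatic target identification concrete, here is a minimal sketch on a toy network. It is not the optimal or iterative method from the papers listed above: it simply tries every enzyme subset, which is only feasible for very small networks. The network, the names `REACTIONS`, `SEEDS`, `present_compounds` and `best_knockout`, and the damage measure (the number of non-target compounds that are no longer produced) are assumptions made for illustration.

```python
from itertools import combinations

# Hypothetical toy network: reaction name -> (enzymes, substrates, products).
# A reaction can fire if at least one of its enzymes is not inhibited
# and all of its substrates are present.
REACTIONS = {
    "R1": ({"E1"}, {"A"}, {"B"}),
    "R2": ({"E2"}, {"B"}, {"C"}),
    "R3": ({"E3"}, {"B"}, {"D"}),
    "R4": ({"E2", "E4"}, {"A"}, {"F"}),
}
SEEDS = {"A"}  # compounds supplied from outside the network (assumption)
ENZYMES = sorted(set().union(*(r[0] for r in REACTIONS.values())))


def present_compounds(inhibited):
    """Return the compounds present in the Boolean steady state."""
    present = set(SEEDS)
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for enzymes, substrates, products in REACTIONS.values():
            active = bool(enzymes - inhibited) and substrates <= present
            if active and not products <= present:
                present |= products
                changed = True
    return present


def best_knockout(targets):
    """Exhaustively find the enzyme set that removes all targets with least damage.

    Returns (enzyme_set, damage), or None if no enzyme set removes every target.
    Ties are broken in favour of the smaller enzyme set.
    """
    baseline = present_compounds(set())
    best = None
    for size in range(len(ENZYMES) + 1):
        for combo in combinations(ENZYMES, size):
            present = present_compounds(set(combo))
            if targets & present:
                continue  # some target compound is still produced
            damage = len(baseline - present - targets)
            if best is None or damage < best[1]:
                best = (set(combo), damage)
    return best


if __name__ == "__main__":
    print(best_knockout({"C"}))
```

In this toy network, inhibiting E1 also stops the production of B and D, whereas inhibiting E2 removes only the target C, because R4 can still be catalyzed by E4. The search therefore prefers E2.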

Once we identify which set of enzymes should be inhibited, the next step is to select chemical
compounds (i.e. drugs) to alter the activity of these enzymes. One popular compound
selection method is to screen, in silico, libraries of small compounds for their ability to bind to
biological targets such as receptors and enzymes. We develop two novel computational methods
that rank a given set of compounds for a given target protein or enzyme.

  • New compound selection approaches integrating structural properties of proteins
    and metabolic networks
    • Related paper: Bin Song, Tamer Kahveci, Sanjay Ranka, Shalesh Kaushal, and Syed M. Noorwez, Integrating structural properties of proteins and biological networks improves compound selection, ready to submit.
 

Finding distant structural similarities in protein databases

Structural similarities between distantly related proteins carry significant information in protein data sets. For example, they can reveal functional relationships that cannot be identified using sequence comparison. We provide an algorithm for computing the transformation that aligns one protein to another. Our experiments show that our method outperforms existing methods.

  • Related paper: Jayendra Venkateswaran, Bin Song, Tamer Kahveci, Christopher Jermaine, TRIAL: A Tool for Finding Distant Structural Similarities, IEEE/ACM Transactions on Computational Biology and Bioinformatics (IEEE TCBB), accepted for publication.
 

Domain Detection for protein sequence databases

Biologists frequently align multiple biological sequences to determine consensus sequences and/or search for predominant residues and conserved regions. Determining conserved regions in an alignment is a particularly important task. Since protein sequences are often several hundred residues or longer, it is difficult to distinguish biologically important conserved regions (motifs or domains) from others. Thus a computational tool that can accurately highlight biologically important regions is highly desirable.

  • Related paper: Bin Song, Jeong-Hyeon Choi, Guangyu Chen, et al, ARCS: an aggregated related column scoring scheme for aligned sequences, Bioinformatics, 1(22): 2326-2332, 2006 (pdf)
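
As a rough illustration of what highlighting conserved regions in an alignment can look like, the sketch below scores each column by its Shannon entropy and reports runs of low-entropy columns. This is a deliberately simple per-column baseline, not the ARCS scoring scheme from the paper above. The function names and the thresholds `max_entropy` and `min_length` are assumptions chosen for the example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column; gaps count as a symbol."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def conserved_regions(alignment, max_entropy=0.5, min_length=3):
    """Return (start, end) pairs, end exclusive, of runs of low-entropy columns."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    flags = [
        column_entropy([seq[i] for seq in alignment]) <= max_entropy
        for i in range(length)
    ]
    regions, start = [], None
    for i, ok in enumerate(flags + [False]):  # sentinel closes a trailing run
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    return regions


if __name__ == "__main__":
    toy_alignment = ["ACDEFGHK", "ACDEYGHR", "ACDEWGHN"]  # made-up sequences
    print(conserved_regions(toy_alignment))
```

On the made-up alignment, columns 0 to 3 are identical across all sequences and form the only run that reaches the default minimum length of three columns.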
 

Web server for maize kernel composition

Grain composition and yield are two important targets for improving food security and reducing the environmental impact of agriculture. Biologists collect large numbers of seed weights and near-infrared reflectance (NIR) spectra for individual maize kernels. We build a web server for analyzing complex data sets such as NIR spectra.

 
 
 
 
 
 

Copyright 2009 Bin Song All Rights Reserved