Drug discovery aims finding molecules that manipulate enzymes in
order to increase or decrease the production of desired compounds
while incurring minimum side-effects. An important part of this
problem is the identification of the target enzymes, i.e., the
enzymes that will be inhibited by the drug molecules. Finding the
right set of target enzymes is essential for developing a successful
drug. The relationship between enzymes and compounds through
reactions is defined using metabolic networks. Finding the best set
of target enzymes requires a careful analysis of the underlying
metabolic network.
This paper presents the problem of finding the set of enzymes, whose
inhibition stops the production of a given set of target compounds,
while eliminating minimal number of non-target compounds. Here, target
compounds are the compounds whose presence cause the underlying
disorder. The non-target compounds are all the remaining compounds.
We call this problem Target Identification by Enzymes (TIE).
An exhaustive evaluation of all possible enzyme combinations in the
metabolic network to find the optimal solution is computationally
infeasible for very large metabolic networks. We developed a scalable
iterative method which computes a sub-optimal solution within
reasonable time-bounds. The method consists of two phases: Iteration
and Fusion Phases. The Iteration Phase is based on the intuition that
a good solution can be found by tracing backward from the target
compounds. It initially evaluates the immediate precursors of the
target compounds and iteratively moves backwards to identify the
enzymes whose inhibition incurs less side-effects. This phase
converges to a sub-optimal solution after a small number of
iterations. The Fusion Phase takes the union of a set of sub-optimal
results found at the Iteration Phase. Each set, here, is a potential
solution. It then increases this set by inserting a small subset of
the remaining enzymes randomly. The size of the final set is bounded
by the time allowed for the exhaustive search. The Fusion Phase
exhaustively searches the final set to find the optimal subset of
enzymes from this set. It, then, recursively creates a new set by
inserting random enzymes to the optimal solution found so far and
exhaustively searches this set again until a predefined number of
iterations are performed.
The experiments on the E. coli metabolic network show that the
average accuracy of the Iteration Phase alone deviates from that of
the exhaustive search only by 0.02 %. The Iteration Phase is highly
scalable. It can solve the problem for the entire metabolic network of
\emph{Escherichia coli} in less than 10 seconds. The Fusion Phase
improves the accuracy of the Iteration Phase by 19.3 %.