Author(s):
Vallat, Brinda Kizhakke;
Pillardy, Jaroslaw;
Elber, Ron |
Source:
Proteins
Volume: 72
Issue: 3
Pages: 910-28
Published: 2008 Aug 15 |
Abstract:
The first step in homology modeling is to identify a
template protein for the target sequence. The template
structure is used in later phases of the calculation to
construct an atomically detailed model for the target.
We have built from the Protein Data Bank (PDB) a
large-scale learning set that includes tens of millions
of pair matches, each of which is either a true template
or a false one. Discriminatory learning (learning from
positive and negative examples) is used to train a
decision tree. Each branch of the tree is a mathematical
programming model. The decision tree is tested on an
independent set from PDB entries and on the sequences of
CASP7. It provides significant enrichment of true
templates (between 50 and 100%) when compared to
PSI-BLAST. The model is further verified by building
atomically detailed structures for each of the tentative
true templates with MODELLER. The probability that a
true match does not yield an acceptable structural model
(within 6 Å RMSD from the native structure) decays
linearly as a function of the TM structural-alignment
score. |
PubMed ID: 18300226 |