Author(s): Yu CNJ (Yu, Chun-Nam John)1, Joachims T (Joachims, Thorsten)1, Elber R (Elber, Ron)1, Pillardy J (Pillardy, Jaroslaw)2
|
Source: JOURNAL OF COMPUTATIONAL BIOLOGY Volume: 15 Issue: 7 Pages: 867-880 Published: SEP 2008
|
Abstract: Sequence to structure alignment is an important step in homology modeling of
protein structures. Incorporation of features such as secondary
structure, solvent accessibility, or evolutionary information improves
sequence to structure alignment accuracy, but conventional generative
estimation techniques for alignment models impose independence
assumptions that make these features difficult to include in a
principled way. In this paper, we overcome this problem using a Support
Vector Machine (SVM) method that provides a well-founded way of
estimating complex alignment models with hundreds of thousands of
parameters. Furthermore, we show that the method can be trained using a
variety of loss functions. In a rigorous empirical evaluation, the SVM
algorithm outperforms SSALN, a highly
accurate generative alignment model that incorporates structural
information. The alignment model learned by the SVM aligns 50% of the
residues correctly and aligns over 70% of the residues within a shift of
four positions. |
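
Note: the two accuracy figures quoted in the abstract (residues aligned correctly, and residues aligned within a shift of four positions) can be illustrated with a small sketch. This is not the authors' evaluation code. It assumes, for illustration only, that an alignment is stored as a dictionary mapping each aligned sequence residue index to a structure position index; the function name `shift_accuracy` and the toy alignments are hypothetical.

```python
def shift_accuracy(reference, predicted, max_shift=0):
    """Fraction of reference-aligned residues whose predicted partner
    lies within max_shift positions of the reference partner.

    Both arguments map a sequence residue index to a structure position
    index (an assumed representation, not taken from the paper).
    """
    if not reference:
        raise ValueError("reference alignment is empty")
    hits = 0
    for residue, ref_pos in reference.items():
        pred_pos = predicted.get(residue)  # None if the residue was left unaligned
        if pred_pos is not None and abs(pred_pos - ref_pos) <= max_shift:
            hits += 1
    return hits / len(reference)


# Toy data: four residues aligned in the reference alignment.
reference = {0: 0, 1: 1, 2: 2, 3: 3}
predicted = {0: 0, 1: 3, 2: 7, 3: 3}

exact = shift_accuracy(reference, predicted, max_shift=0)
within_four = shift_accuracy(reference, predicted, max_shift=4)
print(exact, within_four)
```

With `max_shift=0` only exactly matching residues count; with `max_shift=4` a residue placed up to four positions away from its reference partner also counts, which is why the second figure in the abstract is the larger one.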