Support vector training of protein alignment models
Author(s): Yu, Chun-Nam John (cnyu@cs.cornell.edu);
Joachims, Thorsten (tj@cs.cornell.edu); Elber, Ron (ron@cs.cornell.edu);
Pillardy, Jaroslaw (jarekp@tc.cornell.edu)
Editor(s): Speed, T; Huang, H
Source: Research in Computational Molecular Biology, Proceedings
Pages: 253-267
Published: 2007
Series: LECTURE NOTES IN COMPUTER SCIENCE: 4453
Abstract:
Sequence-to-structure alignment is an important
step in homology modeling of protein structures.
Incorporating features such as secondary structure,
solvent accessibility, or evolutionary information
improves sequence-to-structure alignment accuracy, but
conventional generative estimation techniques for
alignment models impose independence assumptions that
make these features difficult to include in a principled
way. In this paper, we overcome this problem using a
Support Vector Machine (SVM) method that provides a
well-founded way of estimating complex alignment models
with hundreds of thousands of parameters. Furthermore, we
show that the method can be trained using a variety of
loss functions. In a rigorous empirical evaluation, the
SVM algorithm outperforms SSALN, a highly accurate
generative alignment model that incorporates structural
information. The alignment model learned by the SVM
aligns 47% of the residues correctly and aligns over 70%
of the residues within a shift of 4 positions.
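The two accuracy figures reported in the abstract (residues aligned exactly, and residues aligned within a shift of 4 positions) can be illustrated with a small sketch. This is not the authors' evaluation code: the function name `shift_accuracy`, the dictionary representation of an alignment (query position mapped to target position), and the toy data are all assumptions made for illustration, and the paper's exact scoring definition may differ in detail.

```python
from typing import Dict


def shift_accuracy(reference: Dict[int, int],
                   predicted: Dict[int, int],
                   max_shift: int = 0) -> float:
    """Fraction of reference-aligned residues whose predicted partner
    lies within `max_shift` positions of the reference partner.

    Alignments are given as {query_position: target_position} for the
    aligned (non-gap) residues only. This representation is an
    assumption of the sketch, not taken from the paper.
    """
    if not reference:
        raise ValueError("reference alignment has no aligned residues")

    hits = 0
    for query_pos, ref_target in reference.items():
        pred_target = predicted.get(query_pos)  # None if predicted as a gap
        if pred_target is not None and abs(pred_target - ref_target) <= max_shift:
            hits += 1
    return hits / len(reference)


# Toy data (hypothetical, not from the paper).
reference = {0: 0, 1: 1, 2: 2, 3: 5, 4: 6}
predicted = {0: 0, 1: 2, 2: 3, 3: 5, 4: 12}

exact = shift_accuracy(reference, predicted)                  # max_shift = 0
within4 = shift_accuracy(reference, predicted, max_shift=4)   # shift of 4
print(f"exact: {exact:.2f}, within shift 4: {within4:.2f}")
```

With `max_shift=0` the function counts only exactly matching residue pairs; raising it to 4 also credits near misses, which is why the second figure in the abstract is higher than the first.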