Searching for strong structural protein similarities with EAST
Keywords:
bioinformatics, proteins, soft computing, data mining, protein structure comparison
Abstract
The exploration of protein conformation can be supported by similarity searching methods that look for 3D patterns in a database containing many molecular structures. We developed a novel search method called EAST (Energy Alignment Search Tool), which serves as a tool for finding strong structural similarities between proteins. In this respect it differs from other algorithms, which concentrate on fold similarities. We use EAST to find protein molecules that represent the same structural protein family and to inspect conformational modifications in their molecular structures caused by biochemical reactions or environmental influences. The similarity search is performed by comparing and aligning protein energy profiles. The energy profiles are computed in a process based on molecular mechanics theory. They are stored in a dedicated database (Energy Distribution Data Bank, EDB) and can later be used by the search engine to find similar fragments of protein structures at the energy level. To optimize the alignment path we use a modified, energy-adapted Smith-Waterman method, which is one of the main phases of EAST. The use of fuzzy techniques improves the fault tolerance of the presented method and makes it possible to measure the quality of the alignment. In the paper, we present the main idea of the EAST algorithm and briefly discuss its basic parameters. Finally, we give an example of using the system on proteins from the RAB family, which play an important role in intracellular reactions in living organisms.
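The alignment phase described above can be illustrated with a minimal sketch. It is not the EAST implementation: the abstract does not specify the scoring function, the gap penalties, or the form of the energy profiles. The sketch assumes that each profile is a plain list of per-residue energy values, that similarity is a hypothetical trapezoidal fuzzy membership of the energy difference (thresholds `full` and `zero` are made up for illustration), and that a linear gap penalty is used. The names `fuzzy_similarity` and `align_profiles` are illustrative only.

```python
def fuzzy_similarity(e1, e2, full=1.0, zero=5.0):
    """Hypothetical fuzzy membership for 'energies are similar'.

    Returns 1.0 when |e1 - e2| <= full, 0.0 when |e1 - e2| >= zero,
    and falls linearly in between. Thresholds are assumptions.
    """
    d = abs(e1 - e2)
    if d <= full:
        return 1.0
    if d >= zero:
        return 0.0
    return (zero - d) / (zero - full)


def align_profiles(a, b, gap=0.5, cutoff=0.5):
    """Smith-Waterman local alignment of two numeric energy profiles.

    Pairing a[i] with b[j] scores fuzzy_similarity - cutoff, so similar
    energies add to the score and dissimilar ones subtract from it.
    Returns (best_score, list of aligned index pairs).
    """
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)

    # Fill the dynamic programming matrix; scores never drop below zero.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = fuzzy_similarity(a[i - 1], b[j - 1]) - cutoff
            H[i][j] = max(0.0,
                          H[i - 1][j - 1] + s,   # pair a[i-1] with b[j-1]
                          H[i - 1][j] - gap,     # gap in b
                          H[i][j - 1] - gap)     # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell until the score reaches zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0:
        s = fuzzy_similarity(a[i - 1], b[j - 1]) - cutoff
        if H[i][j] == H[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs


if __name__ == "__main__":
    # Made-up energy values, for illustration only.
    profile_a = [-10.0, -3.0, -7.5, -2.0, -9.0]
    profile_b = [4.0, -3.2, -7.1, -2.4, 8.0]
    score, aligned = align_profiles(profile_a, profile_b)
    print(score, aligned)
```

Because the pairing score is a graded membership value rather than an exact match test, small differences between energy values still contribute positively to the alignment, which is one way the fault tolerance mentioned above could be obtained. The average membership over the aligned pairs could likewise serve as a rough quality measure, though this is an assumption rather than the measure used in the paper.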