An interval-based algorithm to represent conformational states of experimentally determined polypeptide templates and fast prediction of approximated 3D protein structures

Published Online: pp 462-486

Predicting the three-dimensional (3-D) structure of a protein that has no templates in the Protein Data Bank (PDB) remains a very hard, arguably still impossible, task. Computational prediction methods have been developed in recent years, but the problem remains challenging. In this paper we present a new strategy based on Interval Arithmetic to store structural information obtained from experimental protein templates and to predict native-like approximate three-dimensional structures of proteins. Our objective is to make the prediction very fast and to produce native-like structures that can serve as starting points for ab initio methods. We illustrate the efficacy of our method in five case studies of polypeptides.
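The abstract does not spell out how structural information is stored as intervals, so the following is only a generic sketch of interval arithmetic, not the paper's algorithm. It assumes, purely for illustration, that a conformational state is summarised by a backbone torsion angle in degrees observed in several templates. The `Interval` class, the `enclose` helper and the angle values are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] of real numbers (hypothetical helper)."""
    lo: float
    hi: float

    def __post_init__(self) -> None:
        if self.lo > self.hi:
            raise ValueError("lower bound exceeds upper bound")

    def __add__(self, other: "Interval") -> "Interval":
        # Standard interval addition: [a, b] + [c, d] = [a + c, b + d]
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def hull(self, other: "Interval") -> "Interval":
        # Smallest interval containing both operands
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def intersect(self, other: "Interval") -> Optional["Interval"]:
        # Common part of both intervals, or None if they are disjoint
        lo, hi = max(self.lo, other.lo), min(self.hi, other.hi)
        return Interval(lo, hi) if lo <= hi else None

    def midpoint(self) -> float:
        return (self.lo + self.hi) / 2

    def width(self) -> float:
        return self.hi - self.lo


def enclose(values: Iterable[float]) -> Interval:
    """Tightest interval enclosing a non-empty collection of values."""
    values = list(values)
    return Interval(min(values), max(values))


# Made-up torsion angles (degrees) for one residue across three templates.
# Assumption: no wrap-around at +/-180 degrees is handled here.
phi_observed = [-63.0, -57.0, -60.0]
phi_interval = enclose(phi_observed)
phi_estimate = phi_interval.midpoint()
```

Under these assumptions, an interval records the range of values seen in the templates, and its midpoint gives one simple point estimate. A real implementation would also have to handle the periodicity of angles.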

Keywords

structural bioinformatics, 3-D protein structure prediction, interval arithmetic, pattern recognition, data mining
