Predicting protein folding rates from amino acid sequences
Protein folding rates vary over more than 8 orders of magnitude. Plaxco, Simons, and Baker first showed that folding rate correlates with the topology of the native protein. That and subsequent studies showed that if the native structure of a protein is known, its folding rate can be predicted reasonably well from a correlation between the logarithm of the rate and the "localness" of the contacts in the protein. In the present work, we develop a related measure, the geometric contact number, N_alpha, which is the number of nonlocal contacts that are well packed according to a Voronoi criterion. First, we find that across 80 proteins, the largest such set yet studied, N_alpha is an excellent predictor of folding rates for both two-state fast folders and more complex multi-state folders. This result supports the view that folding occurs by a mechanism of zipping and assembly, in which shorter loops form faster than longer ones for entropic reasons. Second, we show that folding rates can also be predicted directly from amino acid sequences, without knowledge of the native topology.
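The kind of analysis described above can be illustrated with a minimal sketch. It is not the procedure used in this work: a simple distance cutoff stands in for the Voronoi packing criterion, and the cutoff of 8 angstroms, the minimum sequence separation of 4 residues, and the function names are all assumptions chosen for illustration. The first function counts nonlocal contacts from one coordinate per residue, and the second fits ln(k_f) = a + b*N by ordinary least squares.

```python
import math


def count_nonlocal_contacts(coords, cutoff=8.0, min_separation=4):
    """Count residue pairs at least `min_separation` apart in sequence
    whose coordinates lie within `cutoff` angstroms of each other.

    Assumption: a plain distance cutoff replaces the Voronoi criterion
    named in the text; both thresholds are illustrative values.
    """
    n_contacts = 0
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) <= cutoff:
                n_contacts += 1
    return n_contacts


def fit_log_rate(contact_numbers, rates):
    """Least-squares fit of ln(rate) = intercept + slope * contact_number.

    Returns (intercept, slope). Rates must be positive.
    """
    xs = list(contact_numbers)
    ys = [math.log(k) for k in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_rate(contact_number, intercept, slope):
    """Predicted folding rate for a given contact number."""
    return math.exp(intercept + slope * contact_number)
```

With real data, `contact_numbers` would hold one contact count per protein and `rates` the measured folding rates. A negative fitted slope would correspond to slower folding for proteins with more nonlocal contacts.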