Peer-reviewed article
Abstract: A method is developed to compute backbone tertiary folds from the amino acid sequence. In this method, the number of degrees of freedom is drastically reduced by neglecting side-chain flexibility, and by describing backbone conformations as combinations of only seven structural states. These are characterized by single values of the dihedral angles phi, psi and omega, representing allowed conformations of the isolated dipeptide. We show that this restrictive model is nonetheless capable of describing native backbones to within acceptable deviations. Using our backbone description, potentials of mean force are derived from a database of known protein structures, based on statistical influences of single residues and residue pairs on the conformational states in their vicinity along the chain. This yields the force-field component due to local interactions, which is then used to predict lowest-energy conformations from any given amino acid sequence. The prediction algorithm does not require searching conformational space and is therefore extremely fast. Another important asset of our method is that it is able to compute not only the minimum-energy conformation, but any number of lowest-energy structures, whose relative preferences can be determined from the corresponding computed energy values. The performance of our procedure is tested on short peptides that are likely to be stabilized by local interactions. These include several helical structures and a hexapeptide with a beta-bend conformation, corresponding to peptides shown to have relatively well-defined conformations in aqueous solution, and to protein segments believed to adopt their native conformation early during folding. In addition, several flexible peptides are analysed.
Except for the problems encountered in predicting observed disulphide bridges in two of the flexible peptides, and in a somewhat larger fragment comprising residues 30 to 51 of bovine trypsin inhibitor, prediction results compare very favourably with experimental data. Potential applications of our procedure to protein modelling and its extension to protein folding are discussed.
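One way to obtain several lowest-energy state sequences without enumerating conformational space is a k-best dynamic programme over the chain. The sketch below is illustrative only and is not taken from the paper: it assumes that the total energy is a sum of per-position terms `local[i][c]` (the energy of structural state `c` at position `i`, with any sequence dependence already folded in) and of a coupling `pair[a][b]` between adjacent states. The function name, the coupling term and the toy energy values are all hypothetical, and the example uses two states rather than seven for brevity.

```python
import heapq


def k_lowest_energy(local, pair, k):
    """Return the k lowest-energy state sequences as (energy, path) tuples.

    local[i][c] -- assumed energy of state c at chain position i
    pair[a][b]  -- assumed coupling energy between adjacent states a and b
    """
    n_states = len(local[0])
    # best[c] holds up to k (energy, path) tuples for prefixes ending in state c.
    best = [[(local[0][c], (c,))] for c in range(n_states)]
    for i in range(1, len(local)):
        new_best = []
        for c in range(n_states):
            # Extend every retained prefix by state c at position i.
            candidates = (
                (e + pair[p[-1]][c] + local[i][c], p + (c,))
                for prefixes in best
                for (e, p) in prefixes
            )
            new_best.append(heapq.nsmallest(k, candidates))
        best = new_best
    # Merge the per-state lists into the overall k best sequences.
    return heapq.nsmallest(k, (item for prefixes in best for item in prefixes))


# Toy input: 3 positions, 2 states, integer energies (hypothetical values).
local = [[0, 1], [2, 0], [1, 1]]
pair = [[0, 1], [1, 0]]
for energy, path in k_lowest_energy(local, pair, 3):
    print(energy, path)
```

The cost grows linearly with chain length (and with k and the square of the number of states), and the returned energies can be compared directly to rank the alternative structures.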