Peer-reviewed article
Abstract: Fourier-transform infrared spectroscopy is a method of choice for the experimental determination of protein secondary structure. Numerous approaches have been developed during the past 15 years. A critical parameter that has not been taken into account systematically is the selection of the wavenumbers used to build the mathematical models for structure prediction. The high quality of current Fourier-transform infrared spectrometers makes the absorbance at every single wavenumber a valid and almost noiseless piece of information. We address here the question of how much independent information the infrared spectra of proteins contain for the prediction of the different secondary structure contents. It appears that the absorbances at no more than three distinct frequencies of the spectrum contain all the nonredundant information that can be related to one secondary structure content. The ascending stepwise method proposed here identifies the relevance of each wavenumber of the infrared spectrum for the prediction of a given secondary structure and yields a particularly simple method for computing the secondary structure content. Using a 50-protein database built beforehand to contain as little fold redundancy as possible, the standard error of prediction in cross-validation is 5.5% for the alpha-helix, 6.6% for the beta-sheet, and 3.4% for the beta-turn.
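To illustrate the idea of an ascending stepwise selection of wavenumbers, here is a minimal sketch in Python. It is not the authors' implementation. It assumes a linear model with an intercept, leave-one-out cross-validation, and a standard error of prediction computed as the root mean square of the held-out residuals; the abstract specifies none of these details. The function names `loo_standard_error` and `ascending_stepwise` are hypothetical.

```python
import numpy as np


def loo_standard_error(X, y):
    """Leave-one-out RMS prediction error of a linear model with intercept.

    Assumption: the abstract does not state the cross-validation scheme;
    leave-one-out is used here purely for illustration.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # prepend an intercept column
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # hold out sample i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        errors[i] = y[i] - A[i] @ coef
    return float(np.sqrt(np.mean(errors ** 2)))


def ascending_stepwise(spectra, content, max_terms=3):
    """Greedy forward selection of spectrum columns (wavenumbers).

    spectra: array of shape (n_proteins, n_wavenumbers), absorbances
    content: array of shape (n_proteins,), one secondary structure content
    Returns the selected column indices and the cross-validated error
    after each addition. Stops early if no candidate lowers the error.
    """
    selected, history = [], []
    for _ in range(max_terms):
        best_j, best_err = None, np.inf
        for j in range(spectra.shape[1]):
            if j in selected:
                continue
            err = loo_standard_error(spectra[:, selected + [j]], content)
            if err < best_err:
                best_j, best_err = j, err
        if history and best_err >= history[-1]:
            break  # adding another wavenumber brings no improvement
        selected.append(best_j)
        history.append(best_err)
    return selected, history
```

Called as `ascending_stepwise(spectra, content, max_terms=3)`, the sketch returns at most three column indices, in line with the abstract's finding that three frequencies suffice for one secondary structure content. The early-stopping rule is a design choice of this sketch, not a criterion taken from the article.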