|Résumé :||Molecular Biology has allowed the characterization and manipulation of the molecules of life in the wet lab. Also the structures of those macromolecules are being continuously elucidated. During the last decades of the past century, there was an increasing interest to study how the different genes are organized into different organisms (‘genomes’) and how those genes are expressed into proteins to achieve their functions. Currently the sequences for many genes over several genomes have been determined. In parallel, the efforts to have the structure of the proteins coded by those genes go on. However it is experimentally much harder to obtain the structure of a protein, rather than just its sequence. For this reason, the number of protein structures available in databases is an order of magnitude or so lower than protein sequences. Furthermore, in order to understand how living organisms work at molecular level we need the information about the interaction of those proteins. Elucidating the structure of protein macromolecular assemblies is still more difficult. To that end, the use of computers to predict the structure of these complexes has gained interest over the last decades.
The main subject of this thesis is the evaluation of current available computational methods to predict protein – protein interactions and build an atomic model of the complex. The core of the thesis is the evaluation protocol I have developed at Service de Conformation des Macromolécules Biologiques et de Bioinformatique, Université Libre de Bruxelles, and its computer implementation. This method has been massively used to evaluate the results on blind protein – protein interaction prediction in the context of the world-wide experiment CAPRI, which have been thoroughly reviewed in several publications [1-3]. In this experiment the structure of a protein complex (‘the target’) had to be modeled starting from the coordinates of the isolated molecules, prior to the release of the structure of the complex (this is commonly referred as ‘docking’).
The assessment protocol let us compute some parameters to rank docking models according to their quality, into 3 main categories: ‘Highly Accurate’, ‘Medium Accurate’, ‘Acceptable’ and ‘Incorrect’. The efficiency of our evaluation and ranking is clearly shown, even for borderline cases between categories. The correlation of the ranking parameters is analyzed further. In the same section where the evaluation protocol is presented, the ranking participants give to their predictions is also studied, since often, good solutions are not easily recognized among the pool of computer generated decoys.
An overview of the CAPRI results made per target structure and per participant regarding the computational method they used and the difficulty of the complex. Also in CAPRI there is a new ongoing experiment about scoring previously and anonymously generated models by other participants (the ‘Scoring’ experiment). Its promising results are also analyzed, in respect of the original CAPRI experiment. The Scoring experiment was a step towards the use of combine methods to predict the structure of protein – protein complexes. We discuss here its possible application to predict the structure of protein complexes, from a clustering study on the different results.
In the last chapter of the thesis, I present the preliminary results of an ongoing study on the conformational changes in protein structures upon complexation, as those rearrangements pose serious limitations to current computational methods predicting the structure protein complexes. Protein structures are classified according to the magnitude of its conformational re-arrangement and the involvement of interfaces and particular secondary structure elements is discussed. At the end of the chapter, some guidelines and future work is proposed to complete the survey.