SOLart: a structure-based method to predict protein solubility and aggregation

Hou, Qingzhen; Kwasigroch, Jean-Marc; Rooman, Marianne; Pucci, Fabrizio

doi:10.1093/bioinformatics/btz773

Reference: Bioinformatics, 36(5), pages 1445-1452
Publication: Published, 2020-03-01

Peer-reviewed article

Abstract:

Motivation: The solubility of a protein is often decisive for its proper functioning. Lack of solubility is a major bottleneck in high-throughput structural genomic studies and in high-concentration protein production, and the formation of protein aggregates causes a wide variety of diseases. Since solubility measurements are time-consuming and expensive, there is a strong need for solubility prediction tools.

Results: We have recently introduced solubility-dependent distance potentials that are able to unravel the role of residue–residue interactions in promoting or decreasing protein solubility. Here, we extended their construction by defining solubility-dependent potentials based on backbone torsion angles and solvent accessibility, and integrated them, together with other structure- and sequence-based features, into a random forest model trained on a set of Escherichia coli proteins with experimental structures and solubility values. We thus obtained the SOLart protein solubility predictor, whose most informative features turned out to be folding free energy differences computed from our solubility-dependent statistical potentials. SOLart performs very well, with a Pearson correlation coefficient between experimental and predicted solubility values of almost 0.7 both in cross-validation on the training dataset and on an independent set of Saccharomyces cerevisiae proteins. On test sets of modeled structures, only a limited drop in performance is observed. SOLart can thus be used with both high-resolution and low-resolution structures, and clearly outperforms state-of-the-art solubility predictors. It is available through a user-friendly webserver that is easy for non-expert scientists to use.

Availability and implementation: The SOLart webserver is freely available at http://babylone.ulb.ac.be/SOLART/.

Supplementary information: Supplementary data are available at Bioinformatics online.
