by Piron, Anthony; Szymczak, Florian; De Oliveira Alvelos, Maria; Defrance, Matthieu; Lenaerts, Tom; Eizirik, Decio L.; Cnop, Miriam
Reference: bioRxiv
Publication: Published, 2022-09-02
Non-peer-reviewed article
Abstract: Motivation. High-throughput omics technologies have generated a wealth of large protein, gene and transcript datasets that have exacerbated the need for new methods to analyse and compare big datasets. Rank-rank hypergeometric overlap is an important threshold-free method to combine and visualize two ranked lists of P-values or fold-changes, usually from differential gene expression analyses. Here, we introduce a new rank-rank hypergeometric overlap-based method aimed at both gene-level analyses and alternative splicing analyses at transcript or exon level, which were hitherto out of reach because transcript numbers are an order of magnitude larger than gene numbers.
Results. We tested the tool on synthetic and real datasets at gene and transcript levels to detect correlation and anti-correlation patterns and found it to be fast and accurate, even on very large datasets, thanks to an evolutionary algorithm-based minimal P-value search. The tool comes with a ready-to-use permutation scheme allowing the computation of adjusted P-values at low time cost. Additionally, the package is a drop-in replacement for previous packages, as a compatibility mode is included, allowing older studies to be re-run with close to no change to existing pipelines. RedRibbon holds the promise to accurately extract detailed information from large analyses.
Availability. RNA-sequencing datasets are available through the Gene Expression Omnibus (GEO) portal with accession numbers GSE159984, GSE133218, GSE137136, GSE98485, GSE148058 and GSE108413. The C libraries and R package code are open to the community with a permissive licence (GPL3) and available for download from GitHub: https://github.com/antpiron/ale, https://github.com/antpiron/cRedRibbon and https://github.com/antpiron/RedRibbon.
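To illustrate the general idea behind a rank-rank hypergeometric overlap, the following minimal Python sketch computes a one-sided enrichment P-value for the overlap between the top i elements of one ranked list and the top j elements of another, then scans all (i, j) pairs for the smallest P-value. This is not the RedRibbon implementation or its API: the function names (`hypergeom_sf`, `overlap_pvalue`, `min_pvalue_grid`) are hypothetical, both lists are assumed to contain the same elements, and the exhaustive grid scan stands in for the evolutionary minimal P-value search mentioned in the abstract, which exists precisely because such a scan becomes too slow on transcript-scale lists.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N items, K marked, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / total


def overlap_pvalue(list_a, list_b, i, j):
    """Overlap size and enrichment P-value for top-i of list_a vs top-j of list_b.

    Assumption: both lists rank the same set of elements.
    """
    N = len(list_a)
    k = len(set(list_a[:i]) & set(list_b[:j]))
    return k, hypergeom_sf(k, N, i, j)


def min_pvalue_grid(list_a, list_b, step=1):
    """Exhaustive scan over (i, j); returns (minimal P-value, i, j).

    Illustrative only: cost grows quadratically with list length.
    """
    best = (1.0, 0, 0)
    n = len(list_a)
    for i in range(step, n + 1, step):
        for j in range(step, n + 1, step):
            _, p = overlap_pvalue(list_a, list_b, i, j)
            if p < best[0]:
                best = (p, i, j)
    return best


if __name__ == "__main__":
    genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
    print(overlap_pvalue(genes, genes, 3, 3))        # identical rankings
    print(overlap_pvalue(genes, genes[::-1], 3, 3))  # reversed rankings
    print(min_pvalue_grid(genes, genes))
```

Two identical rankings give a small P-value at matching cut-offs, while reversed rankings give no enrichment at the top of both lists. Detecting anti-correlation, as described in the abstract, would additionally require testing for depletion or comparing against a reversed list, which this sketch leaves out.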