by Kontos, Kevin; Bontempi, Gianluca
Reference: Communications in Computer and Information Science, 1, 13, pages 273-287
Publication: Published, 2008
Peer-reviewed article
Abstract: Gaussian graphical models are widely used to tackle the important and challenging problem of inferring genetic regulatory networks from expression data. These models have gained much attention as they encode full conditional relationships between variables, i.e. genes. Unfortunately, microarray data are characterized by a low number of samples compared to the number of genes. Hence, classical approaches to estimating the full joint distribution cannot be applied. Recently, limited-order partial correlation approaches have been proposed to circumvent this problem. It has been shown both theoretically and experimentally that such graphs provide accurate approximations of the full conditional independence structure between the variables, thanks to the sparsity of genetic networks. Alas, computing limited-order partial correlation coefficients for large networks, even for small order values, is computationally expensive and often intractable. Moreover, problems deriving from multiple statistical testing arise, and one should expect that most of the edges are removed. We propose a procedure to tackle both problems by reducing the dimensionality of the inference tasks. By adopting a screening procedure, we iteratively build nested graphs by discarding the less relevant edges. Moreover, by conditioning only on relevant variables, we diminish the problems related to multiple testing. This procedure allows us to infer limited-order partial correlation graphs faster and to consider higher order values, increasing the accuracy of the inferred graph. The effectiveness of the proposed procedure is demonstrated on simulated data. © Springer-Verlag Berlin Heidelberg 2008.
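The screening idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it stops at order 1, replaces statistical testing with a fixed, hypothetical `threshold`, and every function name (`pearson`, `first_order_partial`, `screened_graphs`) is an assumption introduced here for illustration. The only standard ingredient is the textbook first-order partial correlation formula. The order-0 graph keeps pairs with a large marginal correlation; the order-1 graph is nested inside it and conditions only on variables that are still neighbours of the pair in the order-0 graph.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def first_order_partial(r_ij, r_ik, r_jk):
    """Partial correlation of i and j given a single variable k.

    Undefined (division by zero) when r_ik or r_jk equals +1 or -1.
    """
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def screened_graphs(data, threshold):
    """Build nested order-0 and order-1 graphs (illustrative assumption).

    data: dict mapping a variable name to its list of samples.
    threshold: hypothetical cutoff standing in for a statistical test.
    Returns (g0, g1), two sets of frozenset edges with g1 a subset of g0.
    """
    names = sorted(data)
    r = {frozenset(p): pearson(data[p[0]], data[p[1]])
         for p in combinations(names, 2)}

    # Order 0: keep pairs whose marginal correlation is large enough.
    g0 = {e for e, v in r.items() if abs(v) >= threshold}

    neighbours = {v: set() for v in names}
    for e in g0:
        a, b = tuple(e)
        neighbours[a].add(b)
        neighbours[b].add(a)

    # Order 1: re-examine only surviving edges, conditioning only on
    # variables adjacent to either endpoint in the order-0 graph.
    g1 = set()
    for e in g0:
        i, j = tuple(e)
        candidates = (neighbours[i] | neighbours[j]) - {i, j}
        if all(abs(first_order_partial(r[e],
                                       r[frozenset((i, k))],
                                       r[frozenset((j, k))])) >= threshold
               for k in candidates):
            g1.add(e)
    return g0, g1
```

Restricting the conditioning set to order-0 neighbours is what reduces both the number of coefficients to compute and the number of tests performed; a faithful implementation would iterate this to higher orders and use proper significance tests rather than a fixed cutoff.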