by Meyer, Patrick E.; Bontempi, Gianluca
Reference: Biological Knowledge Discovery Handbook: Preprocessing, Mining and Postprocessing of Biological Data, Wiley, pages 399-419
Publication: Published, 2014-01
Part of a collective work
| Abstract: | This chapter introduces the curse of dimensionality and focuses on widely used variable exploration strategies. It also introduces the information-theoretic framework and recalls variable selection techniques that have been proposed in the literature. It then discusses estimation techniques that can be used to implement the selection strategies on the basis of observed data. The three sequential heuristic searches introduced (forward, backward, and bidirectional selection) and the two mutual information estimators (empirical and Gaussian) share a low computational cost and a growing body of good empirical results. Most of the selection criteria presented use combinations of only bi- and trivariate probability distributions in order to reduce the effect of the curse of dimensionality. Finally, the chapter introduces the notions of relevance, redundancy, and synergy in order to explain and compare each method’s ability to combine those bi- and trivariate distributions efficiently. |
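As an illustration of the ideas summarized in the abstract, the following minimal sketch combines a forward search with a Gaussian mutual information estimator. It is not taken from the chapter. The scoring rule (relevance to the target minus average redundancy with the variables already selected) is an assumed example of a criterion built from bivariate quantities only, and the names `pearson`, `gaussian_mi`, `forward_selection`, and `demo` are hypothetical. Synergy, which requires trivariate terms, is not modeled here.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(x, y):
    """Mutual information in nats, assuming (x, y) is bivariate Gaussian:
    I(X;Y) = -0.5 * log(1 - rho^2)."""
    rho2 = min(pearson(x, y) ** 2, 1.0 - 1e-12)  # clip to keep the log finite
    return -0.5 * math.log(1.0 - rho2)


def forward_selection(columns, y, k):
    """Greedy forward search: start from the empty set and add, at each step,
    the variable with the highest score (assumed criterion: relevance minus
    average redundancy with the variables selected so far)."""
    relevance = [gaussian_mi(col, y) for col in columns]
    selected, remaining = [], list(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                gaussian_mi(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def demo():
    """Synthetic data: x1 nearly duplicates x0, x2 is relevant, x3 is noise."""
    rng = random.Random(0)
    n = 500
    x0 = [rng.gauss(0, 1) for _ in range(n)]
    x1 = [a + 0.05 * rng.gauss(0, 1) for a in x0]
    x2 = [rng.gauss(0, 1) for _ in range(n)]
    x3 = [rng.gauss(0, 1) for _ in range(n)]
    y = [a + b + 0.1 * rng.gauss(0, 1) for a, b in zip(x0, x2)]
    return forward_selection([x0, x1, x2, x3], y, 2)


if __name__ == "__main__":
    print(demo())
```

On this synthetic data the search should keep `x2` together with only one of the near-duplicates `x0` and `x1`, since the redundancy term penalizes adding the second copy.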



