Peer-reviewed article
Abstract: This paper proposes a technique for speeding up the search for the optimal feature subset in classification problems where the input variables are discrete or nominal. The approach relies on an upper bound on the mutual information between the target and a set of d input variables, derived as a function of the mutual information of its subsets of cardinality d - 1. The rationale of the algorithm is to evaluate the mutual information of a subset only if its upper bound is sufficiently promising. Computing the upper bound can thus be seen as a pre-estimation of a subset. We show that pre-estimation makes it possible to explore a much larger number of input combinations than the classical forward selection algorithm while preserving the same computational complexity. Preliminary results showing the effectiveness of the proposed technique relative to classical forward search are reported.
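To make the pre-estimation idea concrete, here is a minimal Python sketch of a forward search that evaluates a candidate subset only when a cheap upper bound looks promising. The abstract does not state the paper's bound, so the sketch substitutes a simple stand-in that is always valid: I(X_S, X_j; Y) <= I(X_S; Y) + H(X_j). The function names, the column-wise data layout, and the pruning rule are all assumptions made for illustration, not the authors' algorithm.

```python
from collections import Counter
from math import log2


def entropy(columns):
    """Plug-in entropy (in bits) of the joint distribution of the given columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(c / n * log2(c / n) for c in Counter(rows).values())


def mutual_information(X, y, subset):
    """Plug-in estimate of I(X_subset; y) = H(X_subset) + H(y) - H(X_subset, y)."""
    cols = [X[j] for j in subset]
    return entropy(cols) + entropy([y]) - entropy(cols + [y])


def forward_select_with_preestimate(X, y, k):
    """Greedy forward selection of k features with bound-based skipping.

    X is a list of discrete feature columns, y the target column.
    Stand-in bound (an assumption, NOT the paper's bound):
        I(X_S, X_j; y) <= I(X_S; y) + H(X_j)
    Returns (selected indices, MI of the selected set, number of MI evaluations).
    """
    k = min(k, len(X))
    h = [entropy([col]) for col in X]  # per-feature entropies, computed once
    selected, mi_selected, evaluations = [], 0.0, 0
    for _ in range(k):
        candidates = [j for j in range(len(X)) if j not in selected]
        # Visit the candidates with the most promising bound first.
        candidates.sort(key=lambda j: mi_selected + h[j], reverse=True)
        best_j, best_mi = None, -1.0
        for j in candidates:
            bound = mi_selected + h[j]
            if bound <= best_mi:
                break  # no remaining candidate can beat the current best
            mi = mutual_information(X, y, selected + [j])
            evaluations += 1
            if mi > best_mi:
                best_j, best_mi = j, mi
        selected.append(best_j)
        mi_selected = best_mi
    return selected, mi_selected, evaluations
```

Because the stand-in bound is a true upper bound, this sketch selects a subset with the same mutual information as plain forward selection while skipping some evaluations. The abstract describes a different use of the saved effort: exploring more input combinations at the same computational cost. The sketch does not attempt to reproduce that.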