Peer-reviewed article
Abstract: Recent classifier combination frameworks have proposed several ways of weakening a learning set and have shown that these weakening methods improve prediction accuracy. In the present paper we focus on learning set sampling (Breiman's Bagging) and random feature subset selection (Bay's Multiple Feature Subsets, MFS). We present a combination scheme labeled 'Bagfs', in which new learning sets are generated from both bootstrap replicates and selected feature subsets. The performance of the three methods (Bagging, MFS and Bagfs) is assessed by means of a decision-tree inducer (C4.5) and a majority voting rule. In addition, we study whether the way in which the weak classifiers are created has a significant influence on the performance of their combination. To answer this question, we applied the Cochran Q test, which compares the three weakening methods together on a given database and indicates whether or not they differ significantly. We also used the McNemar test to compare the algorithms pair by pair. The first results, obtained on 14 conventional databases, show that, on average, Bagfs exhibits the best agreement between prediction and supervision. The Cochran Q test indicated that the weak classifiers so created significantly influenced combination performance on at least 4 of the 14 databases analyzed. © Springer-Verlag Berlin Heidelberg 2000.
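To make the Bagfs scheme concrete, the following minimal sketch trains each base learner on a bootstrap replicate of the learning set restricted to a random feature subset, then combines the trees by majority vote. The parameter choices (`n_estimators`, `subset_size`) and the use of scikit-learn's CART trees as a stand-in for C4.5 are our assumptions for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def bagfs_fit(X, y, n_estimators=25, subset_size=None, seed=0):
    """Train an ensemble combining bagging with random feature subsets."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    subset_size = subset_size or max(1, d // 2)  # assumed default, not from the paper
    ensemble = []
    for _ in range(n_estimators):
        rows = rng.integers(0, n, size=n)                      # bootstrap replicate
        cols = rng.choice(d, size=subset_size, replace=False)  # random feature subset
        tree = DecisionTreeClassifier().fit(X[rows][:, cols], y[rows])
        ensemble.append((tree, cols))
    return ensemble

def bagfs_predict(ensemble, X):
    """Combine the individual trees' predictions by a majority voting rule."""
    votes = np.stack([tree.predict(X[:, cols]) for tree, cols in ensemble])
    return np.apply_along_axis(
        lambda v: np.bincount(v.astype(int)).argmax(), 0, votes)
```

Setting `subset_size = d` recovers plain Bagging, and replacing the bootstrap rows with the full learning set recovers MFS, which is what makes the three methods directly comparable.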
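The Cochran Q test used to compare the three weakening methods operates on a binary success matrix, where entry (i, j) is 1 if classifier j predicts instance i correctly. Under the null hypothesis that all k classifiers have the same error rate, Q follows a chi-square distribution with k - 1 degrees of freedom. The sketch below is a generic textbook implementation of the test, not code from the paper.

```python
import numpy as np
from scipy.stats import chi2

def cochran_q(correct):
    """Cochran Q test on an n x k binary matrix of per-instance successes."""
    correct = np.asarray(correct)
    n, k = correct.shape
    col = correct.sum(axis=0)   # successes per classifier
    row = correct.sum(axis=1)   # classifiers correct per instance
    T = correct.sum()           # total number of successes
    # Q = (k-1) * [k * sum_j G_j^2 - T^2] / [k*T - sum_i L_i^2]
    q = (k - 1) * (k * (col ** 2).sum() - T ** 2) / (k * T - (row ** 2).sum())
    p_value = chi2.sf(q, k - 1)
    return q, p_value
```

Instances on which all classifiers agree contribute nothing to Q, so the test is driven entirely by the cases where the weakening methods disagree; the pairwise McNemar test mentioned above similarly relies only on the off-diagonal counts of the 2x2 disagreement table for two classifiers.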