Abstract: In this study, we use an unsupervised clustering algorithm, Vector Quantization Principal Component Analysis (VQPCA), to characterize Moderate and Intense Low-oxygen Dilution (MILD) combustion. Two direct numerical simulation (DNS) datasets of non-premixed MILD combustion are used; they differ primarily in their oxygen (O2) dilution levels. The maximum volume fraction of O2 is 3.5% for Case-AZ1 (lower dilution) and 2.0% for Case-BZ1 (higher dilution). DNS grid points with thermo-chemical information are fed to VQPCA, which partitions them into a few clusters. Qualitative and quantitative comparisons between the VQPCA clusters and the structures of physical variables, such as mixture fraction, Z, and heat-release rate, HRR, are used to identify a variable for system-level characterization. While the qualitative comparison is based on visualization, the quantitative comparison is newly developed in this study and is based on Boolean logic. Subsequently, a new physics-based method is proposed to rank the important features (or variables) for each cluster, which is useful for a fine-grained characterization of the system. The method is based on calculating the Hellinger distance between the probability density functions (PDFs) of each feature conditioned on a cluster and on the region outside it. The outcome of the physics-based feature-ranking method is compared with several feature selection methods based on PCA. Finally, we demonstrate that, through careful selection of features, clustering outcomes can be tailored in systems with high dilution, such as BZ1, where initially no system-level variable was recognized.
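To make the Hellinger-distance ranking concrete, the following is a minimal Python sketch and not the implementation used in this study. It assumes that the conditional PDFs are estimated with simple histograms on shared bin edges, that cluster labels are already available, and that every cluster and its complement contain at least one point. The function names (`hellinger_distance`, `rank_features`), the bin count, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def hellinger_distance(samples_in, samples_out, bins=50):
    """Hellinger distance between histogram estimates of two sample sets.

    Assumption: PDFs are approximated by histograms on common bin edges;
    the study may estimate PDFs differently.
    """
    samples_in = np.asarray(samples_in, dtype=float)
    samples_out = np.asarray(samples_out, dtype=float)
    lo = min(samples_in.min(), samples_out.min())
    hi = max(samples_in.max(), samples_out.max())
    if hi == lo:
        # Both sets are the same constant, so the distributions coincide.
        return 0.0
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(samples_in, bins=edges)
    q, _ = np.histogram(samples_out, bins=edges)
    p = p / p.sum()  # normalise counts to discrete probabilities
    q = q / q.sum()
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient
    # H = sqrt(1 - BC); clip guards against tiny negative round-off.
    return float(np.sqrt(max(0.0, 1.0 - bc)))


def rank_features(X, labels, cluster, names, bins=50):
    """Rank features for one cluster by inside-vs-outside Hellinger distance."""
    inside = labels == cluster
    scores = {
        name: hellinger_distance(X[inside, j], X[~inside, j], bins)
        for j, name in enumerate(names)
    }
    # Largest distance first: the feature that best separates the cluster.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic demonstration (hypothetical data, not DNS output).
rng = np.random.default_rng(0)
n = 2000
labels = rng.integers(0, 2, size=n)
feature_a = rng.normal(loc=3.0 * labels, scale=1.0)  # shifts with the cluster
feature_b = rng.normal(loc=0.0, scale=1.0, size=n)   # independent of cluster
X = np.column_stack([feature_a, feature_b])
ranking = rank_features(X, labels, cluster=1, names=["feature_a", "feature_b"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A distance near 0 means the feature is distributed similarly inside and outside the cluster, while a distance near 1 means the two distributions barely overlap, so features with larger values are the more distinctive ones for that cluster. Histogram estimates depend on the bin count and sample size, so small differences between scores should not be over-interpreted.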