Peer-reviewed article
Abstract: Breast cancer is the second most common cancer after lung cancer. So far, in clinical practice, most cancer parameters originating from histopathology rely on a pathologist's visual assessment of microscopic structures in stained tissue sections, including immunohistochemistry markers. Fourier transform infrared (FTIR) spectroscopy provides a biochemical fingerprint of a biopsy sample and, together with advanced data analysis techniques, can accurately classify cell types. Yet one of the challenges of FTIR imaging is slow data acquisition: a 1 cm² tissue section requires several hours of image recording. We show in the present paper that 2D covariance analysis singles out only a few wavenumbers where both variance and covariance are large. Simple models could be built using 4 wavenumbers to identify the 4 main cell types present in breast cancer tissue sections. Decision trees provide particularly simple models for discriminating between the 4 cell types. The robustness of these simple decision-tree models was challenged with FTIR spectral data obtained under different recording conditions. One test set was recorded by transflection on tissue sections in the presence of paraffin, while the training set was obtained by transmission on dewaxed tissue sections. Furthermore, the test set was collected with a different brand of FTIR microscope and a different pixel size. Despite the different recording conditions, separating extracellular matrix (ECM) spectra from carcinoma spectra was 100% successful, underscoring the robustness of this univariate model and the utility of covariance analysis for revealing efficient wavenumbers. We suggest that 2D covariance maps computed over the full spectral range could be most useful for selecting the wavenumbers of interest and achieving very fast data acquisition on quantum cascade laser infrared imaging microscopes.
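The selection idea summarized above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The spectra are synthetic, and the six-point wavenumber axis, the class labels, the ranking score (row sums of absolute covariance) and the helper names `covariance_map`, `rank_wavenumbers`, `fit_stump` and `predict_stump` are all assumptions made for illustration. The single-threshold rule stands in for a depth-one decision tree on one wavenumber.

```python
import numpy as np

# Synthetic stand-in data (an assumption, not measured spectra):
# rows are spectra, columns are absorbances at the wavenumbers below (cm^-1).
wavenumbers = np.array([1030.0, 1080.0, 1240.0, 1340.0, 1540.0, 1650.0])
rng = np.random.default_rng(0)
n_per_class = 50
labels = np.repeat([0, 1], n_per_class)  # toy labels: 0 = "ECM", 1 = "carcinoma"
spectra = 0.5 + 0.01 * rng.standard_normal((2 * n_per_class, wavenumbers.size))
spectra[labels == 0, 2] += 0.30  # class 0 absorbs more at 1240 cm^-1 ...
spectra[labels == 0, 3] += 0.20  # ... and at 1340 cm^-1, so both bands covary


def covariance_map(x):
    """Covariance between every pair of wavenumbers (variance on the diagonal)."""
    return np.cov(x, rowvar=False)


def rank_wavenumbers(cov, k):
    """Return the indices of the k wavenumbers with the largest summed
    absolute covariance (hypothetical score; the diagonal adds the variance)."""
    score = np.abs(cov).sum(axis=1)
    return np.argsort(score)[::-1][:k]


def fit_stump(values, y):
    """Fit a one-threshold rule: midpoint between the two class means."""
    mean0, mean1 = values[y == 0].mean(), values[y == 1].mean()
    threshold = 0.5 * (mean0 + mean1)
    high_class = 0 if mean0 > mean1 else 1  # class found above the threshold
    return threshold, high_class


def predict_stump(values, threshold, high_class):
    """Assign high_class above the threshold and the other class below it."""
    return np.where(values > threshold, high_class, 1 - high_class)


# Rank wavenumbers on the full synthetic set, then keep the two best.
cov = covariance_map(spectra)
selected = wavenumbers[rank_wavenumbers(cov, k=2)]

# Train on even-numbered spectra, test on odd-numbered ones,
# using only the top-ranked wavenumber (a univariate model).
best = int(np.flatnonzero(wavenumbers == selected[0])[0])
train, test = np.arange(0, labels.size, 2), np.arange(1, labels.size, 2)
threshold, high_class = fit_stump(spectra[train, best], labels[train])
predicted = predict_stump(spectra[test, best], threshold, high_class)
accuracy = float(np.mean(predicted == labels[test]))
```

On real hyperspectral images the covariance map would span the full spectral range, and a library decision tree trained on the selected wavenumbers would replace the hand-written threshold. The sketch only shows why recording a few well-chosen wavenumbers can be enough for a simple classifier.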