Doctoral thesis
Abstract: The study concerns the analysis of vocal cycle length perturbations in normophonic and dysphonic speakers.

A method for tracking cycle lengths in voiced speech is proposed. The speech cycles are detected via the saliences of the speech signal samples, where the salience of a sample is defined as the length of the temporal interval over which that sample is a maximum. The tracking of the cycle lengths is based on a dynamic programming algorithm that requires neither that the signal be locally periodic nor that the average period length be known a priori.

The method is validated on a corpus of synthetic stimuli. The results show good agreement between the extracted and the synthetic reference cycle length time series. The method accurately tracks low-frequency modulations of up to 10% and fast cycle-to-cycle perturbations of up to 4% over the whole range of vocal frequencies. Robustness with regard to background noise has also been tested. The results indicate that the tracking is reliable for signal-to-noise ratios higher than 15 dB.

A method for analyzing the size of the cycle length perturbations as well as their frequency is proposed. The cycle length time series is decomposed by empirical mode decomposition into a sum of oscillating components, the instantaneous envelopes and frequencies of which are obtained via AM-FM decomposition. Based on their average instantaneous frequencies, the empirical modes are then assigned to four categories (declination, physiological tremor, neurological tremor and cycle length jitter) and summed within each category. The within-category size of the cycle length perturbations is estimated via the standard deviation of the empirical mode sum divided by the average cycle length.
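The salience measure and the relative perturbation size described above can be illustrated by a minimal sketch. It rests on assumptions that the abstract does not confirm: the salience is taken here as the number of samples in the longest contiguous interval around a sample in which no other sample exceeds it, and the function names `salience` and `relative_perturbation` are hypothetical. The dynamic programming tracker and the empirical mode decomposition are not reproduced.

```python
import numpy as np


def salience(x):
    """Salience of each sample (assumed definition): number of samples in the
    longest contiguous interval around the sample in which no other sample
    exceeds it. Simple quadratic-time scan, for illustration only."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    out = np.empty(n, dtype=int)
    for i in range(n):
        left = i
        # Extend to the left while the neighbour does not exceed x[i].
        while left > 0 and x[left - 1] <= x[i]:
            left -= 1
        right = i
        # Extend to the right in the same way.
        while right < n - 1 and x[right + 1] <= x[i]:
            right += 1
        out[i] = right - left + 1
    return out


def relative_perturbation(mode_sum, cycle_lengths):
    """Standard deviation of the within-category empirical mode sum divided
    by the average cycle length."""
    return float(np.std(mode_sum) / np.mean(cycle_lengths))


if __name__ == "__main__":
    # Toy signal: the large peak at index 3 dominates the whole record.
    print(salience([0, 1, 0, 3, 0, 1, 0]))
    # Toy mode sum of +/-0.1 ms around an average cycle length of 10 ms.
    print(relative_perturbation([0.1, -0.1, 0.1, -0.1], [10, 10, 10, 10]))
```

In this sketch, prominent peaks such as glottal-cycle maxima receive large saliences, while small local maxima receive short ones, which is what makes the measure usable as a cycle marker without assuming local periodicity.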
The neurological tremor modulation frequency and bandwidth are obtained via the instantaneous frequencies and amplitudes of the empirical modes in the neurological tremor category and summarized via a weighted instantaneous frequency probability density, which compensates for the effects of mode mixing.

The method is applied to two corpora of vowels comprising 123 and 74 control speaker recordings and 456 and 205 Parkinson speaker recordings, respectively. The results indicate that the neurological tremor modulation depth is statistically significantly higher for female Parkinson speakers than for female control speakers. Neurological tremor frequency differs statistically significantly between male and female speakers and is statistically significantly higher for the pooled Parkinson speakers than for the pooled control speakers. Finally, compared to the control speakers, the average vocal frequency increases for male Parkinson speakers and decreases for female Parkinson speakers.
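A weighted instantaneous frequency probability density of the kind mentioned above could be sketched as follows. This is an illustration under stated assumptions, not the procedure of the thesis: the weights are assumed to be the instantaneous amplitudes, the tremor frequency and bandwidth are assumed to be summarized by the weighted mean and weighted standard deviation, and the function name `weighted_if_density` is hypothetical.

```python
import numpy as np


def weighted_if_density(inst_freq, inst_amp, bins=50, freq_range=None):
    """Amplitude-weighted probability density of instantaneous frequencies.

    inst_freq, inst_amp: instantaneous frequencies (Hz) and amplitudes of the
    modes in one category, pooled over modes and time (any matching shape).
    Returns bin centers, density, weighted mean frequency and weighted
    standard deviation (used here as an assumed bandwidth measure).
    """
    f = np.asarray(inst_freq, dtype=float).ravel()
    w = np.asarray(inst_amp, dtype=float).ravel()
    # Weighted histogram normalized so that it integrates to one.
    density, edges = np.histogram(
        f, bins=bins, range=freq_range, weights=w, density=True
    )
    centers = 0.5 * (edges[:-1] + edges[1:])
    mean_f = np.average(f, weights=w)
    bandwidth = np.sqrt(np.average((f - mean_f) ** 2, weights=w))
    return centers, density, float(mean_f), float(bandwidth)


if __name__ == "__main__":
    # Toy data: a weak component near 4 Hz and a stronger one near 6 Hz.
    freqs = [4.0, 4.0, 6.0, 6.0]
    amps = [1.0, 1.0, 3.0, 3.0]
    centers, density, mean_f, bw = weighted_if_density(freqs, amps, bins=4)
    print(mean_f, bw)
```

Weighting by amplitude means that low-amplitude stretches of a mode, where the instantaneous frequency is poorly defined, contribute little to the estimated density.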