Analysis of vocal sounds in asthmatic patients
Abstract
Around 334 million people worldwide have asthma. Asthma is an inflammatory disease of the airways that causes cough, breathlessness, chest tightness, and abnormal sounds during breathing. The gold-standard test, spirometry, is used to diagnose and monitor asthma. Spirometry is a lung function test that measures the volume of air a person can exhale over time after a deep inhalation. In practice, patients repeat the test multiple times to obtain accurate readings, making it time-consuming and strenuous, especially for children and older people. A spirometer is also an expensive and bulky device, ill-suited to a home monitoring setup. Hence, there is a need for an easy and fast alternative, and sound-based analysis can be one such method. The motivation for using sounds to monitor and diagnose asthma originates from the sound production mechanism. In the literature, non-speech sounds, namely cough and breath recorded at the chest, have been explored. However, the analysis of speech and non-speech sounds recorded at the mouth is the least investigated. The work done in this thesis addresses this gap. For each sound category, we started with two tasks, namely classification and spirometry prediction. The thesis is accordingly divided into two parts.\\
Analysis of speech sounds
For speech sounds, two tasks have been performed: classification between asthmatic and healthy subjects, and spirometry prediction. The results show that \textipa{/oU/} (as in 'Home') is the best sound for the classification task, whereas \textipa{/i:/} (as in 'Meet') is the best for spirometry prediction. The classification results suggest that Mel-frequency cepstral coefficient (MFCC) statistics carry the most discriminative information for all speech sounds considered in this work. The spirometry prediction results show that the low-frequency region carries more information for the best-performing sound \textipa{/i:/}.\\
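The "MFCC statistics" representation referred to above can be sketched in a few lines of numpy. The parameter values below (frame length, hop, filterbank size, number of coefficients) are illustrative defaults, not necessarily those used in the thesis:

```python
import numpy as np

def mfcc_stats(signal, sr=16000, n_mfcc=13, frame_len=400, hop=160,
               n_mels=26, nfft=512):
    """Mean and standard deviation of frame-level MFCCs.
    A minimal sketch; all parameter values are illustrative."""
    # 1. Frame the signal and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # 2. Power spectrum of each frame.
    spec = np.abs(np.fft.rfft(frames, nfft)) ** 2 / nfft
    # 3. Triangular mel filterbank.
    def hz2mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel2hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(0.0, hz2mel(sr / 2), n_mels + 2)
    bins = np.floor((nfft + 1) * mel2hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, nfft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    # 4. Log mel energies, then DCT-II to get cepstral coefficients.
    logmel = np.log(spec @ fbank.T + 1e-10)
    k, n = np.arange(n_mfcc)[:, None], np.arange(n_mels)[None, :]
    dct_mat = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    ceps = logmel @ dct_mat.T
    # 5. Summarise each coefficient over time by its mean and std,
    #    yielding one fixed-length feature vector per recording.
    return np.concatenate([ceps.mean(axis=0), ceps.std(axis=0)])
```

Summarising the frame sequence by per-coefficient statistics turns a variable-length recording into a fixed-length vector, which is what a standard classifier expects.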
Analysis of non-speech sounds
Classification performed with the non-speech sounds, cough and breath, indicates that MFCCs again perform best; interestingly, the static MFCC coefficients are more informative than the velocity and acceleration coefficients. Among the non-speech sounds, breath performs best for the classification task. Further analysis of the breath signal shows that the discriminative information is not uniform across the signal; interestingly, the middle 50\% of the breath signal carries the most information for classification. Extracting the middle 50\% of a breath requires prior knowledge of the breath boundaries. Therefore, we developed an unsupervised breath segmentation algorithm using dynamic programming. Classification results using the predicted boundaries are found to be on par with those using the ground-truth boundaries. Comparing speech sounds and breath on the classification task shows that breath outperforms speech. Experiments conducted for the spirometry prediction task using non-speech sounds indicate that breath performs better than cough; however, in this task, speech sounds outperform the non-speech sounds.
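The unsupervised, dynamic-programming segmentation step can be illustrated with a generic change-point formulation. The abstract does not specify the thesis algorithm's cost function, so the sketch below assumes a common stand-in: split a 1-D energy envelope into $k$ segments so as to minimise the within-segment sum of squared deviations:

```python
import numpy as np

def segment_energy(env, k):
    """Split a 1-D energy envelope into k segments by dynamic programming,
    minimising the within-segment sum of squared deviations.
    An illustrative stand-in for the thesis algorithm, whose exact cost
    function is not given in the abstract."""
    n = len(env)
    # Prefix sums give each candidate segment's cost in O(1).
    s1 = np.concatenate(([0.0], np.cumsum(env)))
    s2 = np.concatenate(([0.0], np.cumsum(env ** 2)))

    def cost(i, j):
        # Sum of squared deviations from the mean over env[i:j].
        m = j - i
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / m

    # D[seg, j] = best cost of splitting env[:j] into `seg` segments.
    D = np.full((k + 1, n + 1), np.inf)
    back = np.zeros((k + 1, n + 1), dtype=int)
    D[0, 0] = 0.0
    for seg in range(1, k + 1):
        for j in range(seg, n + 1):
            for i in range(seg - 1, j):
                c = D[seg - 1, i] + cost(i, j)
                if c < D[seg, j]:
                    D[seg, j] = c
                    back[seg, j] = i
    # Backtrack to recover the segment boundaries.
    bounds, j = [], n
    for seg in range(k, 0, -1):
        j = back[seg, j]
        bounds.append(j)
    return [int(b) for b in sorted(bounds)[1:]]  # interior boundaries only
```

With the boundaries in hand, the middle 50\% of each breath is simply the span between the 25th and 75th percentile of its duration.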
As asthma is not curable, patients need to be on continuous medication, such as bronchodilators, which open up the airways. However, the changes this introduces in the produced sounds are not well understood. A linear discriminant-based analysis has therefore been carried out to determine what kind of spectral changes occur in an asthmatic patient's sound before and after taking a bronchodilator. Breath sounds were chosen for this task. It has been observed that the frequency bands 400-500 Hz and 1480-1900 Hz are the most sensitive to the change in obstruction in breath sound.
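A linear discriminant analysis of this kind can be sketched as follows: given per-band spectral features measured before and after the bronchodilator, the magnitude of each Fisher-discriminant weight indicates how strongly that band separates the two conditions. This is an assumed reconstruction; the abstract names the method but not its exact formulation:

```python
import numpy as np

def band_discriminant(X_pre, X_post):
    """Fisher linear discriminant between pre- and post-bronchodilator
    spectral feature matrices (rows = recordings, columns = frequency
    bands). Larger |weight| means the band is more sensitive to the
    change in obstruction. An illustrative sketch, not the thesis code."""
    m1, m2 = X_pre.mean(axis=0), X_post.mean(axis=0)
    # Pooled within-class scatter matrix.
    Sw = (np.cov(X_pre, rowvar=False) * (len(X_pre) - 1)
          + np.cov(X_post, rowvar=False) * (len(X_post) - 1))
    # Fisher direction: w proportional to Sw^{-1} (m1 - m2).
    w = np.linalg.solve(Sw, m1 - m2)
    return w / np.linalg.norm(w)
```

Ranking the bands by |weight| then points to the spectral regions (here, reportedly 400-500 Hz and 1480-1900 Hz) where the breath sound changes most with obstruction.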