Comparison of Automated Methods for Vowel Segmentation and Extraction of Acoustic Variables

Poster
Authors: Dirk B. den Ouden, Angelica Hutchinson, Kyrana Tsapkini, Charalambos Themistocleous
Published in: Clinical Aphasiology Conference, CAC 2018, Austin, Texas, USA
Publication year: 2018
Published at: Department of Swedish
Language: en
Links: https://www.regonline.com/builder/s...
Subject categories: Phonetics, Linguistics, Neurology

Abstract

Introduction: Primary Progressive Aphasia (PPA) is a neurodegenerative syndrome in which linguistic abilities become gradually impaired. There are three primary variants of PPA: non-fluent agrammatic PPA, the fluent semantic PPA, and logopenic PPA, which is also considered an atypical form of Alzheimer's disease (Mesulam et al., 1982; Gorno-Tempini et al., 2011). A fourth variant, non-fluent PPA with apraxia of speech (AOS), has also been proposed, though it remains the subject of open debate (e.g., Duffy et al., 2017; Henry et al., 2013). Under the established consensus criteria, subtyping PPA for a patient presenting in the clinic requires clinical, neuropsychological, and imaging information (Gorno-Tempini et al., 2011). Quantifying the decline of linguistic abilities and subtyping the variants of PPA manually is, however, hard and laborious, so there is great demand for algorithms that subtype a given patient automatically. Picture-description samples of connected speech and random-forest techniques have been used for this purpose (de Aguiar et al., 2017; Wilson et al., 2010; Fraser et al., 2013, 2014). In the present study, we compared existing models and propose a new one.

Aims: In this study, we provide an automated classification model of PPA variants trained on known morphological and acoustic predictors and on predictors related to the clinical and linguistic profile of individuals with PPA (e.g., Mack et al., 2015; Gorno-Tempini et al., 2011; Wilson et al., 2010).

Method: Speech materials for this study come from the Transcranial Direct Current Stimulation for Primary Progressive Aphasia study at Johns Hopkins University. Twenty-six individuals with PPA (mean (SD) age = 68.6 (7.8) years; mean (SD) education = 16.1 (2.9) years) participated.
PPA participants were diagnosed according to the established consensus criteria (Gorno-Tempini et al., 2011), i.e., imaging, clinical, and neuropsychological examination by trained neurologists. The sample included the non-fluent variant with AOS (N=5), the non-fluent variant without AOS (N=7), the logopenic variant (N=8), and the semantic variant (N=6). Recordings of the Cookie Theft picture description from the Boston Diagnostic Aphasia Examination (BDAE) were analyzed computationally. All speech productions were automatically transcribed and segmented using an end-to-end speech-to-transcription platform. From the speech signals, we measured morphological and acoustic predictors, including the vowel formants F1 to F3 (measured at 15%, 50%, and 75% of the vowel's duration), vowel duration, fundamental frequency, and pause duration. Analysis and statistics were conducted in the Python and R programming languages (R Core Team, 2017; Rossum, 1995). Three machine learning algorithms were trained on the predictors: C5.0 decision trees, Classification and Regression Trees (CART), and random forests (Quinlan, 1993; Breiman, 2001; Hastie et al., 2009). All models were trained on 80% of the speakers (training set) with 3-fold cross-validation, and all predictor variables were centered and scaled. C5.0 was trained both with and without winnowing. (Winnowing automatically pre-selects the predictors used in the decision tree.) After training, we evaluated the models on the held-out 20% of the speakers (evaluation set).

Results: C5.0 achieved 86% classification accuracy (95% CI [81, 88], kappa = 0.76) and random forests 85% (95% CI [81, 88], kappa = 0.76) on the test data; CART gave the lowest overall classification accuracy. Overall, C5.0 outperformed both random forests and CART, with high classification accuracy on unknown data. Non-fluent PPA with AOS was correctly predicted by both C5.0 and random forests.
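The training setup described above (centered and scaled predictors, an 80/20 speaker split, 3-fold cross-validation) can be sketched with scikit-learn's random-forest implementation. The feature matrix, label set, and sample size below are invented placeholders, not the study's data; the study's C5.0 and CART models (and C5.0's winnowing option) would be fitted in the same way via R's C50 and rpart packages rather than scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Placeholder predictors standing in for the study's acoustic features:
# F1-F3 at three time points, vowel duration, f0, pause duration.
X = rng.normal(size=(200, 10))
# Placeholder labels standing in for the four PPA variants.
y = rng.integers(0, 4, size=200)

# 80% training / 20% held-out evaluation split, stratified by class.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Centering and scaling live inside the pipeline so that each
# cross-validation fold is scaled only with its own training statistics.
model = make_pipeline(StandardScaler(),
                      RandomForestClassifier(random_state=0))

# 3-fold cross-validation on the training set.
cv_scores = cross_val_score(model, X_train, y_train, cv=3)

# Final fit on the full training set, then held-out evaluation.
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

With random labels the accuracy is of course near chance; the point of the sketch is the pipeline shape, not the numbers.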
Discussion: The C5.0 classification model supports the known predictors employed in the literature. It also provides objective means of detecting the presence of AOS in PPA and corroborates research on classifying AOS from acoustic properties, especially those related to vowel production (Den Ouden et al., 2017). Given the small number of participants in this study, however, further research with a larger sample is required. Nevertheless, the methods employed here are a promising step towards a computational differential-diagnostic tool for PPA that is easy to use, quick, and accurate.
