Voice Pathology Assessment Using X-Vectors Approach

Katarzyna Kotarba; Michał Kotarba

doi:10.21008/j.0860-6897.2021.1.08

System Informacji Naukowej Politechniki Poznańskiej

Title

Voice Pathology Assessment Using X-Vectors Approach

Authors

Katarzyna Kotarba; Michał Kotarba

Year of publication

2021

Published in

Vibrations in Physical Systems

Year: 2021 | Volume: 32 | Issue: 1

Article type

scientific article

Publication language

English

Keywords

x-vectors
speaker embeddings
voice pathology
MFCC
GFCC

Abstract

Voice pathology assessment using sustained vowels has proven to be effective and reliable. However, only a few studies on detecting pathological speech from continuous speech are available. In this study, we evaluate the usefulness of various regression models trained on continuous speech recordings from the Saarbruecken Voice Database for detecting voice pathologies. The recordings were used to extract speaker embeddings called x-vectors, based on mel-frequency cepstral coefficients (MFCC) and gammatone frequency cepstral coefficients (GFCC). Since the dataset used in this study is imbalanced, various over- and undersampling techniques were applied to the training set to ensure robustness of the models’ decision boundaries. The models were trained on both imbalanced and resampled training sets using 5-fold cross-validation. The best results were obtained for a multilayer perceptron trained on GFCC-based x-vectors, achieving an accuracy of 0.8184, an F1-score of 0.8212, and a ROC AUC score of 0.8810 on the testing set.
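
The abstract mentions resampling an imbalanced training set and reporting accuracy and F1-score. As a rough illustration only, the sketch below shows plain random oversampling and the two metrics in standard-library Python. The abstract does not say which resampling techniques the authors used, so random oversampling is an assumed stand-in, and the function names (`random_oversample`, `accuracy`, `f1_score`), the 2-D toy features, and the label coding are all hypothetical. It does not reproduce the x-vector extraction or the paper's models.

```python
import random
from collections import Counter


def random_oversample(features, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until every
    class has as many samples as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(features), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(features, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up 2-D points standing in for x-vectors; labels 0 and 1 are arbitrary.
train_x = [[0.1, 0.2], [0.2, 0.1], [0.9, 0.8], [0.8, 0.9], [0.7, 0.7], [0.9, 0.9]]
train_y = [0, 0, 1, 1, 1, 1]

# Resample the training portion only; a held-out test set stays untouched.
balanced_x, balanced_y = random_oversample(train_x, train_y)
```

In a cross-validation setting, the resampling step would be repeated inside each fold on that fold's training split, so that duplicated samples never leak into the validation split.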

Pages (from-to)

2021108-1 - 2021108-8

DOI

10.21008/j.0860-6897.2021.1.08

URL

https://vibsys.put.poznan.pl/_journal/2021-32-1/articles/vps_2021108.pdf

Notes

article number: 2021108

License type

CC BY (attribution)