SIS PUT | Speaker recognition based on transcoded speech for human-machine interfaces

Scientific Information System of the Poznań University of Technology

Dissertation

Title

Speaker recognition based on transcoded speech for human-machine interfaces

Authors

Radosław Sebastian Weychan (WI) ^{[ 1 ][ D ]}

^{[ 1 ]} Instytut Automatyki i Robotyki, Wydział Informatyki, Politechnika Poznańska | ^{[ D ]} PhD student

Promoter

Adam Dąbrowski (WI) ^{[ 1 ][ P ]}

^{[ 1 ]} Instytut Automatyki i Robotyki, Wydział Informatyki, Politechnika Poznańska | ^{[ P ]} employee

Supporting promoter

Tomasz Marciniak (WI) ^{[ 1 ][ P ]}

^{[ 1 ]} Instytut Automatyki i Robotyki, Wydział Informatyki, Politechnika Poznańska | ^{[ P ]} employee

Reviewers

Title variant

PL Rozpoznawanie mówcy na podstawie transkodowanej mowy do interfejsów człowiek-maszyna

Language

english

Keywords

Speaker recognition
lossy encoding
GSM
Gaussian mixture models
fixed-point arithmetic

Rozpoznawanie mówcy
kodowanie stratne
GSM
mieszaniny gaussa
arytmetyka stałoprzecinkowa

Abstract

EN This dissertation presents research on recognizing speakers from short utterances for use in automation systems, including speech transmitted over GSM and Internet networks. The aim of the investigations was to analyze whether a speech-controlled human-machine interface (HMI) can be extended with speaker identification. The proposed methods, namely voice activity detection, detection of encoding and of the GSM encoder type, and the use of a speaker model matched to the encoder, significantly increased recognition performance. Additionally, a hardware implementation was provided using an ARM processor and a fixed-point digital signal processor. The proposed improvements increased recognition accuracy, especially for the fixed-point implementation, and also allowed the acquisition and processing resolution to be reduced without loss of recognition accuracy.

PL Rozprawa prezentuje rezultaty badań dotyczących rozpoznawania mówcy z krótkich wypowiedzi obniżonej jakości w zastosowaniach automatyki, z uwzględnieniem transmisji mowy przez sieć GSM oraz internet. Celem badań była analiza możliwości rozszerzenia, sterowanych za pomocą głosu, interfejsów człowiek-maszyna (human-machine interfaces, HMI) o funkcjonalność identyfikacji osoby wydającej polecenie głosowe. Zaproponowane metody detekcji aktywności mówcy, detekcji kodowania i kodera GSM, a także doboru modelu mówcy skorelowanego z koderem mowy wyraźnie zwiększyły skuteczność rozpoznawania. Przedstawiono także implementację na procesorze ARM, oraz stałoprzecinkowym procesorze sygnałowym. Uwzględnienie zaproponowanych metod zwiększyło skuteczność rozpoznawania przede wszystkim dla implementacji stałoprzecinkowej oraz umożliwiło redukcję rozdzielczości akwizycji i przetwarzania sygnału mowy.

Number of pages

201

OECD domain

electrical engineering, electronics, computer engineering

KBN discipline

automation and robotics

Signature of printed version

DrOIN 1833

On-line catalog

to20179057

Access level to full text

public

First review

Andrzej P. Dobrowolski

Place

Warszawa, Polska

Date

12.02.2017

Language

polish

Access level to review text

public

Second review

Andrzej Dobrucki

Place

Wrocław, Polska

Date

28.02.2017

Language

polish

Access level to review text

public

Dissertation status

dissertation

Place of defense

Poznań, Polska

Date of defense

29.05.2017

Unit granting title

Rada Wydziału Informatyki Politechniki Poznańskiej

Obtained title

Doctor of Technical Sciences in the discipline of automation and robotics, specialization: human-machine interfaces
