Article

Title

Lossy Coding Impact on Speech Recognition with Convolutional Neural Networks

Authors

Year of publication

2022

Published in

Vibrations in Physical Systems

Journal year: 2022 | Journal volume: vol. 33 | Journal number: no. 3

Article type

scientific article

Publication language

English

Keywords
EN
  • lossy coding
  • convolutional neural networks
  • speech recognition
Abstract

EN This paper presents research on the impact of lossy coding on speech recognition with convolutional neural networks. For this purpose, the Google Speech Commands dataset, containing utterances of 30 words, was encoded using the four most common all-purpose codecs: MP3, AAC, WMA and Ogg. A convolutional neural network was trained on part of the original files and then tested on the remaining files, as well as on their counterparts encoded with different codecs and bitrates. The same network model was also trained on the MP3-encoded data that had caused the biggest loss in effectiveness for the previous network. The results show that lossy coding does affect speech recognition, especially at low bitrates.

Pages (from - to)

2022302-1 - 2022302-6

DOI

10.21008/j.0860-6897.2022.3.02

URL

https://vibsys.put.poznan.pl/_journal/2022-33-3/articles/vps_2022302.pdf

Comments

article number: 2022302

License type

CC BY (attribution alone)

Open Access Mode

open journal

Open Access Text Version

final published version

Access level to full text

public

Ministry points / journal

70
