Speech Enhancement Based on Enhanced Empirical Wavelet Transform and Teager Energy Operator
[ 1 ] Institute of Electrical Engineering and Industrial Electronics, Faculty of Control, Robotics and Electrical Engineering, Poznan University of Technology | [ 2 ] Institute of Computing Science, Faculty of Computing and Telecommunications, Poznan University of Technology | [ P ] employee | [ D ] PhD student
[2.2] Automation, electronics, electrical engineering and space technologies | [2.3] Information and communication technology
2023
scientific article
English
This paper presents a new speech-enhancement approach based on an enhanced empirical wavelet transform, with time and scale adaptation of the thresholds applied to the individual component signals produced by the transform. Time adaptation is performed by applying the Teager energy operator to each component signal, and scale adaptation is performed by a modified level-dependent threshold principle. Unlike most common speech-enhancement methods, the proposed approach requires neither an explicit estimate of the noise level nor a priori knowledge of the signal-to-noise ratio. Its effectiveness was assessed on over 1000 speech recordings from the public LibriSpeech database. The tests covered various types of noise (among others white, violet, brown, blue, and pink) and various types of disturbance (among others traffic sounds, a hair dryer, and a fan), which were added to the selected test signals. Two measures were used for evaluation: the perceptual evaluation of speech quality (PESQ) score, which assesses the quality of the enhanced speech, and the signal-to-noise ratio, which assesses how effectively the disturbance is attenuated. The results are compared with those of other selected speech-enhancement methods and denoising techniques available in the literature. The experiments show that the proposed method performs better than conventional methods in many types of high-noise conditions, producing less residual noise and lower speech distortion.
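As an illustration of two quantities named in the abstract, the following is a minimal sketch of the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and of a plain signal-to-noise ratio in decibels. It is not the authors' implementation: the function names `teager_energy` and `snr_db` are hypothetical, and the threshold adaptation and the enhanced empirical wavelet transform described in the paper are not reproduced here.

```python
import math


def teager_energy(x):
    """Discrete Teager energy of a sequence (assumed textbook form).

    psi[n] = x[n]^2 - x[n-1] * x[n+1], evaluated on interior samples only,
    so the result is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def snr_db(clean, estimate):
    """Signal-to-noise ratio in dB between a clean reference and an estimate.

    The residual (clean - estimate) is treated as the noise term.
    """
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    return 10.0 * math.log10(signal_power / noise_power)


# Illustrative values only: a pure tone with amplitude 2.0 and
# normalized angular frequency 0.3 rad/sample.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(64)]
energy = teager_energy(tone)
```

For a pure tone the operator returns the constant `amplitude**2 * math.sin(omega)**2`, so it tracks both amplitude and frequency from only three neighboring samples. That locality is what makes it a natural candidate for time-varying adaptation on component signals.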
21.07.2023
3167-1 - 3167-21
Article number: 3167
CC BY (Attribution)
open journal
final published version
21.07.2023
at the time of publication
public
140
2.6