Non-Negative Matrix Factorization Based Single Channel Source Separation
DOI:
https://doi.org/10.17762/ijcnis.v15i2.6132
Keywords:
Automatic Speech Recognition, Matrix Factorization, Neural Network, Source Mixing, Wavelet Transform
Abstract
Speech recognition systems are widely used in applications such as speech translation, robotics, and security. Their performance degrades, however, when noise and source mixing corrupt the signal during acquisition. An effective solution must exploit temporal dependencies that span more than a single time frame. To address this, the study introduces a model based on non-negative matrix factorization (NMF). The model uses the scattering transform, built from wavelet filters and pyramid scattering, to estimate the sources and suppress undesired signals. Once the signals are estimated, a source separation algorithm is formulated as an optimization procedure with separate training and testing stages. The proposed model is then compared with existing methods using quantitative performance metrics, and on these metrics it outperforms them. The results indicate that combining NMF with the scattering transform makes effective use of temporal dependencies longer than a single time frame and thereby improves the performance of speech recognition systems.
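As a rough illustration of the NMF-based separation the abstract describes, the sketch below learns a non-negative dictionary per source in a training stage and then decomposes a mixture over the stacked dictionaries in a testing stage. It is a generic supervised NMF with Euclidean multiplicative updates, not the paper's algorithm: the function names `nmf` and `separate`, the rank, the iteration count, and the random matrices standing in for non-negative features (for example spectrogram magnitudes or scattering coefficients) are all assumptions made for this example.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factor a non-negative matrix V (features x frames) as W @ H
    using multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def separate(V_mix, W_a, W_b, n_iter=200, eps=1e-9, seed=0):
    """Testing stage: keep both trained dictionaries fixed, fit only the
    activations of the mixture, then split it with soft (Wiener-like) masks."""
    W = np.hstack([W_a, W_b])
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V_mix.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V_mix) / (W.T @ W @ H + eps)
    k = W_a.shape[1]
    A = W_a @ H[:k]          # part of the mixture explained by source a
    B = W_b @ H[k:]          # part of the mixture explained by source b
    total = A + B + eps
    return V_mix * A / total, V_mix * B / total

# Synthetic placeholder data (hypothetical; not taken from the paper).
rng = np.random.default_rng(42)
V_a = rng.random((16, 40))   # training features for source a
V_b = rng.random((16, 40))   # training features for source b
W_a, _ = nmf(V_a, rank=4)    # training stage: one dictionary per source
W_b, _ = nmf(V_b, rank=4)
est_a, est_b = separate(V_a + V_b, W_a, W_b)
print(est_a.shape, est_b.shape)
```

Because the masks sum to (almost exactly) one, the two estimates add back up to the mixture; how well each estimate matches its true source depends on how distinct the trained dictionaries are, which random data like this does not demonstrate.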
License
Copyright (c) 2023 International Journal of Communication Networks and Information Security (IJCNIS)
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.