Publication: Sparse component analysis based on adaptive time-frequency thresholding for underdetermined blind source separation
Date
2023-08-01
Authors
Norsalina Binti Hassan
Abstract
Underdetermined Blind Source Separation (UBSS) is the case of Blind Source
Separation (BSS) in which there are fewer mixtures than source signals. In UBSS
the mixing matrix is noninvertible, so recovering the sources remains difficult
even when the matrix is known. Sparse Component Analysis (SCA) offers a general
solution for UBSS: it exploits the sparsity of the source signals and proceeds in
two steps, mixing matrix estimation and source recovery.
The primary focus of this thesis is to enhance the accuracy
of the estimated mixing matrix in underdetermined cases. A previously proposed
algorithm employed a predetermined threshold to select significant signal coefficients
from the time-frequency (TF) representation prior to Single Source Points (SSPs)
detection. However, using a fixed threshold leads to unstable accuracy in mixing
matrix estimation when applied to different source mixtures. To address this issue, we
propose Adaptive Time-Frequency Thresholding (ATFT). ATFT adaptively selects
significant TF coefficients from the TF mixtures, thereby improving the accuracy of
the mixing matrix estimation across various source mixtures. After identifying SSPs,
clustering is typically performed to approximate the mixing matrix. One drawback of
using classical clustering algorithms is their sensitivity to the selection of initial
centroid positions. In this work, we introduce Particle Swarm Optimization with
Hierarchical (PSOH) clustering and Particle Swarm Optimization with K-means
(PSOK) clustering methods to mitigate this issue. The second step of SCA, source
recovery, is performed using the least-squares method. Experimental comparisons
show that the proposed ATFT method outperforms the benchmark methods
(Zhen, DUET, TIFROM, V.G. Reju), achieving the lowest error rates of 0.116,
0.1363, 0.1006, and 0.1154 on the Frog Identification Expert System Database, Frogs of
Australia Database, Frog Watch Database, and British Library Amphibian Database,
respectively. The PSOH and PSOK clustering methods further improve the accuracy of
mixing matrix estimation, which yields higher signal-to-distortion ratio (SDR),
signal-to-interference ratio (SIR), and signal-to-artifacts ratio (SAR) values and
thus better separation of the bioacoustic signals. Ultimately, ATFT with the PSOH
technique exhibits superior separation performance compared to other techniques
(ATFT+PSOK, ATFT+Hierarchical, ATFT+K-means), achieving SDR values of
14.35 dB, 14.82 dB, 13.35 dB, and 13.34 dB for source 1, source 2, source 3, and
source 4, respectively.
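
The idea behind single source points can be illustrated with a small, noise-free toy example. This is only a sketch under stated assumptions, not the thesis implementation: the 2x3 mixing matrix, the angle values, and the helper names (`mix`, `direction`, `estimate_angles`, `recover_single`) are all hypothetical, the points are generated so that exactly one source is active in each, and a simple tolerance-based grouping stands in for the ATFT selection and the PSOH/PSOK clustering described above. At a single source point the mixture vector is a scaled copy of one mixing-matrix column, so grouping the point directions recovers the columns, and a least-squares projection recovers the active source amplitude.

```python
import math
import random

# Hypothetical 2x3 mixing matrix: 2 mixtures, 3 sources, unit-norm columns.
TRUE_ANGLES = [0.3, 1.2, 2.4]  # column directions in radians, all in (0, pi)
A = [[math.cos(t) for t in TRUE_ANGLES],
     [math.sin(t) for t in TRUE_ANGLES]]


def mix(s):
    """Return the mixture x = A s for a 3-element source vector s."""
    return [sum(A[i][j] * s[j] for j in range(3)) for i in range(2)]


def direction(x):
    """Angle of x folded into [0, pi), so x and -x share one direction."""
    return math.atan2(x[1], x[0]) % math.pi


def estimate_angles(points, tol=1e-6):
    """Group single-source points by direction; return one mean angle per group."""
    groups = []
    for x in points:
        theta = direction(x)
        for g in groups:
            if abs(g[0] - theta) < tol:
                g.append(theta)
                break
        else:
            groups.append([theta])
    return sorted(sum(g) / len(g) for g in groups)


def recover_single(x, theta):
    """Least-squares amplitude of one active source along the column at angle theta."""
    a = (math.cos(theta), math.sin(theta))
    return (a[0] * x[0] + a[1] * x[1]) / (a[0] ** 2 + a[1] ** 2)


# Synthetic, noise-free single-source points: exactly one source is active in each.
random.seed(0)
active, amplitudes, points = [], [], []
for _ in range(300):
    j = random.randrange(3)
    amp = random.choice([-1.0, 1.0]) * random.uniform(0.5, 2.0)
    s = [0.0, 0.0, 0.0]
    s[j] = amp
    active.append(j)
    amplitudes.append(amp)
    points.append(mix(s))

# Step 1: estimate the column directions of the mixing matrix from the points.
estimated = estimate_angles(points)

# Step 2: recover each active source by least squares.
# The active index is taken as known here, purely for illustration.
recovered = [recover_single(x, estimated[j]) for x, j in zip(points, active)]

max_angle_error = max(abs(e - t) for e, t in zip(estimated, TRUE_ANGLES))
max_source_error = max(abs(r - a) for r, a in zip(recovered, amplitudes))
```

With real recordings the points are noisy and most time-frequency coefficients contain more than one source, which is why a coefficient selection step and a clustering method that is robust to initialization matter in practice.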