
Dictionary learning based on M‐PCA‐N for audio signal sparse representation
Author(s) - Yang Jichen, He Qianhua, Li Yanxiong, Liu Leian, Li Jianhong, Feng Xiaohui
Publication year - 2018
Publication title - IET Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.384
H-Index - 42
eISSN - 1751-9683
pISSN - 1751-9675
DOI - 10.1049/iet-spr.2015.0277
Subject(s) - computer science, speech recognition, dictionary learning, audio signal, sparse approximation, representation (politics), pattern recognition (psychology), signal (programming language), artificial intelligence, audio signal processing, K‐SVD, signal processing, speech coding, digital signal processing, politics, political science, law, programming language, computer hardware
The dictionary learning algorithms currently most popular for sparse representation of signals are K‐means Singular Value Decomposition (K‐SVD) and K‐SVD‐extended. Both use only a rank‐1 approximation to update one atom at a time, and neither copes efficiently with large dictionaries. To tackle these two problems, this study proposes M‐Principal Component Analysis‐N (M‐PCA‐N), an algorithm for dictionary learning and sparse representation. First, M‐Principal Component Analysis (M‐PCA) uses information from the top M ranks of the SVD to update M atoms at a time. Then, to further exploit the information in the remaining ranks, M‐PCA‐N builds on M‐PCA by transforming information from the following N non‐principal ranks onto the top M principal ranks. The mathematical formulation indicates that M‐PCA may be seen as a generalisation of K‐SVD. Experimental results on the BBC Sound Effects Library show that M‐PCA‐N not only lowers the mean squared error (MSE) between the original signal and its approximation in audio signal sparse representation, but also achieves higher audio signal classification precision than K‐SVD.
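
To make the rank‐1 limitation concrete, the following is a minimal sketch, not the authors' implementation. It shows the standard K‐SVD atom update, which keeps only the leading singular component of the residual, next to a plain rank‐M truncated SVD. The truncated SVD is only an illustrative stand‐in for "using the top M ranks": the abstract does not give the actual M‐PCA or M‐PCA‐N update rules, so they are not reproduced here. The function names `ksvd_atom_update` and `top_m_approx` and the random test data are assumptions made for this example.

```python
import numpy as np

def ksvd_atom_update(Y, D, X, k):
    """Standard K-SVD update of atom k: rank-1 SVD of the restricted residual."""
    users = np.flatnonzero(X[k])  # indices of signals whose code uses atom k
    if users.size == 0:
        return D, X               # unused atom: nothing to update
    # Residual of those signals with atom k's own contribution added back.
    E = Y[:, users] - D @ X[:, users] + np.outer(D[:, k], X[k, users])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D, X = D.copy(), X.copy()
    D[:, k] = U[:, 0]             # new atom: leading left singular vector
    X[k, users] = s[0] * Vt[0]    # new coefficients on the same support
    return D, X

def top_m_approx(E, m):
    """Rank-m truncated SVD of E (illustrative stand-in, NOT the paper's M-PCA)."""
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    return U[:, :m], s[:m, None] * Vt[:m]

# Hypothetical toy data: 30 signals of dimension 8, dictionary of 12 atoms.
rng = np.random.default_rng(0)
Y = rng.standard_normal((8, 30))
D = rng.standard_normal((8, 12))
D /= np.linalg.norm(D, axis=0)    # unit-norm atoms
X = rng.standard_normal((12, 30)) * (rng.random((12, 30)) < 0.25)  # sparse codes

err_before = np.linalg.norm(Y - D @ X)
D1, X1 = ksvd_atom_update(Y, D, X, k=0)
err_after = np.linalg.norm(Y - D1 @ X1)

# Residual left by a rank-1 versus a rank-3 approximation of the same matrix.
E = rng.standard_normal((8, 20))
A1, C1 = top_m_approx(E, 1)
A3, C3 = top_m_approx(E, 3)
res1 = np.linalg.norm(E - A1 @ C1)
res3 = np.linalg.norm(E - A3 @ C3)
```

In this sketch the rank‐1 update never increases the Frobenius reconstruction error, and the rank‐3 truncation leaves a residual no larger than the rank‐1 one (the Eckart‐Young property of the truncated SVD). This is the kind of information a rank‐1 update discards, and which the abstract says M‐PCA draws on.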