April 2011 Archives

I have been working with Pardis Noorzad on STWOPetal: studying the work of Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification via sparse representations of auditory temporal modulations," in Proc. European Signal Process. Conf., (Glasgow, Scotland), pp. 1-5, Aug. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification using locality preserving non-negative tensor factorization and sparse representations," in Proc. Int. Soc. Music Info. Retrieval Conf., (Kobe, Japan), pp. 249-254, Oct. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, pp. 576-588, Mar. 2010. (Po'D here.) In Part 1, we sufficiently tackled the first part, which was creating auditory spectrograms. In Part 2, we successfully performed "modulation-scale analysis" on the auditory spectrograms to derive "auditory temporal modulations." In Part 3, we implemented and tested sparse representation classification (SRC), and compared it with other classification methods.
From Risto Holopainen, a musicology researcher at the University of Oslo, Norway:

Please try the Autonomous Instrument Song Contest! This is a listening test in which you will listen to a number of sound examples. You may vote for your favourite sound example and evaluate the complexity of the examples. The test takes about 20 minutes. The questions are in English, but you may answer in English, French, or a Scandinavian language.

Autonomous Instruments are the synthesis models that I'm doing my current research on. By participating in the experiment you will make a contribution to my PhD project. Please feel free to distribute the invitation to anyone you think might enjoy participating!

It's open until April 30.

I have been working with a student on STWOPetal: studying the work of Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification via sparse representations of auditory temporal modulations," in Proc. European Signal Process. Conf., (Glasgow, Scotland), pp. 1-5, Aug. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification using locality preserving non-negative tensor factorization and sparse representations," in Proc. Int. Soc. Music Info. Retrieval Conf., (Kobe, Japan), pp. 249-254, Oct. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, pp. 576-588, Mar. 2010. (Po'D here.) In Part 1, we sufficiently tackled the first part, which was creating auditory spectrograms. In Part 2, we successfully performed "modulation-scale analysis" on the auditory spectrograms to derive "auditory temporal modulations." In this part, we have implemented and tested sparse representation classification (SRC), and compared it with other classification methods. For this, Panagakis et al. point to the paper, J. Wright, A. Y. Yang, A. Ganesh, S. S. Sastry, and Y. Ma, "Robust face recognition via sparse representation," IEEE Trans. Pattern Anal. Machine Intell., vol. 31, pp. 210-227, Feb. 2009.

I pointed to the paper by Wright et al. in my discussion of work on classifying facial emotions. I have also discussed the application of SRC to robust spoken digit recognition, and music genre recognition by compressive sampling.
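To make the SRC step concrete, here is a minimal MATLAB sketch of the decision rule described by Wright et al., assuming the training feature vectors are already stacked as unit-norm columns of a matrix; the function and variable names are my own, and I pose the basis pursuit step as a linear program for simplicity (in practice a dedicated sparse solver would be used).

```matlab
% Minimal sketch of sparse representation classification (SRC), following the
% decision rule of Wright et al. (2009). Assumptions: training feature vectors
% are the unit-norm columns of A, classLabels(i) is the class of column i, and
% y is the test feature vector. Requires linprog (Optimization Toolbox).
function predictedClass = src_classify(A, classLabels, y)
    [~, n] = size(A);
    % Basis pursuit: min ||x||_1 s.t. A*x = y, via the standard LP with x = u - v
    f = ones(2*n, 1);
    z = linprog(f, [], [], [A, -A], y, zeros(2*n, 1), []);
    x = z(1:n) - z(n+1:end);
    % Assign the class whose training atoms best reconstruct y
    classes = unique(classLabels);
    residuals = zeros(numel(classes), 1);
    for c = 1:numel(classes)
        xc = zeros(n, 1);
        xc(classLabels == classes(c)) = x(classLabels == classes(c));
        residuals(c) = norm(y - A * xc);
    end
    [~, best] = min(residuals);
    predictedClass = classes(best);
end
```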
I have been working with a student on STWOPetal: studying the work of Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification via sparse representations of auditory temporal modulations," in Proc. European Signal Process. Conf., (Glasgow, Scotland), pp. 1-5, Aug. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification using locality preserving non-negative tensor factorization and sparse representations," in Proc. Int. Soc. Music Info. Retrieval Conf., (Kobe, Japan), pp. 249-254, Oct. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, pp. 576-588, Mar. 2010. (Po'D here.) In Part 1, we sufficiently tackled the first part, which was creating auditory spectrograms. The second part entails using "modulation-scale analysis" to derive "auditory temporal modulations" from these spectrograms. For this, Panagakis et al. point to the paper, S. Sukittanon, L. E. Atlas, and J. W. Pitton, "Modulation-scale analysis for content identification," IEEE Trans. Signal Process., vol. 52, no. 10, pp. 3023-3035, Oct. 2004.
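The way I picture this second stage is as a modulation spectrum: in each channel of the auditory spectrogram, measure how the energy fluctuates over time, and summarize those fluctuations over a set of modulation-frequency bands. Below is a rough MATLAB sketch of that idea using an FFT along time; note that Sukittanon et al. use a wavelet-based modulation-scale analysis, so this Fourier version only illustrates the idea, and the band edges are my assumptions.

```matlab
% Rough sketch of a (Fourier) modulation spectrum of an auditory spectrogram
% S (channels x frames, with frame rate fsFrame in Hz). Sukittanon et al. use a
% wavelet-based modulation-scale analysis; this FFT version only illustrates
% the idea. The log-spaced band edges are assumptions.
function M = modulation_spectrum(S, fsFrame, nBands)
    [nChannels, nFrames] = size(S);
    X = abs(fft(S, [], 2));                            % modulation content per channel
    modFreqs = (0:nFrames-1) * fsFrame / nFrames;      % modulation frequency axis
    edges = logspace(log10(0.5), log10(fsFrame/2), nBands + 1);
    M = zeros(nChannels, nBands);
    for b = 1:nBands
        idx = modFreqs >= edges(b) & modFreqs < edges(b+1);
        if any(idx)
            M(:, b) = mean(X(:, idx), 2);              % average energy in the band
        end
    end
end
```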
I received some reader mail the other day, which reminded me to make available the MATLAB code I used to produce results for my paper "A Study on Sparse Vector Distributions and Recovery from Compressed Sensing". Here is the zipfile: PhaseSturm.zip. Included is a .mat file of my simulation results, my original code used to produce these results, and several plot routines to see the results. Look at the README.txt for more information.
I am working with a student now on STWOPetal: studying the work of Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification via sparse representations of auditory temporal modulations," in Proc. European Signal Process. Conf., (Glasgow, Scotland), pp. 1-5, Aug. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Music genre classification using locality preserving non-negative tensor factorization and sparse representations," in Proc. Int. Soc. Music Info. Retrieval Conf., (Kobe, Japan), pp. 249-254, Oct. 2009; Y. Panagakis, C. Kotropoulos, and G. R. Arce, "Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification," IEEE Trans. Audio, Speech, Lang. Process., vol. 18, pp. 576-588, Mar. 2010. (Po'D here.) The first part of their process entails creating auditory spectrograms of segments of recorded music. For this, they point to the following work: X. Yang, K. Wang, and S. A. Shamma, "Auditory representations of acoustic signals," IEEE Trans. Info. Theory, vol. 38, no. 2, pp. 824-839, Mar. 1992.
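The full Yang/Wang/Shamma model includes a cochlear filterbank, a hair-cell stage, and lateral inhibition. As a starting point, here is a deliberately crude MATLAB sketch of the overall shape of an auditory spectrogram (bandpass filterbank, half-wave rectification, envelope smoothing, downsampling); the filter choices and center frequencies are my assumptions, not those of the paper.

```matlab
% Crude sketch of an "auditory spectrogram": constant-Q-ish bandpass filterbank,
% half-wave rectification, lowpass envelope smoothing, and downsampling to a
% frame rate. The real Yang/Wang/Shamma model also has hair-cell and lateral
% inhibition stages; center frequencies and filter orders here are assumptions.
% Requires butter (Signal Processing Toolbox).
function [S, cf] = simple_auditory_spectrogram(x, fs, nChannels, frameRate)
    cf = logspace(log10(100), log10(min(8000, 0.4 * fs)), nChannels); % center freqs
    hop = round(fs / frameRate);
    nFrames = floor(length(x) / hop);
    S = zeros(nChannels, nFrames);
    [bl, al] = butter(2, 20 / (fs/2));                 % lowpass for envelope smoothing
    for k = 1:nChannels
        bw = cf(k) / 8;                                % crude constant-Q bandwidth
        [b, a] = butter(2, [cf(k) - bw, cf(k) + bw] / (fs/2));
        env = filter(bl, al, max(filter(b, a, x(:)), 0));  % filter, rectify, smooth
        S(k, :) = env(hop * (1:nFrames)).';            % sample the envelope per frame
    end
end
```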
So maybe I was a little more than 5 days away from a paper submission, but I have finally finished it. Here is the complete summary of my findings from the past few months on this interesting subject --- a version of which I just submitted to IEEE Signal Processing Letters. Here is the abstract:

I empirically investigate how the recovery performance of several algorithms depends on the distribution underlying the sparse vector sensed by a random matrix, a dependence that has been noted before but, to my knowledge, not thoroughly investigated. I find that $\ell_1$-minimization \cite{Chen1998} and tuned two-stage thresholding \cite{Maleki2010} (subspace pursuit \cite{Dai2009} without the use of a sparsity oracle) are the most robust to changes in the sparse vector distribution; but, for sparse vectors with Normal and Laplacian distributed nonzero entries, they are outperformed to a large degree by greedy methods such as orthogonal matching pursuit \cite{Pati1993}. I also find that selecting the best solution from those produced by several recovery algorithms can significantly increase the probability of exact recovery.
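For reference, here is a minimal MATLAB sketch of orthogonal matching pursuit, the simplest of the greedy methods mentioned above; stopping after \(s\) iterations, rather than at a residual tolerance, is my assumption for illustration.

```matlab
% Minimal sketch of orthogonal matching pursuit (OMP). Assumes the columns of
% the sensing matrix A have unit norm, and stops after s atoms are selected
% (one could instead stop at a residual tolerance).
function xhat = omp(A, y, s)
    [~, N] = size(A);
    residual = y;
    support = zeros(1, s);
    for k = 1:s
        [~, j] = max(abs(A' * residual));              % atom most correlated with residual
        support(k) = j;
        coeffs = A(:, support(1:k)) \ y;               % least squares on current support
        residual = y - A(:, support(1:k)) * coeffs;
    end
    xhat = zeros(N, 1);
    xhat(support) = coeffs;
end
```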
Now, with more time to kill, I am running the same experiments, but with the dimensionality \(N=800\), which is what Maleki and Donoho use. I am interested to see how the recovery performance of iterative hard thresholding changes.
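To make the setup of a single trial concrete, here is a rough MATLAB sketch at \(N=800\): draw an \(s\)-sparse vector with Normal-distributed nonzero entries, sense it with a Gaussian matrix, and run plain iterative hard thresholding with a fixed conservative step size. The parameter values and step size are assumptions for illustration, not the settings used in the paper.

```matlab
% Rough sketch of one recovery trial: s-sparse vector with Normal nonzeros,
% Gaussian sensing matrix, and plain iterative hard thresholding (IHT) with a
% fixed conservative step size. Parameter values are assumptions.
N = 800; m = 200; s = 20; nIter = 300;
A = randn(m, N) / sqrt(m);                             % random Gaussian sensing matrix
x = zeros(N, 1);
support = randperm(N, s);                              % random support of size s
x(support) = randn(s, 1);                              % Normal-distributed nonzeros
y = A * x;

mu = 1 / norm(A)^2;                                    % conservative step size
xhat = zeros(N, 1);
for it = 1:nIter
    xhat = xhat + mu * (A' * (y - A * xhat));          % gradient step toward A*xhat = y
    [~, idx] = sort(abs(xhat), 'descend');
    xhat(idx(s+1:end)) = 0;                            % hard threshold: keep s largest
end
exactRecovery = norm(x - xhat) / norm(x) < 1e-4        % success criterion
```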


About this Archive

This page is an archive of entries from April 2011 listed from newest to oldest.
