# February 2012 Archives

## Speedups in OMP Implementations

I am now compiling many results in a comparative study of several different implementations of OMP: the naive way, the Cholesky way, the QR way, and the matrix inversion lemma (MIL) way.

So far, it appears that each implementation has some error accumulation from recursion, but at completely acceptable levels (at least on my 64-bit machine). Furthermore, I have yet to observe a case where one fails while the others succeed. Where they differ significantly, however, is in their computation times. Below we see in the phase plane how much faster the other three implementations are than the naive one for a particular ambient dimension \(N\). The z-axis is the mean time of the naive implementation (averaged over 100 independent trials at each problem sparsity and indeterminacy) divided by the mean time of another implementation. (This is the mean time of the entire pursuit, not per iteration.) It is interesting to see how large a benefit QR provides over Cholesky for large problem sizes and more iterations.
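For readers unfamiliar with the baseline, here is a minimal Python/NumPy sketch of the naive implementation (my own illustration, not the code used in this study): it re-solves a full least-squares problem over the entire support at every iteration, which is precisely the cost the Cholesky, QR, and MIL implementations avoid by updating their factorizations incrementally.

```python
import numpy as np

def omp_naive(D, x, k):
    """Naive OMP: re-solve a full least-squares problem at every iteration.

    D : (N, M) dictionary, columns assumed unit-norm
    x : (N,) signal to decompose
    k : number of iterations (assumed >= 1)
    """
    residual = x.copy()
    support = []
    for _ in range(k):
        # select the atom most correlated with the current residual
        n = int(np.argmax(np.abs(D.T @ residual)))
        support.append(n)
        # the "naive" step: least squares over the whole support from scratch
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    s = np.zeros(D.shape[1])
    s[support] = coeffs
    return s, residual
```

The faster implementations produce the same support and coefficients (up to numerical error); they differ only in how this least-squares step is carried out.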

## Are you or someone you know learning digital signal processing?

Every so often I get an email from a student somewhere in the world asking how to solve a particular part of some signal processing laboratory I wrote over five years ago. My labs, originally created for two classes at the University of California, Santa Barbara, have somehow managed to migrate around the world. So now, for the first time ever, I have bundled them all together into one nice, easy-to-open zipfile: http://imi.aau.dk/~bst/teaching/SturmDSPLabs.zip.

Here is what you get in this fabulous free package, which combines signal processing with images and sound and fun all in MATLAB:

Seven laboratories designed for upper division students in digital signal processing:
1. MATLAB tutorial, and the moving average filter
2. Sampling
3. Sample rate alteration
4. Linear systems and digital filtering
5. Windowing
6. The discrete cosine transform
7. Filter banks
Nine laboratories designed for non-engineers studying media arts technology:
1. MATLAB and the space between your ears
2. MATLAB, even and odd signals, and someone else's code
3. The Gibbs phenomenon: You may already have been affected!!!
4. Fourier transformers: More than meets the eye
5. Coping with convolution coping with convolution
7. Moving averages
8. Damn smart reverb and converb
9. Flipping the bird, so to squeak
Along with each lab, I include some code and the signals used. I do not include any of the answers. You have to email me asking for them, which I will give upon proof that you are not a student. :) I should also point out the availability of my MATLAB software toolbox SSUM, which I should update soon with some changes ...
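As a small taste of the first lab's topic, here is a length-\(L\) moving average filter, sketched in Python/NumPy rather than the MATLAB of the labs themselves:

```python
import numpy as np

def moving_average(x, L):
    """Causal moving average: y[n] = (1/L) * sum_{k=0}^{L-1} x[n-k]."""
    h = np.ones(L) / L                  # impulse response of the averager
    return np.convolve(x, h)[:len(x)]   # truncate to the input length
```

It is an FIR filter like any other: convolve the input with a boxcar impulse response, and high-frequency content is smoothed away.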

## Someone is wrong on the Internet

This all annoys me because it is yet another symptom of "science" continuing its journey to becoming a bad word in the USA. Politicians and fundangelicals have always preyed on the ignorant and naive; and now the less literacy in science there is, the better for making money and keeping power. Government funding of basic research has been under attack for many years by opportunist politicians. Who would have thought scientists are really in it for the power and money? Are they really the ones fleecing America? Climate change is definitely a bad word in the US, and I avoid it at all dinner conversations when I am not sure of the political leanings of the guests. Many Americans think the opinion of a weatherman is scientifically valid when it comes to climate change. Glossy colorful infographics like this seem to make many people believe they can argue against, e.g., evolution and climate change, because they are armed with facts and hard numbers. Science is so inconvenient because it is hard to understand! Truth is not so simple to those of us who believe context is important, and that the world is not black and white.

Many readers have mentioned to me the desire to leave comments here, but not to register. So, I have finally spent the required time to install reCaptcha. Now you may comment anonymously, pending my approval of course. :) The email address can be completely made up. Just include the @.

## Paper of the Day (Po'D): Sparse signal detection from incoherent projections Edition

Hello, and welcome to Paper of the Day (Po'D): Sparse signal detection from incoherent projections Edition. Today's paper provides an interesting view on whether one needs to reconstruct a compressively sensed (CS) signal before one can say anything about it: M. F. Duarte, M. A. Davenport, M. B. Wakin and R. G. Baraniuk, "Sparse signal detection from incoherent projections," Proc. ICASSP, 2006. One thing I really like about this paper is that of its 14 references, 8 are "preprints", and another is presented at the same conference. That shows timely work!

My one line summary of this paper is:
Non-adaptive sensing by random projection onto a low dimensional subspace might kill the recovery of the underlying signal, but it need not keep us from saying something useful about its composition.
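To illustrate the idea (a sketch under my own assumptions, not the authors' code): if we know the candidate signal \(s\), we can correlate the compressive measurements \(y = \Phi x\) directly against the projected template \(\Phi s\) and threshold the result, without ever reconstructing \(x\).

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 256, 64                       # ambient dimension, number of projections
s = rng.standard_normal(N)           # known template (hypothetical)
s /= np.linalg.norm(s)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # random sensing matrix

def detect(y, Phi, s, threshold):
    """Decide the signal is present if the compressive correlation is large."""
    stat = abs(y @ (Phi @ s))
    return stat > threshold

# measurements under the two hypotheses, with a little additive noise
y_present = Phi @ s + 0.05 * rng.standard_normal(M)
y_absent = 0.05 * rng.standard_normal(M)
```

Because random projections approximately preserve inner products, the statistic concentrates near \(\|s\|^2\) when the signal is present and near zero when it is absent.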

## Paper of the Day (Po'D): Single-channel and multi-channel sinusoidal audio coding using compressed sensing Edition

Hello, and welcome to Paper of the Day (Po'D): Single-channel and multi-channel sinusoidal audio coding using compressed sensing Edition. Today's paper presents the first audio coding system using compressed sensing (CS): A. Griffin, T. Hirvonen, C. Tzagkarakis, A. Mouchtaris, and P. Tsakalides, "Single-channel and multi-channel sinusoidal audio coding using compressed sensing," IEEE Trans. Audio, Speech, Lang. Process., vol. 19, pp. 1382-1395, July 2011. Also see its subreferences:
1. A. Griffin and P. Tsakalides, "Compressed sensing of audio signals using multiple sensors," in Proc. European Signal Process. Conf., (Lausanne, Switzerland), Aug. 2008.
2. A. Griffin, T. Hirvonen, A. Mouchtaris, and P. Tsakalides, "Encoding the sinusoidal model of an audio signal using compressed sensing," in Proc. IEEE Int. Conf. Multimedia Expo, (Cancun, Mexico), June 2009.
3. A. Griffin, C. Tzagkarakis, T. Hirvonen, A. Mouchtaris, and P. Tsakalides, "Exploiting the sparsity of the sinusoidal model using compressed sensing for audio coding," in Proc. SPARS'09, (St. Malo, France), Apr. 2009.
4. A. Griffin, T. Hirvonen, A. Mouchtaris, and P. Tsakalides, "Multichannel audio coding using sinusoidal modelling and compressed sensing," in Proc. European Signal Process. Conf., (Aalborg, Denmark), Aug. 2010.
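The premise these papers exploit is that a sinusoidal model of audio is sparse in a frequency basis, so its parameters can be coded from a few random measurements. A minimal illustration (my own sketch, not the authors' system): a two-partial signal whose frequencies align with DCT bins is exactly 2-sparse under an orthonormal DCT-II.

```python
import numpy as np

N = 64
n = np.arange(N)
# orthonormal DCT-II analysis matrix (the sparsifying basis assumed here)
C = np.sqrt(2.0 / N) * np.cos(np.pi * np.outer(np.arange(N), 2 * n + 1) / (2 * N))
C[0] /= np.sqrt(2)

# a two-partial "sinusoidal model" signal aligned with DCT bins 5 and 12
x = (np.cos(np.pi * 5 * (2 * n + 1) / (2 * N))
     + 0.5 * np.cos(np.pi * 12 * (2 * n + 1) / (2 * N)))

coeffs = C @ x
support = np.flatnonzero(np.abs(coeffs) > 1e-8)
# only two DCT coefficients are nonzero: the signal is 2-sparse in this basis
```

With such sparsity in hand, the CS machinery of the previous Po'D applies: a handful of incoherent projections of \(x\) suffice to encode those few coefficients.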

## Infographics

Good infographics present data, visualize relationships, and make surprising facts clear with style and flair, using the medium of the web browser to display images tens of thousands of pixels long. Here is a whole blog devoted to the technique, with great examples. I was recently asked my opinion of this newly created infographic on bad practices in science. Its creator is interested in feedback. My response is under the fold, but I encourage you to look at the graphic first and, if interested, leave a comment or email me directly for his address.

## Music genre recognition: Have the simplest problems even been solved yet?

Currently, with a heavy schedule of teaching, and preparing papers for submission to EUSIPCO and a journal special issue, I have little time to continue working on music genre recognition, e.g., assembling what I think would be a good database for testing music genre recognition. I also have not had time to more closely look at alternative datasets, as I have done for the Tzanetakis dataset. I have several thoughts about the topic though.

Through my review of the literature in this area, I feel that too often work tackles "music genre recognition" without having the faintest idea of what "genre" is and is not. A dataset is selected and assumed to be representative and valid. Many papers link together "novel" features with "novel" methods of classification, and provide little motivation for the choices made. Sometimes, justifications are made that don't make sense. The use of bags of frames of features (BFFs) is a spent topic. People don't listen that way. Feature integration offers some hope, but it still has a long way to go until we reach the high level descriptors that I find myself using when comparing styles and assigning labels.

If we want to solve complex problems, we first need to solve the simplest ones. In the work on music genre recognition so far, I have yet to see any results that convince me the simplest problems have been solved. For instance, an experiment I want to do soon will take a state-of-the-art system for music genre recognition, and see how it performs when I compress all input to the excessive levels used in Popular music. With such a transformation, does the genre change for human listeners? I don't think so. Classical will still be Classical, though with an in-your-face feel. If the system is really comparing stylistic aspects of the musical content of the audio signals, then such compression will have little effect. Similarly for other transformations, such as the bandwidth changes of AM radio, and the pops and hisses and subtle speed changes of 78s. I suspect, however, that the effects of these transformations on these systems will not be small.

To properly and convincingly solve the problem of automatic genre recognition, I think it must take high level models that can separate the mixed sources (find the guitar, drums, and bass for Rock), infer rhythms and beat patterns (determine the hi-hat pattern for Disco), transcribe the melody line and the harmonies (find the lack of parallel fifths in Baroque), listen to the lyrics and vocal styles (detect the topic and free singing style in Delta Blues), find the structure of the piece (find the verse-chorus-verse-chorus of Rock and Roll, or the 12 bars of 12-bar Blues), find and classify different audio effects (the spring reverberation of Reggae), and be able to link together styles historically (Hiphop grows from Reggae).

Bob L. Sturm, Associate Professor
Audio Analysis Lab
Aalborg University Copenhagen
A.C. Meyers Vænge 15
DK-2450 Copenhagen SV, Denmark
Email: bst_at_create.aau.dk