November 2010 Archives

I have run some experiments to compare the performance of CMP to MP and OMP in recovering compressively sensed sparse signals. Given a real length-\(N\) signal \(\vx\) with sparsity \(K\), I sense it using \(\mathbf{\Phi} \in \mathcal{R}^{M\times N}\), which in my case has elements iid zero mean Gaussian, and each column is made unit norm. The \(K\)-sparse signal has \(K\) of its elements (uniformly selected) selected from a standard normal distribution. I sense this signal and obtain the observation \(\vy = \mathbf{\Phi} \vx\). Using a greedy algorithm (MP, OMP, CMP) I attempt to recover \(\vx\) by selecting columns of \(\mathbf{\Phi}\) to "explain" \(\vy\). I stop this process when either \(M\) atoms have been selected, or \(||\vy - \mathbf{\Phi} \hat \vx||_2^2/||\vy||_2^2 < 10^{-10}\) where \(\hat\vx\) is the reconstruction. In my experiments, \(N = 250, M = 50\), and the sparsity \(K \in [1, 30]\). I test each algorithm for 1000 different realizations of \(\vx\) and \(\mathbf{\Phi}\). For CMP, I cycle until either the difference between successive SRRs is less than 0.001, or we have cycled through all atoms 10 times.

Below we see the probability of "exact recovery" as a function of sparsity. I define "exact recovery" when \(||\vx - \hat\vx||_2^2 \le 10^{-10}\). As expected, MP is terrible. OMP does well, and CMP based on MP does stunningly well. (Stunning perhaps because I am not impartial.)

RecoveryProbability.jpg Though CMP slightly under-performs OMP, we see below that it slightly out-performs OMP when it comes to the mean \(\ell_2\) reconstruction error. Why then MP beats both of those after 0.45 is discomforting, but maybe not entirely unusual since its weights are exponentially decreasing.

ell2normerror.jpg By the way, once I am finished running some more tests I will post my code. :)

Call for Music

| No Comments
8th Sound and Music Computing Conference 06-09 July 2011, Padova, Italy
Department of Information Engineering, University of Padova
Conservatorio Cesare Pollini, Padova

Call for MUSIC!

Turkey Research!

| No Comments
With the snow falling outside my warm office, and my novelty mug filled with cocoa served piping hot from the Wittenborg 7100, and thoughts of what I am missing back in the US this Thanksgiving day, I was all set to digest a days-old paper from a reputable resource, and make a new entry to my collection of annotated references. This paper turned out to be a real turkey though, revealing the authors to be completely ignorant of work from the past 18 years, not to mention fundamental concepts of music. This paper does provide an exceptional example of poor research, and of how peer review can go so wrong. And this inspires me to write a list of the things in research for which I am thankful.

  1. I am thankful for the research that contains a proper review of the literature.
  2. I am thankful for the research that avoids making blanket assumptions and unusual categorizations.
  3. I am thankful for the research that uses the word "optimal" and defines its context.
  4. I am thankful for the research that properly justifies the decisions made.
  5. I am thankful for the research that clearly states the procedures, algorithms, and the various settings used in the experiments.
  6. I am thankful for the research by those who work hard to understand fundamental concepts in the areas they work, such as, in audio and music signal processing, concepts about acoustics and music like modes and overtones, harmonicity, timbre, tonality, and polyphony.
  7. I am so thankful for the research that contains a proper review of the literature.
  8. I am thankful for the research that does not make claims that have been clearly disproved in the literature nearly 18 years prior, and well-known since.
  9. I am thankful for the research by authors that does not cite their own submitted work as if it they are the first to study the problem at hand when there is at least 10 years of work on that same problem.
  10. I am thankful for the research that includes meaningful figures I can read without a microscope.
  11. I am thankful for the reviewers who spend time not only carefully reviewing the submitted work, but also familiarizing themselves with the literature enough to know when work is strong.
  12. And most of all, I am thankful for the research that is a clear contribution to our pursuit of knowledge.
Happy Thanksgiving!

Sparsity-Spark-Coherence Conundrum

| No Comments
Say I have a discrete signal of length \(K\), which we assume to be a perfect square. And assume its sparsity is \(K\) in a Dirac basis. This means the signal is not sparse at all with respect to the time domain. Assume as well that the sparsity of its Fourier transform is also \(K\), which means that the signal is not sparse at all with respect to the frequency domain.

Question 1: Do such signals exist?

20101125: Here is my code that you can use to reproduce the figures in our paper! Thanks for the motivation Igor! :)

Hello, and welcome to Paper of the Day (Po'D): Cyclic Matching Pursuit with a Time-frequency Dictionary Edition. Today's paper is my own, just finalized for the proceedings of Asilomar 2010, and presented nary more than a few weeks ago: B. L. Sturm and M. G. Christensen, "Cyclic matching pursuits with multiscale time-frequency dictionaries," in Proc. Asilomar Conf. Signals, Systems, and Computers, (Pacific Grove, CA), Nov. 2010. I have discussed Cyclic Matching Pursuit (CMP) in a previous Po'D.

In the interest of extreme brevity, here is my one line description of the work in this paper:
Fusing cyclic minimization with MP using a multiscale time-frequency dictionary can produce signal models superior to those produced by more computationally complex greedy iterative descent approaches, such as OMP, with respect to the norm residual.

Sensing Matrix Cathedral Window?

| 1 Comment
On Friday, I gave a seminar at the Fraunhofer Institute in Bonn, Germany (slides here), which gave me the opportunity to visit Cologne just 20 minutes north. This included a visit to the great Cologne Cathedral, where I found an interesting stained glass window:

IMG_1631.jpg This work from 2007 by German artist Gerhard Richter reminds me of (the transpose of a) random sensing matrix. It is almost described so on his Wikipedia page:
It is an 113 square metre abstract collage of 11,500 pixel-like squares in 72 colors, randomly arranged by computer (with some symmetry), reminiscent of his 1974 painting "4096 colours". Richter designed the window for free. Cardinal Joachim Meisner did not attend the window's unveiling; he had preferred a figurative representation of 20th century Christian martyrs and said that Richter's window would fit better in a mosque or prayer house.
I wonder if Cardinal Meisner prefers instead a Bernoulli construction? Or maybe he insists on directly minimizing the sparsity of the solution? Are greedy methods evidence of an excessive culture speeding toward its demise?
Hello, and welcome to Poster of the Day (Po'D): Signal Synthesis with Conic Matching Pursuit Edition. With this poster, we add yet another greedy sparse approximation method with the acronym CMP: First, there was Cyclic MP; then there was Complementary MP. Then there was Consensus MP. And now we have Conic MP: J. Carmichael, "Signal Synthesis with Conic Matching Pursuit," presented somewhere and sometime, perhaps at the Alumni Reunion and Celebration of Applied Mathematics of at the University of Washington?
Just received this too! It is a CFP day! This topic is near and dear to my heart in fact, since in my past I developed a fondness for reviewing US Code, case testimony, judicial decisions, and contradictory precedents: B. L. Sturm, "Concatenative sound synthesis and intellectual property: An analysis of the legal issues surrounding the synthesis of novel sounds from copyright-protected work," J. New Music Research, vol. 35, no. 1, pp. 23-33, 2006. For every Lawrence Lessig and "creative commons" advocate, there are hundreds of lawyers working for MPAA and RIAA convincing education institutions to report their students to the police, or countries to shut families out from information based on little proof.

Call for Papers: Rethink Music Conference

In connection with the "Rethink Music" event, the Berkman Center and Journal of Sports and Entertainment Law (JSEL) seek papers that propose changes to intellectual property or other laws that relate to the creation, production, distribution, performance, or other use of musical works.

Papers will be selected from among those submitted by a committee designated by the staff of JSEL and/or by staff and/or Faculty Directors of the Berkman Center for Internet & Society and/or their designees (including representatives from other academic institutions and/or others with expertise in music or intellectual property business and/or policy).

Up to four (4) authors whose papers are selected will be invited to present their work at the "Rethink Music" symposium. Berklee College of Music will provide economy airfare and hotel accommodations for any authors invited to present at the symposium. Up to two (2) papers will be considered for publication in JSEL. The number of authors selected to present at the symposium and the number of authors to whom prizes are awarded will depend upon the quantity and quality of eligible papers submitted. The Berkman Center and JSEL reserve the right, in their sole discretion, to decide that no papers will be presented at the Rethink Music symposium and/or published in JSEL.

The Berkman Center may contact authors of papers not selected for publication or presentation at the symposium regarding publication via another medium - e.g., online, via the Berkman Center website.

Just received the following CFP.

In Conjunction with the IEEE International Conference on Multimedia and Expo (ICME) Barcelona, Spain, July 11-15, 2011

Date of the Workshop: July 11/15, 2011
Paper Submission: February 20, 2011

What is happening here?

| No Comments
Quiz time! Given the discussion over the past few days with regards to estimating the tuning frequency of an equal temperament system, what is going on in the signal below?

what.jpg Answer is below the fold!
Continuing where I left off a few days ago, let's look at how well this technique works at determining the tuning frequency of some sequence of notes in a 12-tone equal tempered system. I have synthesized a whole bunch of random pitches (tuning frequency \(f_t = 442.33\) Hz) with overtones (not perfect harmonics) and noise (additive Gaussian white) and what not to create a complex polyphonic texture I title untitled #2 (Nov. 15, 2010). The sonogram of this composition is seen at top in the figure below.

Tuning frequency determination

| No Comments
In the construction of harmonic pitch class profiles (HPCP), detailed in Chapter 3 of E. Gómez, "Tonal description of music audio signals," Ph. D. thesis, Music Technology Group, Univ. Pompeu Fabra, Barcelona, Spain, 2006, I am looking at the estimation of the reference tuning of instruments in a piece of music, which is the pitch everyone tunes to in equal temperament. In the USA this is frequently 440 Hz. In Europe it can be as high as 444 Hz. In Bach's time, it was 415 Hz! Anyhow, this estimation is important because ultimately we need to fold each overtone series into a pitch class. Gómez describes the process, but in a way that confuses me. So here I take another route.

"I'm a math person."

| No Comments
Apparently a vacation cruise turned into a nightmare for 4,500 passengers after a fire rendered the ship useless at sea for three days. Navy helicopters flew in rations of spam and pop tarts, and the toilets were not working. So the cruise line offered all passengers a refund and a free cruise. Of this offer one of the passengers said, "I'm a math person. What are the chances this would happen twice to the same person? I'm going with the odds. We're from Vegas. We're coming back." The question is this:

Should I buy a ticket for the cruise on which he will travel because then my chance of being waylaid will be smaller with him on board than if not?
Math people. From Vegas. Gambler's fallacy win!

Fail_Boat.jpg By the way, my wife smiled at this story and said it reminds her of when she used to prefer to park her car next to other cars that had visible accidental damage, because they would be the least likely to be smashed again.

Conference craze 2011

| No Comments
2011 is going to be a crowded year in terms of music, sound, interaction, and/or motor control related conferences. Here are just a few:
Continuing with my reproduction of the work in M. Casey, C. Rhodes, and M. Slaney, "Analysis of minimum distances in high-dimensional musical spaces," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, pp. 1015-1028, July 2008, today I inspected the squared distances between shingles of songs related and unrelated. Before I proceed, however, I must say that I have had to make many guesses at how Casey et al. calculate their features:
  1. I calculate log-frequency cepstral coefficients (LFCCs) using rectangular (or triangular) bandpass filters weighted inversely to their area. (Casey et al. do not mention what weighting they use. I just adopted what is done in Slaney's Auditory Toolbox for calculating MFCCs.)
  2. In the 20 coefficients, I include the zeroth DCT coefficient.
  3. To calculate the "power" of an LFCC shingle (which Casey et al. use to discard those that come from quiet moments), I sum all zeroth coefficient magnitudes. (Do Casey et al. use the squared norm of a shingle?)
Continuing from yesterday, Casey et al. 2008 wish to find the probability distribution of the minimum among the squared distances of \(N\) shingle pairs from unrelated songs. With this we can say two songs are related when the minimum squared distance among their shingle pairs is inferior.
After having reproduced the log-frequency cepstral coefficient and pitch-class profile shingle extraction technique of Casey et al. 2008 (M. Casey, C. Rhodes, and M. Slaney, "Analysis of minimum distances in high-dimensional musical spaces," IEEE Trans. Audio, Speech, Lang. Process., vol. 16, pp. 1015-1028, July 2008, Po'D here), it is time for me to understand the derivation of their decision criteria for audio shingles being significantly close in a Euclidean sense.
Hello, and welcome to the Paper of the Day (Po'D): Sparsity and Classifiability Edition. Todays paper comes from just last month: M. Moussalam, T. Fillon, G. Richard and L. Daudet, "How Sparsely Can a Signal be Approximated while Keeping its Class Identity?", in Proc. of Music and Machine Learning Workshop at ACM Multimedia, Firenze, Italy, Oct. 2010.
Hello, and welcome to Paper of the Day (Po'D): Music Cover Song Identification Edition, pt. 5 (of how many more I don't know). We continue today with a recent paper: T. E. Ahonen, "Combining chroma features for cover version identification," in Proc. Int. Soc. Music Info. Retrieval, Utrecht, Netherlands, Aug. 2010.
Hello, and welcome to Paper of the Day (Po'D): Clustering Beat-chroma Patterns in Music Databases Edition. Today's paper offers a break from the cover-song-centric Po'Ds we have been having lately; but it is still related: T. Bertin-Mahieux, R. J. Weiss, and D. P. W. Ellis, "Clustering beat-chroma patterns in a large music database," in Proc. Int. Symp. Music Info. Retrieval, (Utrecht, Netherlands), Aug. 2010.
Hello, and welcome to Paper of the Day (Po'D): Music Cover Song Identification Edition, pt. 4 (of many more). We continue today with a novel approach: J .Serrà, H. Kantz, and R. G. Andrzejak, "Model-based cover song detection via threshold autoregressive forecasts," in Proc. ACM Multimedia Int. Workshop on Machine Learning and Music, Firenze, Italy, Oct. 2010.
Hello, and welcome to Paper of the Day (Po'D): Music Cover Song Identification Edition, pt. 3. We continue today with a paper mentioned in the post from a few days ago: J. Serrà, E. Gómez, and P. Herrera, "Audio cover song identification and similarity: Background, approaches, evaluation, and beyond," in Advances in Music Information Retrieval (Z. Ras and A. Wieczorkowska, eds.), vol. 274 of Studies in Computational Intelligence, pp. 307-332, Springer, 2010.
We are holding a free PhD course at Aalborg University (AAU), Aalborg, Denmark, during December 6 - 10, during which we will review past, current, and future research in audio and music analysis, cognition, and synthesis. The teachers are all from AAU The course takes its starting point in music perception and cognition, i.e., how humans perceive and understand music and computational models of this process. Then, the problem of characterizing recorded audio signals in general and music signals in particular in terms of physically or perceptually meaningful parameters is considered with emphasis on spectral and pitch estimation. Finally, parametric and/or interactive methods for re-synthesis and 3D spatialization of sounds are presented. The course will cover the following topics:
  • Music perception
  • Music cognition and computational modeling thereof
  • Modeling of audio signals
  • Single- and multi-pitch estimation
  • Advanced audio synthesis techniques
  • Spatial (3D) synthesis
  • Music performance and interaction
This will be a fun week filled with a variety of demonstrations and exercises. The deadline to register is November 15, 2010.

Blog Roll

About this Archive

This page is an archive of entries from November 2010 listed from newest to oldest.

October 2010 is the previous archive.

December 2010 is the next archive.

Find recent content on the main index or look in the archives to find all content.