Recently in Audio Signals Category

The Sonification Handbook

For those who have not yet heard: The Sonification Handbook, edited by Thomas Hermann, Andy Hunt, and John G. Neuhoff, has been published. Even better, it is freely available for download here!

SMC 2012 in Copenhagen!!

9th Sound and Music Computing Conference, 12-14 July 2012
Medialogy section, Department of Architecture, Design and Media Technology, Aalborg University Copenhagen

The SMC Conference is the forum for international exchanges around the
core interdisciplinary topics of Sound and Music Computing,
and features workshops, lectures, posters, demos, concerts, sound installations, and
satellite events. The SMC Summer School, which takes place just before the
conference, aims at giving young researchers the opportunity to
interactively learn about core topics in this interdisciplinary field from experts,
and to build a network of international contacts.
The specific theme of SMC 2012 is "Illusions", and
that of the SMC Summer School is "Multimodality".

================Important dates=================
Deadline for submissions of music and sound installations: Friday, February 3, 2012
Deadline for paper submissions: Monday 2 April, 2012
Notification of music acceptances: Friday, March 16, 2012
Deadline for applications to the Summer School: Friday March 30, 2012
Notification of acceptance to Summer School: Monday April 16, 2012
Deadline for submission of final music and sound installation materials: Friday, April 27, 2012
Notification of paper acceptances: Wednesday 2 May, 2012
Deadline for submission of camera-ready papers: Monday 4 June, 2012
SMC Summer School: Sunday 8 - Wednesday morning 11 July, 2012
SMC Workshops: Wednesday afternoon 11 July, 2012
SMC 2012: Thursday 12 - Saturday 14 July, 2012

SMC 2012 will cover topics that lie at the core of Sound and Music Computing research and creative exploration.
We broadly group these into:
  - processing sound and music data
  - modeling and understanding sound and music data
  - interfaces for sound and music creation
  - music creation and performance with established and novel hardware and software technologies

================Call for papers==================
SMC 2012 will include paper presentations as both lectures and
posters/demos. We invite submissions examining all the core areas of the Sound
and Music Computing field. Submissions related to the theme "Illusions" are especially encouraged.
All submissions will be peer-reviewed according to their novelty, technical content, presentation, and
contribution to the overall balance of topics represented at the
conference. Paper submissions should have a maximum of 8 pages
including figures and references, and a length of 6 pages is strongly
encouraged. Accepted papers will be designated to be presented either
as posters/demos or as lectures. More details are available at

================Call for music works and sound installations==================
SMC 2012 will include four curated concerts addressing the conference topic "Illusions". We invite submissions of original compositions created for acoustic instruments and electronics, novel instruments and interfaces, music robots, and speakers as sound objects. Submissions of sound installations are also encouraged. See curatorial statements and call specifics at:

SPARS 2011, day 4

The fourth and final day of SPARS 2011 served up two plenaries by two prodigious researchers: Joel Tropp and Stephen Wright. At the beginning of his talk, Tropp asked who in the room knows how MATLAB computes the SVD. Only a few out of about 200 raised their hands, and a few more gestured that they kind of knew. The problem is that the methods we use today are treated as black boxes, but are based on extremely optimized classical methods that are incapable of working with massive matrices (billions by billions and up). So, we need better tools. He presented his work on computing the SVD with a randomized algorithm, which at first sounds scarily inaccurate, but proves to be extremely effective at a much reduced computational cost.
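To make the idea less scary, here is a minimal sketch of a randomized SVD in Python with NumPy. It follows the generic "random range finder, then small exact SVD" recipe as I understand it; it is not necessarily what Tropp presented, and the function name, the oversampling amount, and the number of power iterations are my own assumptions.

```python
import numpy as np

def randomized_svd(A, rank, oversample=10, n_power_iter=2, seed=0):
    """Approximate the top `rank` singular triplets of A (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, min(m, n))
    # Sample the range of A with a Gaussian test matrix.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)
    # A few power iterations help when the singular values decay slowly.
    for _ in range(n_power_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)
    # Project A onto the small subspace and compute an exact SVD there.
    B = Q.T @ A  # shape (k, n), much smaller than A when k << m
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank, :]
```

The expensive classical SVD is only applied to the small matrix `B`; the large matrix is touched only through a handful of matrix products, which is where the savings would come from.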

In the last plenary, Wright presented a lot of work on state-of-the-art methods for regularized optimization. At the beginning, he showed some fantastic pictures that he called an "Atlas of the Null Space," which showed where solutions to min l1 are the same as min l0. His talk centered on the message that, though we talk a lot about exact solutions, or sparsest representations, most applications in the real world only need good algorithms that give the correct support before the whole solution. The trick is to determine when to stop an algorithm, and post-process the results to find the better solution.
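As a rough illustration of the "stop early, then post-process" message, here is a small Python sketch. The choice of ISTA (iterative soft thresholding) as the l1 solver is my own assumption and not something taken from Wright's talk; the function name and parameters are hypothetical. It runs a fixed number of iterations, reads off the support, and then refits by ordinary least squares on that support.

```python
import numpy as np

def ista_then_debias(A, y, lam, n_iter=100):
    """Fixed-length ISTA for l1-regularized least squares, then a refit on the support."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - step * (A.T @ (A @ x - y))  # gradient step on 0.5*||Ax - y||^2
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    support = np.flatnonzero(x)
    x_debiased = np.zeros_like(x)
    if support.size:
        # Post-process: least squares restricted to the detected support.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        x_debiased[support] = coef
    return x, x_debiased, support
```

The refit cannot fit the data worse than the ISTA iterate on the same support, since that iterate is itself one candidate for the least-squares problem. Whether the support is the correct one is, of course, the hard part.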

In between these talks, there were plenty of others, discussing various items of interest in dictionary learning and audio inpainting (Po'D coming soon), and several posters, one of which was by CRISSP reader Graham Coleman. He presented his novel work applying l1 minimization of sound feature mixtures to drive concatenative sound synthesis, or musaicing. (I have discussed an earlier version of this work here.) Coleman's approach appears to be the next generation of concatenative synthesis.

All in all, this workshop was an excellent use of my time and money. Its duration was just right: after the last session I really felt as if my fuel tank was completely full. The organizers did an extremely nice job of selecting plenary speakers, assembling a wide range of quality work, and finding an accommodating venue with helpful staff. I even heard that the committee was able to raise enough funds so that many of the student participants had their accommodations paid for. I am really looking forward to the 2013 edition of SPARS (or CoSPARS).

CMP in MPTK: Third Results

In a previous entry, I compared our results with those produced by my own implementation of CMP in MATLAB --- which did not suffer from the bug because it computes the optimal amplitude and phases in a slow way with matrix inverses. Now, with the new corrected code, I have produced the following results. Just for comparison, here are the residual energy decays of my previous experiments, detailed in my paper on CMP with time-frequency dictionaries.

[Figure: dsfdsf.jpg]

Now, with the corrections, I observe the decays below. The "MPold" decay is that produced by the uncorrected MPTK. "MP" shows that of the new code. Only in Attack and Sine do we see much difference; and at times in Sine the previous version of MPTK beats the corrected version. (Such is the behavior of greedy algorithms. I will write a Po'D about this soon.) Anyhow, the decays of CMP-\(\ell\) (where \(\ell\) denotes the largest number of possible refinement cycles, though I suspend refinement when energyAfter/energyBefore > 0.999) comport with the decays I see in my MATLAB implementation (see above). So, now I am comfortable moving on.
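For readers who have not seen CMP, the refinement rule described above (at most \(\ell\) cycles per iteration, suspended when energyAfter/energyBefore > 0.999) can be sketched in a few lines of Python. This is only a toy version under my own simplifying assumptions: a real-valued dictionary `D` with unit-norm columns and a brute-force search. It is not the MPTK code, which works with Gabor atoms and their amplitudes and phases, and the names `cyclic_mp`, `max_cycles`, and `stop_ratio` are hypothetical.

```python
import numpy as np

def cyclic_mp(x, D, n_atoms, max_cycles=1, stop_ratio=0.999):
    """MP with cyclic refinement; D is real with unit-norm columns (toy sketch)."""
    residual = np.array(x, dtype=float)
    idx, amp = [], []
    for _ in range(n_atoms):
        # Standard MP step: pick the atom most correlated with the residual.
        c = D.T @ residual
        k = int(np.argmax(np.abs(c)))
        idx.append(k)
        amp.append(c[k])
        residual = residual - c[k] * D[:, k]
        # Refinement: revisit every atom of the model, at most max_cycles times.
        for _ in range(max_cycles):
            energy_before = residual @ residual
            for i in range(len(idx)):
                # Put atom i back into the residual, then re-select for that slot.
                r_i = residual + amp[i] * D[:, idx[i]]
                c = D.T @ r_i
                k = int(np.argmax(np.abs(c)))
                idx[i], amp[i] = k, c[k]
                residual = r_i - c[k] * D[:, k]
            energy_after = residual @ residual
            # Suspend refinement once a full cycle barely reduces the energy.
            if energy_before == 0 or energy_after / energy_before > stop_ratio:
                break
    return idx, np.array(amp), residual
```

Setting `max_cycles=0` gives plain MP. Each re-selection can only keep or lower the residual energy within an iteration, but that does not guarantee a better model than MP in the long run, in line with the behavior of greedy algorithms noted above.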

[Figure: CMPtests01.png]

Below we see the decays and cycle refinements for three different CMPs for these four signals. (Note the change in the y axes.) Bimodal appears to benefit the most in the short term from the refinement cycles, after which improvement is sporadic. The modeling of Sine has a flurry of improvements. It is interesting to note that as \(\ell\) increases, we do not necessarily see better models with respect to the residual energy. For instance, for Attack, the residual energy for CMP-1 beats the others.

[Figure: CMPtests01b.png]

And briefly back to the glockenspiel signal: below we see the decays and improvements using a multiscale Gabor dictionary (with atom scales up to 512 samples).

[Figure: glock2_energydecay.png]

I have solved the mystery that has pushed me for the past week into excruciatingly fun debugging sessions. Yes, I know I mentioned on June 9 that CMP was extremely easy to implement in MPTK. Then came second thoughts as to the behavior of the implementation. And there followed more observations, and rambling observations, and then the videos appeared. And then the music video appeared. Well, now here's another:

Don't Give Up

This is my life the past few days. And yet again, I think I have it cornered. The same thing happens for atoms at the Nyquist frequency. Now, how to fix it?

MPTK works!

In my experiments before, the MP reconstruction algorithm was hard clipping all values with magnitude greater than 1. So that is where the spikes come from. Oh, for F's sakes.

Today I have been experimenting with CMPTK and a real audio signal. With this larger signal, the energy errors that have plagued me this last week seem to be much rarer.

Below we see the residual energy decay of this example with MP and CMPTK using a dictionary of Gabor atoms (Gaussian window) of only two scales: 128/32 and 4906/64/8192 (scale/hop/FFTsize if different from scale). I run 200 iterations. CMP-\(l\) is implemented such that all representations at each order undergo at least one cycle. When \(l = 5\), more refinement cycles can be performed until the ratio of residual energies before and after a cycle is less than 1.002, or less than about 0.009 dB. I also plot in this graph the "cycle energy decrease," which is the ratio of the residual energy before and after the entire refinement at that iteration. We find a few large spikes of improvement. At the end of 200 iterations, the model produced by CMP has an error 2.2 dB lower than that produced by MP.
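The dB figures above are easy to check by hand, since an energy ratio \(\rho\) corresponds to \(10\log_{10}\rho\) dB. Here is a tiny Python sketch of the arithmetic; the helper names are my own.

```python
import math

def ratio_to_db(energy_ratio):
    """Convert a ratio of energies to decibels."""
    return 10.0 * math.log10(energy_ratio)

def db_to_ratio(db):
    """Convert decibels back to a ratio of energies."""
    return 10.0 ** (db / 10.0)

# Stopping threshold: a before/after energy ratio of 1.002.
print(round(ratio_to_db(1.002), 4))  # prints 0.0087

# The 2.2 dB gap between the MP and CMP models, as an energy ratio.
print(round(db_to_ratio(2.2), 2))    # prints 1.66
```

So the threshold of 1.002 is indeed about 0.009 dB, and a 2.2 dB difference means the MP residual carries roughly 1.66 times the energy of the CMP residual.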

CMP in MPTK: First Results

From June 7 to 17, I am visiting Rémi Gribonval and other colleagues at L'IRISA (Institut de recherche en informatique et systèmes aléatoires) in Rennes, France, where I am on a scientific mission supported by the French Ambassador to Denmark. My goals this first visit are twofold:
  1. implement cyclic MP (CMP) within the MPTK framework;
  2. empirically study its application to real audio and music signals.
Before this visit, CMP only existed in the MATLAB backwaters of a few esoterics; and there, it is only usable for small dictionaries, and only applied to short signals. We have now finished implementing CMP in MPTK, which required only simple modifications --- no doubt due to the well-thought-out architecture of the MPTK library. And now begins the more difficult process of determining the cost-benefit trade-offs for CMP and other variants. The plot below shows the first result.

[Figure: energydecay.jpg]

Stay tuned for many more results!

In the past few weeks, Pardis and I have been looking at "deep belief networks", and their application to learning features for audio and music classification: DBNs, and convolutional DBNs. I am still completely hazy on the details, but in this talk, Professor Ng provides an excellent overview of the power of such approaches. I think one reviewer's comment summarizes it nicely in one line:

Andrew Ng got bored of improving one algorithm so he decided to improve all algorithms at once...
On his course website at Stanford, Ng provides some tutorials.

These approaches with DBNs --- let the algorithm find the features that make sense with respect to some basic principle of economy (whether it be sparsity or energy) --- make me think about the recent opinion article by Malcolm Slaney, Does Content Matter? Of course content matters, since that is how we expert humans "like" something, e.g., giving a thumbs up on YouTube... I "liked" Ng's talk because of its content and delivery, and not because 52 other people liked it. (Who are the two people that "disliked" Ng's talk!? How could someone not like him?) We just aren't using the best features.

Also note the recent "resurgence" of neural networks!

CRISSP is a research group in ADMT at Aalborg University Copenhagen (AAU-KBH), Denmark.


  Bob L. Sturm
  Sofia Dahl
  Stefania Serafin

