*20101125: Here is my code that you can use to reproduce the figures in our paper! Thanks for the motivation Igor! :)*

Hello, and welcome to Paper of the Day (Po'D): Cyclic Matching Pursuit with a Time-frequency Dictionary Edition. Today's paper is my own, just finalized for the proceedings of Asilomar 2010 and presented only a few weeks ago: B. L. Sturm and M. G. Christensen, "Cyclic matching pursuits with multiscale time-frequency dictionaries," in Proc. Asilomar Conf. Signals, Systems, and Computers, (Pacific Grove, CA), Nov. 2010. I have discussed Cyclic Matching Pursuit (CMP) in a previous Po'D.

In the interest of extreme brevity, here is my one-line description of the work in this paper:

Fusing cyclic minimization with MP over a multiscale time-frequency dictionary can produce signal models that are superior, with respect to the residual norm, to those produced by more computationally complex greedy iterative descent approaches, such as OMP.

The idea is superbly simple: alternate between augmenting the model, and refining its elements.
Given some signal \(\vx\in \mathcal{C}^K\) and a dictionary \(\mathcal{D} := \{ \vd_i \in \mathcal{C}^K : || \vd_i || = 1\}\), suppose we already have
an \(l\)-order representation \(\mathcal{X}_{l} = \{\MH(l), \va(l), \vr(l)\}\)
where \(\vr(l) := \vx - \MH(l)\va(l)\) is the residual,
\(\MH(l) = [\vh_0 | \vh_1 | \cdots | \vh_{l-1}]\) is our representation basis,
and \(\va(l) \in \mathcal{C}^l\) is a vector of our weights.
As in MP, we augment this representation by the rule
$$
\mathcal{X}_{l+1} = \left \{
\begin{array}{@{}r@{\;}l@{}}
\MH(l+1) = & [ \MH(l) | \vh_{l} ], \\
\va(l+1) = & \left [\va^T(l), \langle \vr(l), \vh_{l} \rangle \right ]^T, \\
\vr(l+1) = & \vx - \MH(l+1)\va(l+1)
\end{array}\right \}
$$
using the MP atom selection criterion (and Euclidean norm)
$$
\vh_{l} = \arg \min_{\vd \in \mathcal{D}} || \vr(l) - \langle \vr(l), \vd \rangle \vd ||
= \arg \max_{\vd \in \mathcal{D}} | \langle \vr(l), \vd \rangle |
$$
which guarantees that the residual norm is non-increasing.
(When the dictionary is complete, this also guarantees that the residual norm is monotonically decreasing and converges to 0.)
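To make the augmentation and selection rules above concrete, here is a minimal sketch in NumPy. (The actual code released with this post is MATLAB; the function and variable names below are my own illustration, and a generic unit-norm dictionary matrix stands in for the multiscale time-frequency dictionary.)

```python
import numpy as np

def mp(x, D, n_atoms):
    """Plain matching pursuit.  At each step, select the unit-norm
    dictionary column maximizing |<r, d>|, append it and its weight
    <r, d> to the model, and deflate the residual."""
    r = x.copy()
    idx, a = [], []                    # selected columns and their weights
    for _ in range(n_atoms):
        c = D.conj().T @ r             # correlations <r, d> for every atom
        k = int(np.argmax(np.abs(c)))  # MP selection criterion
        idx.append(k)
        a.append(c[k])
        r = r - c[k] * D[:, k]         # remove projection onto chosen atom
    return idx, np.array(a), r
```

Since each step subtracts the orthogonal projection of the residual onto the selected atom, the residual norm is non-increasing by construction.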

With this augmented model, we now refine the atoms in the following way. In the first refinement step for the \((i+1)\)th atom (\(0 \le i \le l\)) of the representation basis, which is \(\vh_i \in \mathcal{H}_l^{(0)}\), we replace \(\vh_i\) (and its weight) using MP but with a different residual: $$ \vh_i^{(1)} = \arg \min_{\vd \in \mathcal{D}} \left | \left | \vr_{l\backslash i} - \langle \vr_{l\backslash i}, \vd \rangle \vd \right | \right | $$ where we define $$\begin{align} \vr_{l\backslash i} & := \vx - \sum_{m=0}^{i-1} a_m^{(1)} \vh_m^{(1)} - \sum_{m=i+1}^{l} a_m^{(0)} \vh_m^{(0)} \\ & = \vr_{l\backslash i-1} + a_i^{(0)} \vh_i^{(0)}. \end{align}$$ We do the same for all of the atoms in \(\mathcal{H}_l^{(0)}\) to produce \(\mathcal{H}_l^{(1)}\). The worst we can do in any refinement step is to reselect the same atom, and thus we are guaranteed that the residual norm will never increase over any number of refinement cycles, and so we can refine our \((l+1)\)th-order model until some stopping criterion is met. Then we augment the model with a new atom and repeat the refinement cycle. It is that easy! (My code is below.)
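The refinement cycle can be sketched the same way. (Again, this is a hedged NumPy illustration rather than the MATLAB implementation released with the post; `cmp_refine` and its arguments are names of my own choosing.)

```python
import numpy as np

def cmp_refine(x, D, idx, a, n_cycles=3):
    """CMP refinement: for each slot i in turn, add atom i's
    contribution back into the residual, reselect the best atom for
    that slot by the MP criterion, and update its weight.  The worst
    case is reselecting the incumbent atom, so the residual norm
    never increases."""
    idx = list(idx)
    a = np.array(a, dtype=D.dtype)
    r = x - D[:, idx] @ a                    # residual of the current model
    for _ in range(n_cycles):
        for i in range(len(idx)):
            r_wo = r + a[i] * D[:, idx[i]]   # residual with atom i removed
            c = D.conj().T @ r_wo            # correlations with all atoms
            k = int(np.argmax(np.abs(c)))    # refined atom for slot i
            idx[i], a[i] = k, c[k]
            r = r_wo - c[k] * D[:, k]        # put refined atom back in
    return idx, a, r
```

Because the optimal weight for the reselected slot is at least as good as the old weight for the old atom, each slot update can only keep or lower the residual norm, which is the guarantee described above.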

Above we see, for four different signals, how the residual energy decays for several different greedy iterative descent algorithms using the same multiscale time-frequency dictionary. (I have written about LoCOMP before. OLS is orthogonal least squares, also known as "optimized orthogonal MP"; and COLS is a cyclic version of OLS that we also describe in this paper.)
CMP does surprisingly well, surpassing the more computationally complex OMP. COLS does the best of all of them with respect to the residual norm, but has the highest computational complexity. Of course, we are not yet at the scale of audio signals; but it is only a matter of time before we implement CMP within the MPTK framework.

Get my MATLAB code here.
