CoSaMP and CoSaOMP

Przemek Dymarski writes,
I have noticed, that your COSAMP code exhibits an important difference from the original algorithm of Needell and Tropp: you recalculate K gains at the end of each iteration, but in the original algorithm the gains are left unchanged. Thus, you perform 2 pinv's per iteration and Needell and Tropp only one. Your version is in fact a variant od SP - the only difference is the number of vectors being merged (2K in SP, 3K in COSAMP). Of course your version is better (and slightly more costly) but is it still the COSAMP?
Przemek is absolutely correct!
My original description of CoSaMP was incomplete, so I have amended it. In particular, CoSaMP computes the new residual by $$\vr_{k+1} = \vu - \MPhi\MI_{\Omega_{k+1}}\vb$$ instead of $$\vr_{k+1} = \vu - \MPhi_{\Omega_{k+1}}^\dagger \vu.$$ In my code, which has already been twice corrected, I do the latter instead of the former. Thus, I have changed it, and provide the new code here: cosamp.m
function Sest = cosamp(Phi,u,K,tol,maxiterations)

Sest = zeros(size(Phi,2),1);
v = u;
t = 1;
numericalprecision = 1e-12;
T = [];
while (t <= maxiterations) && (norm(v)/norm(u) > tol)
y = abs(Phi'*v);
[vals,z] = sort(y,'descend');
Omega = find(y >= vals(2*K) & y > numericalprecision);
T = union(Omega,T);
b = pinv(Phi(:,T))*u;
[vals,z] = sort(abs(b),'descend');
Kgoodindices = (abs(b) >= vals(K) & abs(b) > numericalprecision);
T = T(Kgoodindices);
Sest = zeros(size(Phi,2),1);
b = b(Kgoodindices);
Sest(T) = b;
v = u - Phi(:,T)*b;
t = t+1;
end

With regards to the latter update of my code, this is nothing more than OMP; so I now call that algorithm CoSaOMP, and provide the code here: cosaomp.m
function Sest = cosaomp(Phi,u,K,tol,maxiterations)

Sest = zeros(size(Phi,2),1);
v = u;
t = 1;
numericalprecision = 1e-12;
T = [];
while (t <= maxiterations) && (norm(v)/norm(u) > tol)
y = abs(Phi'*v);
[vals,z] = sort(y,'descend');
Omega = find(y >= vals(2*K) & y > numericalprecision);
T = union(Omega,T);
b = pinv(Phi(:,T))*u;
[vals,z] = sort(abs(b),'descend');
Kgoodindices = (abs(b) >= vals(K) & abs(b) > numericalprecision);
T = T(Kgoodindices);
Sest = zeros(size(Phi,2),1);
phit = Phi(:,T);
b = pinv(phit)*u;
Sest(T) = b;
v = u - phit*b;
t = t+1;
end

I make bold the one major thing that is different between the two.

So, how do these two algorithms compare? I ran 100 independent trials using sparse vectors distributed Normally or Bernoulli in $$\{-1,1\}$$, sensed by matrices sampled from the uniform spherical ensemble. The ambient dimension is $$N=400$$, and I make sure all non-zero sparse vector elements have a magnitude of at least 1e-10. I set to 300 the maximum number of iterations of CoSaMP and CoSaOMP, and the stopping tolerance to 1e-5. I consider exact recovery to mean that the support set in the solution and original vector are equal. Here is the empirical phase transition of each algorithm for sparse vectors distributed Bernoulli.

Strange goings on up there. Here are the probabilities of exact recovery for five problem indeterminaces, as a function of sparsity, again for Bernoulli distributed signals. That jump there tells me something is not right with the CoSaOMP code. Maybe the Kgoodindices should be selected such that the set of dictionary vectors are linearly independent.

And here is the empirical phase transition of each algorithm for sparse vectors distributed Normal.

What the heck is that behavior? I saw something similar in my experiments in this article. But why on earth does taking more measurements and orthogonal projections not improve the phase transition? As above, here are the probabilities of exact recovery as a function of sparsity. Here we see a hard line limit for CoSaOMP.

What is the value of K in the parameters_

Bob L. Sturm, Associate Professor
Audio Analysis Lab
Aalborg University Copenhagen
A.C. Meyers Vænge 15
DK-2450 Copenahgen SV, Denmark
Email: bst_at_create.aau.dk