Hello, and welcome to the Paper of the Day (Po'D): Time-frequency Distributions Edition. Today's paper comes from Fall of 2009: S. Ghofrani and D. C. McLernon, "Auto-Wigner-Ville distribution via non-adaptive and adaptive signal decomposition", Signal Proc., vol. 89, no. 8, pp. 1540-1549, Aug. 2009.

When working with non-stationary signals, such as sampled audio, one is typically interested in their "time-frequency content." In other words, we can ask of a signal its frequency, or more generally its bandwidth, at a particular moment in time. Practically, this is useful for designing a parametric model of a signal. The authors of this paper propose estimating the instantaneous (and mean) frequency and bandwidth of a signal from what is called the "auto-Wigner-Ville distribution," which they compute in two ways: 1) from the Gabor transform; and 2) from a matching pursuit (MP) decomposition.

We can look at the Wigner-Ville distribution (WVD) of an analytic signal \(x(t)\) in two ways: 1) $$WV_x(t,\omega) := \int_{-\infty}^\infty \bigl [ x(t + \tau/2) x^*(t - \tau/2) \bigr ] e^{-j\omega \tau} d\tau$$ which is the Fourier transform, over the lag \(\tau\), of the product of \(x(t+\tau/2)\) with its conjugated and time-reversed version; and 2) $$WV_x(t,\omega) := \langle x(t+\tau/2), e^{j\omega\tau} x(t - \tau/2) \rangle$$ which is just the projection of \(x(t+\tau/2)\) onto a modulated and *time-reversed* version of itself. (The \(\tau/2\) part keeps the product conjugate-symmetric in \(\tau\), and consequently \(WV_x(t,\omega)\) is always real, but not always positive.) It has been shown that the WVD of a signal provides the most compact time-frequency description of its content, in that it has the least amount of smearing of "energy" in the time-frequency plane (I think from L. Cohen, "Time-frequency distributions -- A review," Proc. IEEE, vol. 77, pp. 941-981, July 1989). This is in contrast to, for instance, the short-time Fourier transform, where smearing in time is a by-product of windowing the data. The WVD does not use a window.

As the WVD involves the product of a signal with itself (it is quadratic), there exist what are called "cross-terms" that diminish its usefulness for time-frequency analysis, because they make the results hard to interpret. Consider a signal composed of several components: $$x(t) = \sum_{n=0}^\infty g_n(t)$$ Its WVD will be $$WV_x(t,\omega) = \sum_{n=0}^\infty WV_{g_n}(t,\omega) + \mathop{\sum \sum}_{m=0, n=0, m\ne n}^\infty WV_{g_n,g_m}(t,\omega)$$ where \(WV_{g_n,g_m}\) is the cross-WVD, i.e., the first definition above with \(x(t + \tau/2) x^*(t - \tau/2)\) replaced by \(g_n(t + \tau/2) g_m^*(t - \tau/2)\). The latter double sum thus contains the WVDs of the interactions between components. Keeping only the first sum results in the "auto-WVD" (AWVD). It is with this function that the authors derive closed-form expressions estimating the instantaneous (and mean) frequency and bandwidth of a signal.
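To make the split into auto-terms and cross-terms concrete, here is a minimal numerical sketch in Python with NumPy. It is not the authors' code: the discretization (integer lags, so that frequency bin \(k\) corresponds to \(k/(2N)\) cycles per sample), the function names `cross_wvd` and `gaussian_atom`, and the two test components are all assumptions made for illustration.

```python
import numpy as np

def cross_wvd(x, y):
    """Discrete cross-WVD, W[n, k] = sum_m x[n+m] * conj(y[n-m]) * exp(-2j*pi*k*m/N).

    Row n is time (samples); column k corresponds to k / (2N) cycles per sample,
    because the lag between the two factors is 2m samples.
    """
    x = np.asarray(x, dtype=complex)
    y = np.asarray(y, dtype=complex)
    N = len(x)
    W = np.zeros((N, N), dtype=complex)
    for n in range(N):
        mmax = min(n, N - 1 - n)               # largest lag that stays inside the signal
        m = np.arange(-mmax, mmax + 1)
        kernel = np.zeros(N, dtype=complex)
        kernel[m % N] = x[n + m] * np.conj(y[n - m])
        W[n] = np.fft.fft(kernel)              # Fourier transform over the lag
    return W

def gaussian_atom(N, center, freq, sigma):
    """Complex Gaussian-windowed sinusoid; freq is in cycles per sample."""
    n = np.arange(N)
    return np.exp(-0.5 * ((n - center) / sigma) ** 2) * np.exp(2j * np.pi * freq * n)

N = 128
g0 = gaussian_atom(N, center=40, freq=0.10, sigma=8.0)
g1 = gaussian_atom(N, center=88, freq=0.30, sigma=8.0)

full = cross_wvd(g0 + g1, g0 + g1)                    # WVD of the two-component signal
awvd = (cross_wvd(g0, g0) + cross_wvd(g1, g1)).real   # auto-terms only
cross = (cross_wvd(g0, g1) + cross_wvd(g1, g0)).real  # interaction between the components

print("max imaginary part of WVD:", np.abs(full.imag).max())
print("min of full WVD:", full.real.min())
print("min of AWVD:", awvd.min())
```

In this toy case the full WVD equals `awvd + cross` up to rounding. The cross-term sits midway between the two components in time and frequency and oscillates between large positive and negative values, while the AWVD of these Gaussian components stays non-negative apart from small edge effects.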

The first step, given a signal, is to find its unknown components with which to calculate the AWVD. The authors use two methods: 1) the Gabor transform; and 2) MP with a dictionary of Gabor atoms. The first method produces an extremely redundant signal expansion, which we can sample; we then add together the WVDs of the modulated and shifted synthesis windows, weighted by the expansion coefficients. With the second method, we get an extremely sparse signal expansion with which we can do the same thing (which we call "wivigrams"). In the former case, we are restricted to one analysis window size; but in the latter case, we can use a dictionary of several resolutions. However, with MP we must choose when to stop the decomposition, i.e., what the order of the model is. The authors appear to limit the order to around 32 atoms.
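The MP route can be sketched as follows, again in Python with NumPy and again as an illustration rather than the authors' implementation. The dictionary layout (`sigmas`, `time_step`, `n_freqs`) and all function names are hypothetical. The AWVD is built from the closed-form WVD of a unit-energy Gaussian atom, \(2\exp\bigl(-(t-u)^2/\sigma^2 - \sigma^2(\omega-\xi)^2\bigr)\), which follows from the integral definition above in continuous time; applying it to sampled atoms is an approximation, with \(\omega\) in radians per sample.

```python
import numpy as np

def gabor_atom(N, u, xi, sigma):
    """Unit-norm discrete Gabor atom: Gaussian at time u, frequency xi (rad/sample), scale sigma."""
    n = np.arange(N)
    g = np.exp(-0.5 * ((n - u) / sigma) ** 2) * np.exp(1j * xi * n)
    return g / np.linalg.norm(g)

def build_dictionary(N, sigmas=(4.0, 16.0), time_step=4, n_freqs=16):
    """Hypothetical multi-scale dictionary; returns atoms as columns plus their (u, xi, sigma)."""
    params = [(u, xi, s)
              for s in sigmas
              for u in range(0, N, time_step)
              for xi in np.pi * np.arange(n_freqs) / n_freqs]
    D = np.stack([gabor_atom(N, *p) for p in params], axis=1)
    return D, params

def matching_pursuit(x, D, order):
    """Plain MP: repeatedly pick the atom most correlated with the residual."""
    residual = x.astype(complex)
    picks = []
    for _ in range(order):
        corr = D.conj().T @ residual           # inner products <residual, atom>
        k = int(np.argmax(np.abs(corr)))
        picks.append((k, corr[k]))
        residual = residual - corr[k] * D[:, k]
    return picks, residual

def awvd_from_atoms(picks, params, times, omegas):
    """Sum the closed-form WVDs of the selected atoms, weighted by |coefficient|^2 (no cross-terms)."""
    T, W = np.meshgrid(times, omegas, indexing="ij")
    out = np.zeros_like(T, dtype=float)
    for k, c in picks:
        u, xi, s = params[k]
        out += abs(c) ** 2 * 2.0 * np.exp(-((T - u) / s) ** 2 - (s * (W - xi)) ** 2)
    return out

N = 128
D, params = build_dictionary(N)
# Toy signal made of two atoms that are in the dictionary, at different scales.
x = 3.0 * gabor_atom(N, 48, np.pi / 4, 16.0) + 2.0 * gabor_atom(N, 96, 5 * np.pi / 8, 4.0)

picks, residual = matching_pursuit(x, D, order=2)
awvd = awvd_from_atoms(picks, params, np.arange(N), np.linspace(0.0, np.pi, 65))
print("relative residual energy:", np.linalg.norm(residual) ** 2 / np.linalg.norm(x) ** 2)
```

Here the model order is fixed in advance through `order`, which is the stopping choice discussed above. The toy signal is deliberately easy: real signals are not exact sums of dictionary atoms, so the residual does not vanish after a handful of iterations.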

The figure above shows some of their results. At top is the WVD of a signal with two components that are modulated in frequency. To the left is the AWVD constructed from the Gabor transform, and below that are three estimates of the instantaneous (and mean) frequency. Finally, on the right, is the AWVD constructed from MP, and the estimates found from this.
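For intuition about what a closed-form estimate of this kind can look like, here is a sketch that assumes the components are unit-energy Gaussian atoms \(g_k\) with time centers \(u_k\), frequencies \(\xi_k\), scales \(\sigma_k\), and expansion coefficients \(c_k\). This notation is an assumption for illustration, and the result is not necessarily one of the expressions derived in the paper. With the definition of the WVD given earlier, each atom has $$WV_{g_k}(t,\omega) = 2 \exp\bigl( -(t-u_k)^2/\sigma_k^2 - \sigma_k^2 (\omega - \xi_k)^2 \bigr)$$ and the AWVD is \(\sum_k |c_k|^2 WV_{g_k}(t,\omega)\). Taking its first moment in frequency at each time gives $$\hat\omega(t) = \frac{\int_{-\infty}^\infty \omega \sum_k |c_k|^2 WV_{g_k}(t,\omega) d\omega}{\int_{-\infty}^\infty \sum_k |c_k|^2 WV_{g_k}(t,\omega) d\omega} = \frac{\sum_k w_k(t) \xi_k}{\sum_k w_k(t)}, \qquad w_k(t) := \frac{|c_k|^2}{\sigma_k} e^{-(t-u_k)^2/\sigma_k^2}$$ which is a weighted average of the atom frequencies in which the atoms closest to \(t\), relative to their scale, dominate.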

I do not quite understand the use of finding the instantaneous or mean frequency of a signal with more than one component; but regardless, the authors do not question whether the atoms of an MP decomposition provide good, or at least useful, information about the signal and its short-term statistics. A lack of relevance may not be apparent in the first few dozen iterations of the algorithm, but this negative behavior of MP and other greedy approaches to building signal models is well-documented.