Lab 3.3 Convolution and filtering
Introduction
In this lab we continue our discussion of time series data.
Although we are discussing these topics in terms of time series, they all apply more generally for any sampled quantity. (You could imagine sampling in time, like a time series, or in space, like an image, or in other units as well.)
Smoothing a signal over time
Suppose you are performing a reaction in a test tube, and you are able to measure the reaction product with a photodiode because the reaction product gives off light (perhaps fluorescent light). You set up your photodiode to measure the formation of reaction product after beginning your reaction at time 0. The data are in chemical_reaction.txt attached at the end of the lab.
chem = load('chemical_reaction.txt');
figure;
plot(chem(:,1),chem(:,2));
ylabel('Reaction product (photocurrent uA/0.01 seconds)');
xlabel('Time(s)');
As you can see, the signal is rather noisy. Suppose you expected the signal to be noisy because the ability of the photodiode to record the low levels of light from the reaction is limited. You worry that the noise, which you know to be inherent system noise, might distract the reader from the information you want to convey, which is the time course of the reaction. It would be nice if you could smooth the system noise a bit to show more of the underlying phenomenon. (One can have a different discussion as to whether this noise reflects system noise or true signal; for each system the answers will vary; I'm assuming an artificial situation here where we know the source of the noise is in the physical limits of the system rather than in the signal itself.)
Convolution
A simple way to smooth the signal would be, for each sample, compute the mean of the following: the 1 sample that comes before it, that sample, and the sample that come after it. Therefore, in the smoothed signal, each point would represent the mean of 3 adjacent samples of the original data. (The number of samples to include would vary depending upon the dataset you are smoothing.)
So if the values of the signal are represented by Xk, the values of the filtered signal Yk are determined with the convolving filter as follows:
To compute the response at Yk, you slide the convolving filter on top of the signal centered on point Xk; then, you compute the product of the filter with all of the aligned points and add up the results to get the value of Yk.
You might ask, okay, it's obvious what happens at the middle, but what happens at the edges? Yes, edges are a problem. Typically, one creates "fake" samples to extend the original signal X out on both sides, choosing either 0 or the edge value. Matlab's function conv gives you the option to assume the function is 0 outside or to only compute the convolution over the range where data is available (see help conv). In a real research setting, you'll possibly need to extend your data manually if you want to repeat the edge value, or ignore the edges (we'll do this a little later.)
Boxcar
The smoothing "filter" that we proposed above (taking the mean of a few samples before, including, and after each sample in the signal) is traditionally called a boxcar (or pillbox) filter because, if you filled in the points around the filter with 0's, it would look like a boxcar or pillbox:
Computer graphics weren't as good when these things were being invented.
But let's smooth our data with a boxcar (take the local mean) and see how it looks. We need to be careful to normalize our boxcar; we don't want to change the scale of our signal when we convolve, so we need to make sure we divide the boxcar by its sum so that the sum over the boxcar is 1:
boxcar3 = [1 1 1]/3
boxcar5 = [1 1 1 1 1]/5
We can use the function ones, which will create a matrix of all ones, to do this more easily for us:
boxcar5 = ones(1,5)/5 % does the same thing as above
boxcar25 = ones(1,25)/25 % a longer-range boxcar
boxcar101 = ones(1,101)/101;
You can confirm that these boxcars sum to 1:
sum(boxcar25)
sum(boxcar5)
sum(boxcar101)
Now let's look at the result of the convolution. The input argument 'same' specifies that the output signal should be the same size as the input signal (you can see the options with help conv):
t = chem(:,1); % so we don't have to type chem(:,1) over and over
y3 = conv(chem(:,2),boxcar3,'same');
y5 = conv(chem(:,2),boxcar5,'same');
y25 = conv(chem(:,2),boxcar25,'same');
y101 = conv(chem(:,2),boxcar101,'same');
figure;
subplot(2,2,1);
plot(t,y3);
ylabel('Product');
xlabel('Time(s)');
title('Boxcar 3');
box off;
subplot(2,2,2);
plot(t,y5);
ylabel('Product');
xlabel('Time(s)');
title('Boxcar 5');
box off;
subplot(2,2,3);
plot(t,y25);
ylabel('Product');
xlabel('Time(s)');
title('Boxcar 25');
box off;
subplot(2,2,4);
plot(t,y101);
ylabel('Product');
xlabel('Time(s)');
title('Boxcar 101');
box off;
Q1: Which filtered version looks the "smoothest"? For all 4 graphs, zoom in on the points near time = 300 seconds. What happens in the cases with longer filters?
Dealing with the edge effects
No doubt you noticed that when the filter gets larger, there are larger edge effects. This occurs because Matlab's conv function only allows us to "pad" the edges of the signal with 0's. But, we have options.
We can ignore the data near the edges
We can use the last available sample as the "edge". For this situation, where the data has appeared to reach a steady-state by the end of our trace (that is, it is no longer changing), it makes good sense.
If the signal were periodic, one could "mirror" the edge by repeating the signal in reverse.
[If you think about it a little bit, you can probably imagine other scenarios.]
The safest option, which involves no assumptions about how the signal behaves outside of the times you sampled it, is to ignore the data that was generated by making an assumption about the data that falls off the edges. Matlab will return this if we use the 'valid' flag:
y3_ = conv(chem(:,2),boxcar3,'valid');
y5_ = conv(chem(:,2),boxcar5,'valid');
y25_ = conv(chem(:,2),boxcar25,'valid');
y101_ = conv(chem(:,2),boxcar101,'valid');
Let's look at the size of the vector that is returned, and compare it to the size of the filter and the original data:
size(chem(:,2))
size(boxcar101)
size(y101_)
So the 'valid' convolution is 100 elements shorter than the original signal. This makes sense, because the 'valid' region cannot begin until we hit sample 51 of our signal: the convolution with a 101 element filter requires knowledge of the previous 50 elements and the next 50 elements. So the time values for the 'valid' convolution correspond to sample numbers
1+(N-1)/2:length(signal)-(N-1)/2
Let's try this:
T_validindexes_101 = (1+(101-1)/2:length(t)-(101-1)/2);
size(T_validindexes_101)
T_validindexes_25 = (1+(25-1)/2:length(t)-(25-1)/2);
T_validindexes_5 = (1+(5-1)/2:length(t)-(5-1)/2);
T_validindexes_3 = (1+(3-1)/2:length(t)-(3-1)/2);
Let's plot these 'valid' convolutions with the appropriate times. This code is so similar to the code above, you can copy and paste if you want.
figure;
subplot(2,2,1);
plot(t(T_validindexes_3),y3_);
ylabel('Product');
xlabel('Time(s)');
title('Valid boxcar 3');
box off;
subplot(2,2,2);
plot(t(T_validindexes_5),y5_);
ylabel('Product');
xlabel('Time(s)');
title('Valid boxcar 5');
box off;
subplot(2,2,3);
plot(t(T_validindexes_25),y25_);
ylabel('Product');
xlabel('Time(s)');
title('Valid boxcar 25');
box off;
subplot(2,2,4);
plot(t(T_validindexes_101),y101_);
ylabel('Product');
xlabel('Time(s)');
title('Valid boxcar 101');
box off;
Sliding window functions
Convolution is effective when our samples are taken at precise regular intervals. It is particularly useful in imaging, where we have regularly spaced samples of brightness at different positions, and it is the basis of a lot of popular filtering algorithms that you might have played around with in your favorite image processing software ("blur", which is usually the boxcar, sharpening, Gaussian blur, etc). We'll study this again when we cover images.
However, sometimes we want to smooth a signal that is irregularly sampled, or we may want to find a smoothed average of the relationship between 2 variables. In these situations, the sliding window function is quite useful.
In the sliding window function, we create a small interval, called a window, and slide it across the data. We perform some function, such as taking the mean, on all of the points that fall in the window. Then we move the window over to the next step, and repeat.
Here's a function slidingwindowfunc.m that I wrote a few years ago to perform this operation. It employs a trick we haven't seen yet in the class, and we'll go through them. The help is longer than the function! I'd put this in your +signal package.
slidingwindowfunc.m
function [Yn,Xn] = slidingwindowfunc(X, Y, start, stepsize, stop, windowsize,func,zeropad)
% SLIDINGWINDOWFUNC - Sliding window analysis for 1-dimensional data
%
% [Yn,Xn] = dasw.signal.slidingwindowfunc(X,Y,START,STEPSIZE,STOP,WINDOWSIZE,...
% FUNC,ZEROPAD)
%
% Slides a window of WINDOWSIZE across the data and performs
% the function FUNC on the set of ordered pairs defined in
% X and Y. The window starts at location START and stops at
% location STOP on X. STEPSIZE determines how far the the
% window is advanced at each step.
%
% FUNC should be a string describing the function to be used.For example:
% 'mean', or 'median'.
%
% If ZEROPAD is 1, then a 0 is coded if no points are found within a
% given window.
% If ZEROPAD is 0, and if no points are found within a given window, no
% Xn or Yn point is added for that window.
%
% Xn is the center location of each window and Yn is the result of the
% function in each window.
Xn = []; Yn = [];
for i=start:stepsize:stop-windowsize, % step the window
INDs = find(X>=i&X<i+windowsize); % find all time points in window
if zeropad|~isempty(INDs),
Xn(end+1) = mean([i i+windowsize]); % window center location in Xn
end;
if ~isempty(INDs), % if we have values, evaluate the window function
eval(['Yn(end+1)=' func ' (Y(INDs));']);
y = Y(INDs)';
end;
if zeropad&isempty(INDs), Yn(end+1) = 0; end;
end;
To illustrate, let's compute the sliding mean of the chemical reaction data:
[Yw,Tw] = dasw.signal.slidingwindowfunc(chem(:,1),chem(:,2),0,1,300,2,'mean',0);
Here we have asked for a window of size 2; the first window center location will be 0, and the window will be slid along in steps of 1 second, stopping when we reach time = 300. So, the 2nd window will run from time 1 to 3, the 3rd window will run from time 2 to time 4, etc. We'll calculate the mean of all points in the window at each step; if for some reason we don't find any data in the window, we won't add a 0 value for that time point, we'll just exclude it all together. (See help dasw.signal.slidingwindowfunc)
figure;
plot(Tw,Yw);
xlabel('Time(s)');
ylabel('Product');
This formulation has the advantage that the smoothing is in units of time (window size is 2 seconds). If you changed the rate at which you sampled this data (in Hz), you wouldn't have to change your smoothing function if you wanted to smooth over the same interval (you would have to change the size of a convolution filter if you wanted to keep the same time profile of the filter). You can download my function stderr.m below (put it in +dasw/+stats), and then we can add standard error to this plot:
[Ywe,Tw] = dasw.signal.slidingwindowfunc(chem(:,1),chem(:,2),0,1,300,2,'dasw.stats.stderr',0);
You probably want to know: okay, how did we allow the user to specify the sliding window function that is evaluated using a string? To do this, we used the eval function, which tells Matlab to evaluate a command that is in a string variable. It is very useful for customizing your computations on the fly.
A = [1 2 3];
is the same as
eval('A=[1 2 3]')
Or, as I used it:
myfuncstr = 'mean'
mystringtoeval = ['B = ' myfuncstr '(A);']
eval(mystringtoeval);
B,
Sliding window functions applied to non-time series
A situation in which sliding window functions can be quite powerful is to explore the relationship between 2 variables when a relationship is present but not linear. For example, the file treeshrew_data.mat (attached at the end of the tutorial) has orientation selectivity data from 91 cells sampled at different depths in the tree shrew visual cortex. The index that I calculated is the OV index from lab 2.3.
load treeshrew_data.mat
figure;
plot(OVs,Depths,'ko');
ylabel('Standardized depth(mm)');
xlabel('Orientation vector index (OVI)');
set(gca,'ydir','reverse');
box off;
hold on;
plot([0 1],[0.950 0.950],'k--'); % boundary L2/3,4
plot([0 1],[1.350 1.350],'k--'); % boundary L4,L5/6
Here you can see that orientation selectivity varies across depth; if you stare at the figure long enough, you might be able to tell that the median orientation selectivity in layer 4 (where inputs arrive, indirectly, from the retina) is lower than in layer 2/3. Let's calculate a sliding median to help reveal this fact:
[On,Dn] = dasw.signal.slidingwindowfunc(Depths, OVs, 0, 0.01, 1.8, 0.2,'median',0);
plot(On,Dn,'k','linewidth',2)
Here we see, side by side, the behavior of the median cell at each depth (smoothed over 0.2 mm) and the variation that is observed.
The plot has been divided into 3 sections that correspond to the different physical layers of the cerebral cortex. The upper section shows cells in the upper layers that are closer to the surface (labeled layers 2/3). The middle section (between the dashed lines) corresponds to layer 4. The lower section shows cells that are the deepest (layers 5/6).
Q2: Which layer exhibits the greatest orientation vector index values? Do all cells in this layer exhibit high selectivity, or is there a lot of variation? Do cells in layer 4 generally exhibit high or low orientation vector index values? Is there variation? Is there any sub-layer organization that is evident here?
Removing certain frequencies from a time series -- filtering
One task that we often want to perform is to filter out particular frequencies from our sample. For example, we might be sampling neural activity at a high rate in order to be able to resolve spikes (which are only about 1 millisecond wide), and we might want to filter out lower frequency data that isn't related to spikes. Or, we might want to take a look at the signal at lower frequencies to look for the presence of delta waves (slow wave sleep waves, 0.1-4Hz), theta waves (4-10 Hz), and gamma waves (25-100Hz). The presence or absence of power in these bands can give clues as to whether a patient is sleeping or awake, or alert or resting.
We could come up with a scientific example of filtering, but let's do something more fun. In the World Cup Soccer Tournament in 2010 in South Africa, fans used a traditional horn called a Vuvuzela, which generates low frequency signals between about 200 Hz and 500 Hz. Low frequencies carry particularly well (this is why whales employ them, for example) and many fans found it difficult to understand the announcers because the sound of the Vuvuzelas carried into the broadcasting booths. So wouldn't it be nice if we could filter out these frequencies, leaving other frequencies intact?
Let's listen to the noise people complained about (not the best sample: the Vuvuzelas are not annoying enough. If anyone has a better sample that includes both Vuvuzelas and an announcer I'd love to get it!)
[v,sr] = audioread('Vuvuzela.wav'); % this file is attached below
sound(v);
figure;
spectrogram(v,1024,[],[],sr,'yaxis');
title('Vuvuzelas original');
These kinds of filters have a slightly different form than the filters used in convolution. There are 2 sets of coefficients, b and a; the nth element of the filtered signal, Y(n), is equal to:
a(1)*y(n) = b(1)*x(n) + b(2)*x(n-1) + ... + b(nb+1)*x(n-nb)
- a(2)*y(n-1) - ... - a(na+1)*y(n-na)
So the b coefficients multiply the signal points leading up to point n, while the a coefficients multiply the filter points up to point n. Since the filter is assumed to happen "live", these points only depend on what happens in the past, either in the actual signal x, or the filtered signal y.
Mathematicians have studied this problem and have derived several filters that provide excellent performance, but no filter can achieve the ideal result of exactly filtering out the frequencies you don't want while preserving the frequencies you do want. My favorite filters are called Butterworth filters and Chebyshev Type I filters.
We will go through the mechanics of creating and applying filters first, and then come back and study the trade-offs of the Butterworth and Chebyshev I filters. We'll employ Butterworth filters here because they have 1 fewer parameter than Chebyshev filters.
The function that creates a Butterworth filter, butter, takes 2 arguments:
[b,a] = butter(N, [W_low W_high])
N indicates the filter order; that is, the number of coefficients in the filter. The mathematics becomes unstable when filter orders get to be above 4 or 5, so a filter order of 4 is about the maximum that I choose (I often choose 4). Then W_low and W_high indicate the frequencies to use for the low and high frequency cut off. You might think that the units of these frequencies would be Hz, but they are not; instead, they are something more annoying. They are quantities from 0 to 1, where 0 indicates 0 frequency, and 1 indicates 1/2 of the sampling frequency. There is a mathematical reason why this is so, but I still find it inconvenient. Anyway, let's make our Butterworth filter for our Vuvuzela sample:
[b,a] = butter(4,[200 500]/(0.5 * sr));
In calling the function, I've divided the 200 and 500 by 1/2 of the sampling rate so we can provide the frequencies within the brackets in units of Hz.
Now we can filter the signal (using the function filtfilt) to only include frequencies between 200 and 500:
v2 = filtfilt(b,a,v);
Now we have the quantity that we _don't_ want, the signal between 200 and 500 Hz. Let's listen:
sound(v2,sr)
To get the quantity we do want, we can subtract this signal from the original:
sound(v-v2,sr)
figure;
spectrogram(v-v2,1024,[],[],sr,'yaxis');
title('Vuvuzelas removed');
Q3: Examine the original spectrogram of v and the spectrogram of v-v2. What is different?
It is also possible to create filters that accept any frequency below a particular cut-off frequency (called low pass filters), or filters that accept any frequency above a particular cut-off frequency (called high pass filters).
[b_l,a_l] = butter(4,[500]/(0.5 * sr),'low'); % 500Hz and below
v_l = filtfilt(b_l,a_l,v);
sound(v_l,sr);
figure;
spectrogram(v_l,1024,[],[],sr,'yaxis');
title('Vuvuzelas below 500Hz');
[b_h,a_h] = butter(4,[2000]/(0.5 * sr),'high'); % 2000Hz and above
v_h = filtfilt(b_h,a_h,v);
sound(v_h,sr);
figure;
spectrogram(v_h,1024,[],[],sr,'yaxis');
title('Vuvuzelas above 2kHz');
You can do this with music as well; your equalizer does essentially this except that it has multiple bands and can attenuate or remove signals as you specify.
Evaluating the performance of filters
Now let's examine exactly what is going on. To do this, we will create a test function called filtertransfer.m which computes the "transfer function" of the filter (which frequencies are passed, with what magnitude), as well as any phase modification of the signal. The function generates cosines at a number of frequencies, and examines 1) how much of that frequency makes it through the filter, and 2) if the phase of the cosine is altered as it passes through.
Let's look at an example.
freq = 200; % 200 Hz
t = 0:1/sr:(length(v)-1)/sr; % time of each sample
y = cos(2*pi*freq*t);
y2 = filter(b,a,y);
figure;
plot(t,y);
hold on;
plot(t,y2,'r');
xlabel('Time(s)');
ylabel('Signal');
Q4: Zoom way in. Do you see the cosine waves? Is the amplitude affected by the filter? What about the phase?
When filters reduce the amplitude of a signal, they generally introduce a phase lag as well. When we have digitally sampled data, though, we can perform a trick to get rid of these phase differences. We can filter the data forward and backward and take the average, and this exactly cancels any phase shifting. The function filtfilt does this. (Of course, we can only do this when we have the data sampled already; we cannot perform this on "live" data because we obviously can't run the filter backwards on data that has not been collected yet!)
y3 = filtfilt(b,a,y);
figure;
plot(t,y);
hold on;
plot(t,y3,'r');
xlabel('Time(s)');
ylabel('Signal');
Q5: Zoom way in. Do you see the cosine waves? Is the phase lag gone?
Let's use our test function (save to the +signal package):
filtertransfer.m
function [freqs,output, phaseshift] = filtertransfer(b,a,sr,N, filtfunc)
% FILTERTRANSFER - calculate transfer vs. frequency for filters
%
% [FREQS,OUTPUT,PHASESHIFT] = dasw.signal.filtertransfer(B,A,SR,N,FILTFUNC)
%
% Inputs:
% B and A are the coefficients for filter
% SR is the samplig rate
% N is the number of points to sample
% FILTFUNC should be a filter function such as 'filter' or 'filtfilt'
%
% Outputs: FREQS a list of 200 frequencies that will be examined
% OUTPUT the fraction of signal at that frequency that is passed
% PHASESHIFT the phase shift (if any) of the filter
t = 0:1/sr:(N-1)*1/sr;
% create 200 equally spaced samples in log space
freqs = logspace(log10(sr*1/N),log10(0.5*sr*(N-1)/N),200);
output = [];
phaseshift = [];
for i=1:length(freqs),
y = cos(2*pi*freqs(i)*t);
eval(['y2 = ' filtfunc '(b,a,y);']);
amp_cosine = 2*mean(y.*y2); % an fourier coefficient
amp_sine = 2*mean(sin(2*pi*freqs(i)*t).*y2); % bn fourier coefficient
output(i) = sqrt(amp_cosine.^2+amp_sine.^2); % total amplitude
phaseshift(i) = atan2(amp_sine,amp_cosine); % phase
end;
Let's compute the statistics of the Butterworth filter and a Chebychev I filter (using the function cheby1):
[b_bw,a_bw] = butter(4,[200 500]/(0.5 * sr));
[b_cb,a_cb] = cheby1(4,0.5,[200 500]/(0.5 * sr));
[f_bw,o_bw,p_bw]=dasw.signal.filtertransfer(b_bw,a_bw,sr,length(v),'filtfilt');%might take time
[f_cb,o_cb,p_cb]=dasw.signal.filtertransfer(b_cb,a_cb,sr,length(v),'filtfilt');%might take time
figure(100);
subplot(2,1,1);
hold on;
plot(f_bw,100*o_bw,'b');
plot(f_cb,100*o_cb,'r');
ylabel('Signal amplitude (%)');
legend('Butterworth','Chebychev');
subplot(2,1,2);
hold on;
plot(f_bw,p_bw,'b');
plot(f_cb,p_cb,'r');
ylabel('Signal phase');
legend('Butterworth','Chebychev');
Q6: Compare the behavior of the Butterworth and Chebychev filters. Which one drops off the fastest before 200 and after 500? Which one has the flattest behavior in the pass band (between 200-500Hz)? How is the phase performance in the pass band? Does the fact that the phase shifts are large outside of the pass band matter? What would the amplitude of the phase-shifted signals be?
If you're interested in learning more about these filters, Wikipedia has a nice comparison of the 4 most common filters here.
Function reference
Matlab functions and operators
User functions
dasw.signal.slidingwindowfunc.m
dasw.signal.filtertransfer.m
Labs are copyright 2011-2012, Stephen D. Van Hooser, all rights reserved.