Matlab filter electical spikes in accelerometric d

2020-02-16 01:35发布

问题:

I have a dataset of accelerometric data that is affected by electical spikes.

I'm looking for a good method to filter out or reduce these spikes as need to calculate on these data a rolling window of FFT and other statistical indicators such as kurtosis and skewness. I can't simply delete these outliers or replace them with NaN. Sampling 2000[hz]

Until now I've tried on MATLAB 2012b:

  • Wavelet denoising (Haar wavelet)
  • Median Filter
  • Despike and iterpolate approach

Can you suggest a proper approach to deal with these data?

Download example dataset

回答1:

I would suggest some local smoothing. By defining thresholds and averaging all values below and above.

Af = data.example1;
% Thresholds
Tl = -0.6;
To = 0.6;

peaks = find( Af < Tl | Af > To);
Af(peaks) = ( Af(peaks-1) + Af(peaks+1) ) / 2;

The problem with this approach is that your outliners sometimes consist of up to 6 samples. So you need to smooth in multiple steps using a while loop:

Af = data.example1;
% Thresholds
Tl = -0.6;
To = 0.6;

% initialisation
peaks = find( Af < Tl | Af > To);
counter = 0;

while ~isempty(peaks)
    peaks = find( Af < Tl | Af > To);
    Af(peaks) = ( Af(peaks-1) + Af(peaks+1) ) / 2;
    counter=counter+1;
end

after 6 iterations you get the following result:



回答2:

I have used the file despiking from the matlab central file exchange with very good effect for similar problems, though I see you've tried that as well.

Another approach I've taken is to treat the spikes as statistical outliers and removed them using this function which uses Rosner's many outlier test. (NIST site is down for obvious reasons, so here is the Google cached version)

Edited to add: I was mistaken. My despiking algorithm did not come from the file exchange function I linked to above. It was actually pulled out of a journal article (the code is listed in the supplementary information to the paper, but they didn't publish the code to the file exchange). The paper was:

Practical Methods for Noise Removal: Applications to Spikes, Nonstationary Quasi-Periodic Noise, and Baseline Drift

Delphine Feuerstein , Kim H. Parker and Martyn G. Boutelle

Anal. Chem., 2009, 81 (12), pp 4987–4994

Since the copyright is held by the American Chemical Society and the authors, I can't copy the code here, but if you have access to a university library account, you can download a copy. If you don't, I left the link to the file exchange version, but I haven't used it so I can't vouch for its efficacy.



回答3:

A moderator merged this question with this question - that's why it looks a little messy here. This answer considers additional issues in the second question!

The following is not an entirely clean solution, the code is adopted from my previous answer, but I added an exception for your case, so you don't need to delete values at the beginning and/or end of your data manually. It discards only these invalid values, that shouldn't cause problems.

Af = csvread(strcat('example_data_with_peak.txt'),5,0); 

% Thresholds
Tl = -0.04;
To = 0.04;

% initialisation
peaks = find( Af < Tl | Af > To);
counter = 0;

while ~isempty(peaks)
    peaks = find( Af < Tl | Af > To);
    try
        Af(peaks) = ( Af(peaks-1) + Af(peaks+1) ) / 2;
    catch
        if peaks(1) == 1
            Af(1) = 0;
        else
            Af(end) = 0;
        end
    end   
    counter=counter+1;
end

figure(2);
plot(Af)

For determining the threshold you could use somethink like this, but it's also quite brute-force:

thresh = 15*mean(abs(findpeaks(Af)));


回答4:

For others that may need this here's what I ended up using. Here's the data file data file link

Thanks goes to @thewaywewalk

Matlab filter electical spikes in accelerometric data

clear all, clc,clf,tic
aa=csvread(strcat('/tmp/example_data_with_peak.txt'),5,0); %will skip the first 5 rows that are text and zeros
figure(1);
plot(aa)
Af=aa;
% Thresholds
Tl = -mean(abs(aa))*10
To =mean(abs(aa))*10

% initialisation
[peaks_r,peaks_c] = find( Af < Tl | Af > To);
peaks = find( Af < Tl | Af > To);

counter = 0;

while ~isempty(peaks)
    peaks = find( Af < Tl | Af > To);
    try
        Af(peaks) = ( Af(peaks-1) + Af(peaks+1) ) / 2;
    catch
        if peaks(1) == 1
            Af(1) = 0;
        else
            Af(end) = 0;
        end
    end   
    counter=counter+1;
end
counter
figure(2);
plot(Af)

Here are the images of before and after.



回答5:

I find that for the specific problem of one-point spikes (which arises in CCD detectors when a cosmic ray discharges a single CCD cell during the exposure) the following algorithm works very well:

  N=length(y);
  for i=[3:1:N-2]
    # calculate the means of two nearest neighbours, and two next-nearest neighbours
    y1=(y(i-1)+y(i+1))/2;
    y2=(y(i-2)+y(i+2))/2;
    # if those two means are close, but the current point is far off, it's a spike
    if ( abs(y2-y(i)) > cutoff && abs(y1-y2) < cutoff)
       z(i)=y2;
    endif
  endfor

Selecting the best strategy for a good cutoff selection is a separate issue; I tend to have it set to a fixed value based on the typical dark counts in the CCD. One can also use separate levels for what is "close" and what is "far off", like this:

    if ( abs(y2-y(i)) > cutoff_far && abs(y1-y2) < cutoff_close )

One can also select a different criterion, like the difference between two means is X times smaller than the difference with the spike data:

    if ( abs(y2-y(i)) > 10*abs(y1-y2) )

Peaks that are wider than a single-point spike survive this process unmolested.

An example of de-spiked Raman spectrum using a CCD detector