Trouble Understanding Sliding Window for a column

Posted 2019-06-05 00:27

Question:

I am a noob, and I have found only very fragmented information on Stack Overflow about sliding windows.

I have an m×n matrix, where m is fixed (latitude, longitude, ax, ay, az) and n can change from one log to another.

1) How can I create a sliding window over az only, without first extracting the vector and then analyzing it?

2) If I want to save all the rows where the standard deviation of az exceeds a defined threshold, how can I do that?

3) If the log length is not fixed, how can I deal with that? (e.g. one file contains 932 rows, another 953)

4) I have read a lot of questions and I am studying how bsxfun works in this case, but it is very unclear to me. (From the examples I only understood that a new matrix is created based on the window size, and that the new matrix is then analyzed.) This last question is strongly related to my civil engineering background.


Here is what I have learned and tried to put together.

A sliding window is a powerful tool for analyzing a signal or an image. When I tried to describe to my girlfriend what I was doing, I explained: “It is like reading a book with a magnifier: the magnifier has a fixed size, and you analyze the text one piece at a time.”

The basic way to do that in MATLAB, not the most efficient one, is:

1. Define your vector dimensions

2. Define your window dimension

3. Define the number of steps

Here is a basic example that I wrote:

a = randi(100, [1, 50]);             % generic vector
win_dim = 3;                         % generic window size

num_stps = length(a) - win_dim + 1;  % number of "slides"; the last window must end exactly at length(a)
threshold = 15;                      % generic number, only for the example
mean_win = zeros(1, num_stps);       % preallocate the results
std_win = zeros(1, num_stps);
std_anomalies = [];                  % start indexes of the windows whose std exceeds the threshold
for i = 1:num_stps
    mean_win(i) = mean(a(i:i+win_dim-1));  % subtract 1, otherwise the segment is one element too long
    std_win(i) = std(a(i:i+win_dim-1));    % e.g. for i=2 without the -1 the segment runs from 2 to 5,
                                           % i.e. 2 3 4 5, but we defined a window dimension of 3
    if std_win(i) > threshold
        std_anomalies(end+1) = i;          % append the index instead of overwriting a single value
    end
end

This way the code slides over the entire signal, but windows will overlap.

How to decide the "overlap ratio" (or slide increment)?

We can define this as "how much information do two adjacent windows share?" I tried to code something before asking here, and I am not sure the example below is right. The goal is an overlap of half the segment (or no overlap).

% Half-segment overlap

a = randi(100, [1, 20]);   % generic vector
win_dim = 4;               % generic window size
% v is the slide increment; in our case we want 50% overlap
l = win_dim;
if mod(l, 2) == 0          % % starts a comment in MATLAB, so the remainder is computed with mod()
    v = l/2;
else
    v = (l+1)/2;
end

starts = 1:v:(length(a) - win_dim + 1);   % first index of each window
mean_win = zeros(1, numel(starts));
for k = 1:numel(starts)
    mean_win(k) = mean(a(starts(k):starts(k)+win_dim-1));
end

Answer 1:

I like the creative approach to the question! :) Is this what you are looking for?

a = rand(100, 5); % the data

window_size = 5; % size of the window
overlap = 2; % desired overlap
step = window_size - overlap; % compute the step

threshold = 0.3; % large std threshold

std_vals = NaN(size(a, 1), 1);
for i=1:step:(size(a, 1) - window_size + 1) % +1 so a window ending exactly on the last row is not skipped
    std_vals(i) = std(a(i:(i+window_size-1),5)); % calculate std of 5th col
end

% finding the rows with standard deviation larger than threshold
large_indexes = find(std_vals>threshold);

large_indexes will give you the starting rows of the windows whose standard deviation is larger than the threshold. std_vals stores the standard deviation at the starting row of each window (and NaN everywhere else), in case you need the values later.
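On the bsxfun point from the question, here is a minimal sketch of the same computation without the loop. It assumes az is the 5th column and uses 932 rows only because that length appears in the question. The names starts, idx, windows and std_by_window are hypothetical, not part of the code above. Because the length is read from the data with size, the same lines work for a log with 953 rows or any other length.

```matlab
a = rand(932, 5);                 % stand-in for one log; 932 rows as in the question
az = a(:, 5);                     % assumption: az is the 5th column

window_size = 5;                  % size of the window
step = 3;                         % slide increment (window_size - overlap)
n = size(a, 1);                   % read the length from the data, so any log length works

starts = 1:step:(n - window_size + 1);                 % first row of every window
idx = bsxfun(@plus, starts(:), 0:(window_size - 1));   % one row of indexes per window
windows = reshape(az(idx), size(idx));                 % one window of az values per row
std_by_window = std(windows, 0, 2);                    % standard deviation along each row
```

Each row of idx holds the row numbers of one window, so idx is the "new matrix based on the window size" that the bsxfun examples build. In MATLAB R2016b and later, starts(:) + (0:window_size-1) gives the same matrix through implicit expansion.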

If you want the indexes of all the rows in the windows that satisfy your threshold, you can add this once large_indexes has been computed:

large_windows = zeros(numel(large_indexes), window_size);
for i=1:window_size
    large_windows(:,i) = large_indexes + i - 1;
end

where each row of large_windows gives indexes of the rows in the window.
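To actually save the rows (the second point of the question), one option is to collect every row that belongs to at least one flagged window and index the data with it. This is only a sketch on stand-in data: the names offsets, rows and flagged_rows are hypothetical, and it again assumes az is the 5th column.

```matlab
a = rand(100, 5);                 % stand-in data with the same shape as above
window_size = 5;                  % size of the window
step = 3;                         % slide increment
threshold = 0.3;                  % large std threshold

% standard deviation at the starting row of each window, NaN elsewhere
std_vals = NaN(size(a, 1), 1);
for i = 1:step:(size(a, 1) - window_size + 1)
    std_vals(i) = std(a(i:(i + window_size - 1), 5));
end
large_indexes = find(std_vals > threshold);

% every row that belongs to at least one flagged window, without duplicates
offsets = 0:(window_size - 1);
rows = unique(bsxfun(@plus, large_indexes(:), offsets));
flagged_rows = a(rows, :);        % all five columns of those rows
```

Because the windows overlap, one row can fall into two flagged windows. unique keeps it only once, so flagged_rows contains no repeated rows.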