Detecting patterns in waves

Posted 2019-03-07 19:55

I'm trying to read an image of an electrocardiogram and detect each of the main waves in it (P wave, QRS complex, and T wave). I can already read the image and extract a vector such as (4.2; 4.4; 4.9; 4.7; ...) representing the values in the electrocardiogram, which is half of the problem. I need an algorithm that can walk through this vector and detect where each of these waves starts and ends.

Here is an example of one of its graphs:

[image: example ECG graph]

It would be easy if the waves always had the same size, but they don't, or if I knew how many waves the ECG would have, but that can vary too. Does anyone have any ideas?

Thanks!

Update

Example of what I'm trying to achieve:

Given the wave

[image: ECG wave]

I can extract the vector

[0; 0; 20; 20; 20; 19; 18; 17; 17; 17; 17; 17; 16; 16; 16; 16; 16; 16; 16; 17; 17; 18; 19; 20; 21; 22; 23; 23; 23; 25; 25; 23; 22; 20; 19; 17; 16; 16; 14; 13; 14; 13; 13; 12; 12; 12; 12; 12; 11; 11; 10; 12; 16; 22; 31; 38; 45; 51; 47; 41; 33; 26; 21; 17; 17; 16; 16; 15; 16; 17; 17; 18; 18; 17; 18; 18; 18; 18; 18; 18; 18; 17; 17; 18; 19; 18; 18; 19; 19; 19; 19; 20; 20; 19; 20; 22; 24; 24; 25; 26; 27; 28; 29; 30; 31; 31; 31; 32; 32; 32; 31; 29; 28; 26; 24; 22; 20; 20; 19; 18; 18; 17; 17; 16; 16; 15; 15; 16; 15; 15; 15; 15; 15; 15; 15; 15; 15; 14; 15; 16; 16; 16; 16; 16; 16; 16; 16; 16; 15; 16; 15; 15; 15; 16; 16; 16; 16; 16; 16; 16; 16; 15; 16; 16; 16; 16; 16; 15; 15; 15; 15; 15; 16; 16; 17; 18; 18; 19; 19; 19; 20; 21; 22; 22; 22; 22; 21; 20; 18; 17; 17; 15; 15; 14; 14; 13; 13; 14; 13; 13; 13; 12; 12; 12; 12; 13; 18; 23; 30; 38; 47; 51; 44; 39; 31; 24; 18; 16; 15; 15; 15; 15; 15; 15; 16; 16; 16; 17; 16; 16; 17; 17; 16; 17; 17; 17; 17; 18; 18; 18; 18; 19; 19; 20; 20; 20; 20; 21; 22; 22; 24; 25; 26; 27; 28; 29; 30; 31; 32; 33; 32; 33; 33; 33; 32; 30; 28; 26; 24; 23; 23; 22; 20; 19; 19; 18; 17; 17; 18; 17; 18; 18; 17; 18; 17; 18; 18; 17; 17; 17; 17; 16; 17; 17; 17; 18; 18; 17; 17; 18; 18; 18; 19; 18; 18; 17; 18; 18; 17; 17; 17; 17; 17; 18; 17; 17; 18; 17; 17; 17; 17; 17; 17; 17; 18; 17; 17; 18; 18; 18; 20; 20; 21; 21; 22; 23; 24; 23; 23; 21; 21; 20; 18; 18; 17; 16; 14; 13; 13; 13; 13; 13; 13; 13; 13; 13; 12; 12; 12; 16; 19; 28; 36; 47; 51; 46; 40; 32; 24; 20; 18; 16; 16; 16; 16; 15; 16; 16; 16; 17; 17; 17; 18; 17; 17; 18; 18; 18; 18; 19; 18; 18; 19; 20; 20; 20; 20; 20; 21; 21; 22; 22; 23; 25; 26; 27; 29; 29; 30; 31; 32; 33; 33; 33; 34; 35; 35; 35; 0; 0; 0; 0;]

I would like to detect, for example

P wave in [19 - 37]

QRS complex in [51 - 64]

etc...

13 answers
Summer. ? 凉城
Reply 2 · 2019-03-07 20:32

"Wavelet transform" may be a relevant keyword. I've once attended a presentation by someone who used this technique to detect different heartbeat phases in a noisy ecg.

As far as my limited understanding goes, it's somewhat like a Fourier transform, but using (scaled) copies of a, in your case heartbeat-shaped, pulse.
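To make the "scaled copies of a pulse" idea concrete, here is a rough pure-Python sketch. It assumes a Ricker ("Mexican hat") wavelet as the pulse, which is a common generic choice rather than a heartbeat-shaped one, and the helper names `ricker` and `wavelet_response` are made up for illustration. The signal is a synthetic bump, not real ECG data.

```python
import math

def ricker(width, half_len):
    """Unnormalized Ricker (Mexican hat) wavelet sampled at integer offsets."""
    return [(1 - (t / width) ** 2) * math.exp(-t * t / (2 * width ** 2))
            for t in range(-half_len, half_len + 1)]

def wavelet_response(signal, width):
    """Correlate the signal with a wavelet of the given width (zero-padded edges)."""
    half = int(4 * width)
    wavelet = ricker(width, half)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(wavelet):
            j = i + k - half
            if 0 <= j < n:
                acc += signal[j] * w
        out.append(acc)
    return out

# Synthetic stand-in: a smooth bump centered on sample 30, on a zero baseline.
signal = [10 * math.exp(-((i - 30) / 3) ** 2 / 2) for i in range(60)]
response = wavelet_response(signal, 3)
peak_index = max(range(len(response)), key=response.__getitem__)
print(peak_index)
```

Repeating this for several widths gives one response curve per scale; narrow widths respond to sharp features such as the QRS complex, wider ones to broader bumps. A truncated wavelet is not exactly zero-mean, so on real data it is probably worth subtracting the baseline first.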

我只想做你的唯一
Reply 3 · 2019-03-07 20:33

Wavelets have been shown to be a very effective tool for locating peaks in this type of data, where the peaks are of "different sizes": their scaling properties make them well suited to this kind of multi-scale peak detection. This looks like a non-stationary signal, so a DFT, which some have suggested, would not be the right tool. If this is an exploratory project, though, you could look at the spectrum of the signal (estimated, essentially, as the FFT of the autocorrelation of the signal).
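For the exploratory route, a minimal sketch of the autocorrelation step might look like the following. It assumes a plain Python list of samples; the function names and the lag bounds are illustrative and do not come from any library. The lag with the strongest autocorrelation gives a rough beat-to-beat period in samples.

```python
def autocorrelation(signal, lag):
    """Biased autocorrelation of the mean-removed signal at a single lag."""
    mean = sum(signal) / len(signal)
    centered = [v - mean for v in signal]
    return sum(centered[i] * centered[i + lag]
               for i in range(len(centered) - lag))

def dominant_period(signal, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the largest autocorrelation."""
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(signal, lag))

# Synthetic stand-in for an ECG: one sharp spike every 8 samples.
spikes = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
print(dominant_period(spikes, 2, 32))
```

On this synthetic input the reported period is 8 samples. On a real trace the heart rate drifts, so the result is only an average over the record.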

Here is a great paper reviewing several peak detection methods - this would be a good place to start.

-Paul

We Are One
Reply 4 · 2019-03-07 20:35

The first thing I would do is simplify the data.

Instead of analyzing absolute data, analyze the amount of change from one data point to the next.

Here is a quick one-liner that takes ;-separated data as input and outputs the delta of that data.

perl -0x3b -ple'( $last, $_ ) = ( $_, $_-$last )' < test.in > test.out

Running it on the data you provided, this is the output:

0;0;20;0;0;-1;-1;-1;0;0;0;0;-1;0;0;0;0;0;0;1;0;1;1;1;1;1;1;0;0;2;0;-2;-1;-2;-1;-2;-1;0;-2;-1;1;-1;0;-1;0;0;0; 0;-1;0;-1;2;4;6;9;7;7;6;-4;-6;-8;-7;-5;-4;0;-1;0;-1;1;1;0;1;0;-1;1;0;0;0;0;0;0;-1;0;1;1;-1;0;1;0;0;0;1;0;-1;1; 2;2;0;1;1;1;1;1;1;1;0;0;1;0;0;-1;-2;-1;-2;-2;-2;-2;0;-1;-1;0;-1;0;-1;0;-1;0;1;-1;0;0;0;0;0;0;0;0;-1;1;1;0;0;0; 0;0;0;0;0;-1;1;-1;0;0;1;0;0;0;0;0;0;0;-1;1;0;0;0;0;-1;0;0;0;0;1;0;1;1;0;1;0;0;1;1;1;0;0;0;-1;-1;-2;-1;0;-2;0; -1;0;-1;0;1;-1;0;0;-1;0;0;0;1;5;5;7;8;9;4;-7;-5;-8;-7;-6;-2;-1;0;0;0;0;0;1;0;0;1;-1;0;1;0;-1;1;0;0;0;1;0;0;0; 1;0;1;0;0;0;1;1;0;2;1;1;1;1;1;1;1;1;1;-1;1;0;0;-1;-2;-2;-2;-2;-1;0;-1;-2;-1;0;-1;-1;0;1;-1;1;0;-1;1;-1;1;0;-1; 0;0;0;-1;1;0;0;1;0;-1;0;1;0;0;1;-1;0;-1;1;0;-1;0;0;0;0;1;-1;0;1;-1;0;0;0;0;0;0;1;-1;0;1;0;0;2;0;1;0;1;1;1;-1; 0;-2;0;-1;-2;0;-1;-1;-2;-1;0;0;0;0;0;0;0;0;-1;0;0;4;3;9;8;11;4;-5;-6;-8;-8;-4;-2;-2;0;0;0;-1;1;0;0;1;0;0;1;-1; 0;1;0;0;0;1;-1;0;1;1;0;0;0;0;1;0;1;0;1;2;1;1;2;0;1;1;1;1;0;0;1;1;0;0;-35;0;0;0;

There are new-lines inserted in the above text not originally present in the output.
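For readers who do not use Perl, the same delta step can be sketched in Python. This is an assumed equivalent, not a translation the answer's author provided; `deltas` and `parse_semicolon_list` are illustrative names. As in the one-liner, the first delta is taken against an implicit starting value of 0.

```python
def deltas(values):
    """Change from each value to the next; the first is measured against 0."""
    out = []
    last = 0
    for v in values:
        out.append(v - last)
        last = v
    return out

def parse_semicolon_list(text):
    """Parse '[0; 0; 20; ...]' style input into a list of ints."""
    cleaned = text.replace("[", "").replace("]", "")
    return [int(tok) for tok in cleaned.split(";") if tok.strip()]

# First few values of the vector from the question.
sample = parse_semicolon_list("[0; 0; 20; 20; 20; 19; 18; 17; 17]")
print(";".join(str(d) for d in deltas(sample)))
```

The printed values match the start of the output listed above (0;0;20;0;0;-1;-1;-1;0).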


After you have done that, it is trivial to find the QRS complex.

perl -F';' -ane'@F = map { abs($_) > 2 and $_ } @F; print join ";", @F'< test.out

;;20;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;4;6;9;7;7;6;-4;-6;-8;-7;-5;-4;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;5;5;7;8;9;4;-7;-5;-8;-7;-6
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;4;3;9;8;11;4;-5;-6;-8;-8;-4;
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;-35;;;

The 20 and -35 data points result from the original data starting and ending with 0.
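The thresholding step can also be sketched in Python, grouping consecutive large deltas into index ranges. This is a sketch under assumptions: the threshold of 2 comes from the one-liner above, while the `min_len` parameter (used to drop isolated jumps such as the leading 20) and the function names are additions for illustration.

```python
def deltas(values):
    """Change from each value to the next; the first is measured against 0."""
    return [v - prev for prev, v in zip([0] + values[:-1], values)]

def steep_runs(values, threshold=2, min_len=3):
    """Inclusive (start, end) ranges where |delta| > threshold for min_len+ samples."""
    runs = []
    start = None
    for i, step in enumerate(deltas(values) + [0]):  # trailing 0 closes a final run
        if abs(step) > threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs

# First 66 samples of the vector from the question (one P wave, one QRS complex).
sample = [0, 0, 20, 20, 20, 19, 18, 17, 17, 17, 17, 17, 16, 16, 16, 16, 16,
          16, 16, 17, 17, 18, 19, 20, 21, 22, 23, 23, 23, 25, 25, 23, 22, 20,
          19, 17, 16, 16, 14, 13, 14, 13, 13, 12, 12, 12, 12, 12, 11, 11, 10,
          12, 16, 22, 31, 38, 45, 51, 47, 41, 33, 26, 21, 17, 17, 16]
print(steep_runs(sample))
```

On this slice the only run that survives is (52, 63), which is close to the [51 - 64] range the question gives for the first QRS complex.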

To find the other data points you will have to rely on pattern matching.


If you look at the first P wave, you can clearly see a pattern.

0;0;0;0;0;0;1;0;1;1;1;1;1;1;0;0;2;0;-2;-1;-2;-1;-2;-1;0;-2;-1;1;-1;0;-1;0;0;0;0;
#           \________ up _______/   \________ down _________/

It isn't as easy to see the pattern in the second P wave, though, because it is more spread out.

0;0;0;1;0;1;1;0;1;0;0;1;1;1;0;0;0;-1;-1;-2;-1;0;-2;0;-1;0;-1;0;1;-1;0;0;-1;0;0;0;
#     \________ up _______/       \________________ down ________________/

The third P wave is a little more erratic than the other two.

0;0;0;0;0;1;-1;0;1;0;0;2;0;1;0;1;1;1;-1;0;-2;0;-1;-2;0;-1;-1;-2;-1;0;0;0;0;0;
#                \_______ up ______/  \__________ down __________/

You would find the T waves in a similar manner to the P waves. The main difference is when they occur.
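The "up, then down" pattern can be turned into a very naive matcher, sketched below in Python. Everything here is an assumption for illustration: the name `find_bumps`, the `min_change` and `max_step` parameters, and their default values. It accepts a run of gentle non-negative steps followed by a run of gentle non-positive steps, so it handles the first P wave but would need smoothing to cope with the more erratic third one.

```python
def find_bumps(deltas, min_change=4, max_step=2):
    """Return (start, peak, end) indices of gentle rise-then-fall patterns."""
    bumps = []
    i, n = 0, len(deltas)
    while i < n:
        # Skip ahead to the first small positive step.
        if not (0 < deltas[i] <= max_step):
            i += 1
            continue
        start = peak = i
        rise = 0
        # Rising part: small non-negative steps.
        while i < n and 0 <= deltas[i] <= max_step:
            if deltas[i] > 0:
                peak = i
            rise += deltas[i]
            i += 1
        end = peak
        fall = 0
        # Falling part: small non-positive steps.
        while i < n and -max_step <= deltas[i] <= 0:
            if deltas[i] < 0:
                end = i
            fall -= deltas[i]
            i += 1
        if rise >= min_change and fall >= min_change:
            bumps.append((start, peak, end))
    return bumps

# Deltas around the first P wave, as listed above.
first_p = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 2, 0, -2, -1,
           -2, -1, -2, -1, 0, -2, -1, 1, -1, 0, -1, 0, 0, 0, 0]
print(find_bumps(first_p))
```

Because any step larger than `max_step` ends a run, the steep QRS complex is never mistaken for a P or T wave; telling P from T would still rely on where the bump falls relative to the QRS complex.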


This should be enough information to get you started.

The two one-liners are for example only, not recommended for daily use.

女痞
Reply 5 · 2019-03-07 20:40

I haven't read every other answer thoroughly, but I have scanned them and noticed that no one recommended looking at the Fourier transform to segment these waves.

To me, it seems like a clear-cut application of harmonic analysis in mathematics. There may be several subtle points that I am missing.

The Discrete Fourier Transform coefficients give you the amplitude and phase of the different sinusoidal components that make up your discrete time signal, which is essentially what your problem states you want to find.
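As a small illustration of what those coefficients contain, here is a naive O(n²) DFT in pure Python applied to a single synthetic cosine. The helper names are illustrative, and the 2/n scaling assumes a real signal and a bin strictly between 0 and n/2. Note that each coefficient describes the whole record, so by itself it does not say where in time a given wave starts or ends.

```python
import cmath
import math

def dft(signal):
    """Naive discrete Fourier transform; fine for short illustrative inputs."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]

def amplitude_and_phase(coeff, n):
    """Amplitude and phase of one sinusoidal component (valid for 0 < k < n/2)."""
    return 2 * abs(coeff) / n, cmath.phase(coeff)

n = 16
# Unit-amplitude cosine completing exactly 3 cycles over the record.
tone = [math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
coeffs = dft(tone)
amp, phase = amplitude_and_phase(coeffs[3], n)
print(round(amp, 6), round(abs(phase), 6))
```

For this input, bin 3 carries an amplitude of about 1 and a phase of about 0, while the other bins below n/2 are essentially empty.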

I may be missing something here though ...

成全新的幸福
Reply 6 · 2019-03-07 20:44

A piece of this puzzle is "onset detection" and a number of complex algorithms have been written to solve this problem. Here is more information on onsets.

The next piece is a Hamming distance. This algorithm lets you make fuzzy comparisons: the input is two arrays, and the output is an integer "distance", or difference, between the two data sets. The smaller the number, the more alike the two are. This is very close to what you need, but it's not exact. I went ahead and modified the Hamming distance algorithm to calculate a new distance; it probably has a name, but I don't know what it is. Basically, it adds up the absolute difference between each pair of corresponding elements in the arrays and returns the total. Here is the code for it in Python.

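def absolute_distance(a1, a2, length):
    # Add up the absolute difference between corresponding elements
    # over the first `length` items of both arrays.
    total_distance = 0
    for x in range(length):
        total_distance += abs(a1[x] - a2[x])
    return total_distance

print(absolute_distance([1, 3, 9, 10], [1, 3, 8, 11], 4))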

This script outputs 2, which is the distance between these 2 arrays.

Now to put these pieces together. You could use onset detection to find the beginning of every wave in the data set. You can then loop through these locations, comparing each wave with a sample P-wave. If you hit a QRS complex, the distance is going to be the largest. If you hit another P-wave, the number isn't going to be zero, but it's going to be much smaller. The distance between any P-wave and any T-wave is going to be pretty small; HOWEVER, this isn't a problem if you make the following assumption:

The distance between any P-wave and any other P-wave will be smaller than the distance between any P-wave and any T-wave.

The series looks something like this: pQtpQtpQt... The P-wave and T-wave are right next to each other, but because the sequence is predictable, it will be easier to read.

On a side note, there is probably a calculus-based solution to this problem. In my mind, however, curve fitting and integrals make this problem more of a mess. The distance function I wrote finds the area difference, which is very similar to subtracting the integrals of the two curves.

It may be possible to drop the onset calculations in favor of iterating one point at a time, performing O(n) distance calculations, where n is the number of points in the graph. If you had a list of all of these distances and knew there were 50 pQt sequences, then the 50 shortest distances that do not overlap would be the locations of the P-waves. Bingo! How is that for simplicity? The trade-off is a loss of efficiency due to the increased number of distance calculations.
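The point-by-point variant can be sketched as follows. It reuses the sum of absolute differences from above (often called the L1 or Manhattan distance) and then greedily keeps the best non-overlapping positions. The function names, the greedy selection, and the toy signal are assumptions for illustration, not part of the original answer.

```python
def sliding_distances(signal, template):
    """Sum of absolute differences between the template and every window."""
    m = len(template)
    return [sum(abs(signal[i + j] - template[j]) for j in range(m))
            for i in range(len(signal) - m + 1)]

def best_matches(distances, width, count):
    """Greedily pick `count` lowest-distance positions at least `width` apart."""
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    chosen = []
    for i in order:
        if all(abs(i - c) >= width for c in chosen):
            chosen.append(i)
            if len(chosen) == count:
                break
    return sorted(chosen)

# Toy signal containing the template shape twice.
signal = [0, 0, 1, 3, 1, 0, 0, 0, 1, 3, 1, 0, 0]
template = [1, 3, 1]
distances = sliding_distances(signal, template)
print(best_matches(distances, len(template), 2))
```

Because the comparison uses raw values, a drifting baseline inflates every distance; subtracting a local mean from each window before comparing is one possible refinement.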

Explosion°爆炸
Reply 7 · 2019-03-07 20:45

You can use cross-correlation. Take a model sample of each pattern and correlate it with the signal. You will get peaks where the correlation is high. I would expect good results from this technique when extracting QRS complexes and T waves. After that, you can extract P waves by looking for peaks in the correlation signal that occur before a QRS complex.

Cross-correlation is a pretty easy algorithm to implement. Basically:

x is array with your signal of length Lx
y is an array containing a sample of the signal you want to recognize of length Ly
r is the resulting correlation

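/* Slide y along x; r must have room for Lx - Ly + 1 values. */
void cross_correlate(const double *x, int Lx, const double *y, int Ly, double *r)
{
  for (int i = 0; i <= Lx - Ly; i++) {  /* <= so the last full alignment is included */
    r[i] = 0;
    for (int j = 0; j < Ly; j++) {
      r[i] += x[i + j] * y[j];
    }
  }
}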

Then look for peaks in r (values over a threshold, for instance).
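One practical wrinkle with the raw products is that tall regions of the signal score high regardless of shape. A possible variant, sketched below in Python, removes the mean of each window and normalizes the result so that r stays within [-1, 1]. This normalized form is a swapped-in refinement, not what the answer above describes, and the function names and the 0.99 threshold are illustrative.

```python
import math

def normalized_cross_correlation(x, y):
    """Mean-removed, normalized correlation of template y at each offset in x."""
    ly = len(y)
    y_mean = sum(y) / ly
    y_dev = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_dev))
    r = []
    for i in range(len(x) - ly + 1):
        window = x[i:i + ly]
        w_mean = sum(window) / ly
        w_dev = [v - w_mean for v in window]
        w_norm = math.sqrt(sum(v * v for v in w_dev))
        if w_norm == 0 or y_norm == 0:
            r.append(0.0)  # flat window or template: treat as no match
        else:
            dot = sum(a * b for a, b in zip(w_dev, y_dev))
            r.append(dot / (w_norm * y_norm))
    return r

def peaks_over(r, threshold):
    """Offsets whose correlation reaches the threshold."""
    return [i for i, v in enumerate(r) if v >= threshold]

# Toy signal: the same peak shape at two different heights on a baseline of 5.
x = [5, 5, 5, 6, 9, 6, 5, 5, 5, 5, 7, 13, 7, 5, 5]
y = [1, 4, 1]
r = normalized_cross_correlation(x, y)
print(peaks_over(r, 0.99))
```

Both peaks are found at offsets 3 and 10 even though one is twice as tall as the other, because only the shape of each window is compared with the template.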
