Pattern recognition in time series [closed]

2楼-- · 2019-01-12 15:09

I'm not sure what package would work best for this. I did something similar at one point in college where I tried to automatically detect certain similar shapes on an x-y axis for a bunch of different graphs. You could do something like the following.

Class labels like:

no class
start of region
middle of region
end of region

Features like:

relative y-axis relative and absolute difference of each of the surrounding points in a window 11 points wide
Features like difference from average
Relative difference between point before, point after

0人赞添加讨论(0) 举报

我欲成王，谁敢阻挡

3楼-- · 2019-01-12 15:12

I am using deep learning if it's an option for you. It's done in Java, Deeplearning4j. I am experimenting with LSTM. I tried 1 hidden layer and 2 hidden layers to process time series.

return new NeuralNetConfiguration.Builder()
                .seed(HyperParameter.seed)
                .iterations(HyperParameter.nItr)
                .miniBatch(false)
                .learningRate(HyperParameter.learningRate)
                .biasInit(0)
                .weightInit(WeightInit.XAVIER)
                .momentum(HyperParameter.momentum)
                .optimizationAlgo(
                        OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT  // RMSE: ????
                )
                .regularization(true)
                .updater(Updater.RMSPROP) // NESTEROVS
                // .l2(0.001)
                .list()
                .layer(0,
                        new GravesLSTM.Builder().nIn(HyperParameter.numInputs).nOut(HyperParameter.nHNodes_1).activation("tanh").build())
                .layer(1,
                        new GravesLSTM.Builder().nIn(HyperParameter.nHNodes_1).nOut(HyperParameter.nHNodes_2).dropOut(HyperParameter.dropOut).activation("tanh").build())
                .layer(2,
                        new GravesLSTM.Builder().nIn(HyperParameter.nHNodes_2).nOut(HyperParameter.nHNodes_2).dropOut(HyperParameter.dropOut).activation("tanh").build())
                .layer(3, // "identity" make regression output
                        new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE).nIn(HyperParameter.nHNodes_2).nOut(HyperParameter.numOutputs).activation("identity").build()) // "identity"
                .backpropType(BackpropType.TruncatedBPTT)
                .tBPTTBackwardLength(100)
                .pretrain(false)
                .backprop(true)
                .build();

Found a few things:

LSTM or RNN is very good at picking out patterns in time-series.
Tried on one time-series, and a group different time-series. Pattern were picked out easily.
It is also trying to pick out patterns not for just one cadence. If there are patterns by week, and by month, both will be learned by the net.

0人赞添加讨论(0) 举报

放我归山

4楼-- · 2019-01-12 15:13

Weka is a powerful collection of machine-learning software, and supports some time-series analysis tools, but I do not know enough about the field to recommend a best method. However, it is Java-based; and you can call Java code from C/C++ without great fuss.

Packages for time-series manipulation are mostly directed at the stock-market. I suggested Cronos in the comments; I have no idea how to do pattern recognition with it, beyond the obvious: any good model of a length of your series should be able to predict that, after small bumps at a certain distance to the last small bump, big bumps follow. That is, your series exhibits self-similarity, and the models used in Cronos are designed to model it.

If you don't mind C#, you should request a version of TimeSearcher2 from the folks at HCIL - pattern recognition is, for this system, drawing what a pattern looks like, and then checking whether your model is general enough to capture most instances with a low false-positive rate. Probably the most user-friendly approach you will find; all others require quite a background in statistics or pattern recognition strategies.

0人赞添加讨论(0) 举报

SAY GOODBYE

5楼-- · 2019-01-12 15:19

Why not using a simple matched filter? Or its general statistical counterpart called cross correlation. Given a known pattern x(t) and a noisy compound time series containing your pattern shifted in a,b,...,z like y(t) = x(t-a) + x(t-b) +...+ x(t-z) + n(t). The cross correlation function between x and y should give peaks in a,b, ...,z

0人赞添加讨论(0) 举报

倾城　Initia

6楼-- · 2019-01-12 15:21

Here is a sample result from a small project I did to partition ecg data.

enter image description here

My approach was a "switching autoregressive HMM" (google this if you haven't heard of it) where each datapoint is predicted from the previous datapoint using a Bayesian regression model. I created 81 hidden states: a junk state to capture data between each beat, and 80 separate hidden states corresponding to different positions within the heartbeat pattern. The pattern 80 states were constructed directly from a subsampled single beat pattern and had two transitions - a self transition and a transition to the next state in the pattern. The final state in the pattern transitioned to either itself or the junk state.

I trained the model with Viterbi training, updating only the regression parameters.

Results were adequate in most cases. A similarly structure Conditional Random Field would probably perform better, but training a CRF would require manually labeling patterns in the dataset if you don't already have labelled data.

Edit:

Here's some example python code - it is not perfect, but it gives the general approach. It implements EM rather than Viterbi training, which may be slightly more stable. The ecg dataset is from http://www.cs.ucr.edu/~eamonn/discords/ECG_data.zip

import numpy as np
import numpy.random as rnd
import matplotlib.pyplot as plt 
import scipy.linalg as lin
import re

data=np.array(map(lambda l: map(float,filter(lambda x: len(x)>0,re.split('\\s+',l))),open('chfdb_chf01_275.txt'))).T
dK=230
pattern=data[1,:dK]
data=data[1,dK:]

def create_mats(dat):
    '''
    create 
        A - an initial transition matrix 
        pA - pseudocounts for A
        w - emission distribution regression weights
        K - number of hidden states
    '''
    step=5 #adjust this to change the granularity of the pattern
    eps=.1
    dat=dat[::step]
    K=len(dat)+1
    A=np.zeros( (K,K) )
    A[0,1]=1.
    pA=np.zeros( (K,K) )
    pA[0,1]=1.
    for i in xrange(1,K-1):
        A[i,i]=(step-1.+eps)/(step+2*eps)
        A[i,i+1]=(1.+eps)/(step+2*eps)
        pA[i,i]=1.
        pA[i,i+1]=1.
    A[-1,-1]=(step-1.+eps)/(step+2*eps)
    A[-1,1]=(1.+eps)/(step+2*eps)
    pA[-1,-1]=1.
    pA[-1,1]=1.

    w=np.ones( (K,2) , dtype=np.float)
    w[0,1]=dat[0]
    w[1:-1,1]=(dat[:-1]-dat[1:])/step
    w[-1,1]=(dat[0]-dat[-1])/step

    return A,pA,w,K

#initialize stuff
A,pA,w,K=create_mats(pattern)

eta=10. #precision parameter for the autoregressive portion of the model 
lam=.1 #precision parameter for the weights prior 

N=1 #number of sequences
M=2 #number of dimensions - the second variable is for the bias term
T=len(data) #length of sequences

x=np.ones( (T+1,M) ) # sequence data (just one sequence)
x[0,1]=1
x[1:,0]=data

#emissions
e=np.zeros( (T,K) )
#residuals
v=np.zeros( (T,K) )

#store the forward and backward recurrences
f=np.zeros( (T+1,K) )
fls=np.zeros( (T+1) )
f[0,0]=1
b=np.zeros( (T+1,K) )
bls=np.zeros( (T+1) )
b[-1,1:]=1./(K-1)

#hidden states
z=np.zeros( (T+1),dtype=np.int )

#expected hidden states
ex_k=np.zeros( (T,K) )

# expected pairs of hidden states
ex_kk=np.zeros( (K,K) )
nkk=np.zeros( (K,K) )

def fwd(xn):
    global f,e
    for t in xrange(T):
        f[t+1,:]=np.dot(f[t,:],A)*e[t,:]
        sm=np.sum(f[t+1,:])
        fls[t+1]=fls[t]+np.log(sm)
        f[t+1,:]/=sm
        assert f[t+1,0]==0

def bck(xn):
    global b,e
    for t in xrange(T-1,-1,-1):
        b[t,:]=np.dot(A,b[t+1,:]*e[t,:])
        sm=np.sum(b[t,:])
        bls[t]=bls[t+1]+np.log(sm)
        b[t,:]/=sm

def em_step(xn):
    global A,w,eta
    global f,b,e,v
    global ex_k,ex_kk,nkk

    x=xn[:-1] #current data vectors
    y=xn[1:,:1] #next data vectors predicted from current
    #compute residuals
    v=np.dot(x,w.T) # (N,K) <- (N,1) (N,K)
    v-=y
    e=np.exp(-eta/2*v**2,e)

    fwd(xn)
    bck(xn)

    # compute expected hidden states
    for t in xrange(len(e)):
        ex_k[t,:]=f[t+1,:]*b[t+1,:]
        ex_k[t,:]/=np.sum(ex_k[t,:])

    # compute expected pairs of hidden states    
    for t in xrange(len(f)-1):
        ex_kk=A*f[t,:][:,np.newaxis]*e[t,:]*b[t+1,:]
        ex_kk/=np.sum(ex_kk)
        nkk+=ex_kk

    # max w/ respect to transition probabilities
    A=pA+nkk
    A/=np.sum(A,1)[:,np.newaxis]

    # solve the weighted regression problem for emissions weights
    #  x and y are from above 
    for k in xrange(K):
        ex=ex_k[:,k][:,np.newaxis]
        dx=np.dot(x.T,ex*x)
        dy=np.dot(x.T,ex*y)
        dy.shape=(2)
        w[k,:]=lin.solve(dx+lam*np.eye(x.shape[1]), dy)

    #return the probability of the sequence (computed by the forward algorithm)
    return fls[-1]

if __name__=='__main__':
    #run the em algorithm
    for i in xrange(20):
        print em_step(x)

    #get rough boundaries by taking the maximum expected hidden state for each position
    r=np.arange(len(ex_k))[np.argmax(ex_k,1)<3]

    #plot
    plt.plot(range(T),x[1:,0])

    yr=[np.min(x[:,0]),np.max(x[:,0])]
    for i in r:
        plt.plot([i,i],yr,'-r')

    plt.show()

0人赞添加讨论(0) 举报

Pattern recognition in time series [closed]

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间