How to smooth a curve with large noise which is on

Posted 2019-06-26 16:32

I'd like to smooth a scatter plot shown below (the points are very dense), and the data is here.

[Figure: scatter plot of the noisy curve]

There is large noise in the middle of the curve. I'd like to smooth the curve while keeping the y values monotonically increasing.

Since there are lots of curves like this, it is hard to know in advance where the noise sits in each curve.

I tried scipy.signal.savgol_filter, but it didn't work.

The code I used is:

from scipy.signal import savgol_filter
from scipy import interpolate
import numpy as np
import matplotlib.pyplot as plt

s = np.loadtxt('data.csv', delimiter=',')
x = s[:, 0]
y = s[:, 1]
yhat = savgol_filter(y, 551, 3)

plt.plot(x, y, 'r')
plt.plot(x, yhat, 'b')
plt.show()

Suggestions are really appreciated. Thanks!

-------------------update-------------------------

Following Colin's method, I get the results I want. Here is the code:

from scipy.signal import savgol_filter, medfilt
from scipy import interpolate
import numpy as np
import matplotlib.pyplot as plt

s = np.loadtxt('data.csv', delimiter=',')
x = s[:, 0]
y = s[:, 1]
yhat = savgol_filter(y, 551, 3)

tolerance = 0.2
increased_span = 150
filter_size = 11

# find the noise: points that differ strongly from a median-filtered copy
first_pass = medfilt(y, filter_size)
diff = (y - first_pass)**2
# first and last flagged index, widened by increased_span on each side
first = np.argmax(diff > tolerance) - increased_span
last = len(y) - np.argmax(diff[::-1] > tolerance) + increased_span
print(first, last)
# interpolate linearly across the widened span
yhat[first:last] = np.interp(x[first:last], [x[first], x[last]], [y[first], y[last]])

# resample on a uniform grid and smooth once more
f = interpolate.interp1d(x, yhat, kind='slinear')
x_inter = np.linspace(x[0], x[-1], 1000)
y_inter = f(x_inter)
y_inter = savgol_filter(y_inter, 41, 3)

plt.plot(x, y, 'r')
plt.plot(x, yhat, 'b')
# also show the resampled, re-smoothed curve
plt.plot(x_inter, y_inter, 'g')
plt.show()
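
The question also asks for y to increase monotonically, which none of the steps above checks. Here is a minimal sketch of one way to test and enforce that on a smoothed array, using a running maximum. This is not part of Colin's method; the helper name enforce_non_decreasing and the sample values are made up for illustration, and the result is non-decreasing (flat steps are possible), not strictly increasing.

```python
import numpy as np


def enforce_non_decreasing(values):
    """Return a copy in which every sample is at least as large as all earlier ones.

    Hypothetical helper: a running maximum, so dips are flattened rather than smoothed.
    """
    return np.maximum.accumulate(np.asarray(values, dtype=float))


# Made-up smoothed values with two small dips (at index 2 and index 4).
y_smooth = np.array([0.0, 0.5, 0.4, 0.9, 0.8, 1.2])

y_mono = enforce_non_decreasing(y_smooth)

# True if no sample is smaller than its predecessor.
is_monotonic = bool(np.all(np.diff(y_mono) >= 0))
print(y_mono, is_monotonic)
```

Applied to the code above, this would be called on yhat or y_inter after the final filtering step. A running maximum produces flat plateaus where the curve dipped, so it works best as a last cleanup on an already smooth curve.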

1 answer
做个烂人
#2 · 2019-06-26 17:12

If we first isolate the trouble area, there are many ways to remove it. Here is an example:

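from scipy.signal import medfilt

# x, y and yhat are the arrays from the question's code
tolerance = 0.2
increased_span = 150
filter_size = 11

# find noise: compare against a median-filtered copy of the raw data
first_pass = medfilt(y, filter_size)
diff = (yhat - first_pass)**2
# first and last flagged index, widened by increased_span on each side
first = np.argmax(diff > tolerance) - increased_span
last = len(y) - np.argmax(diff[::-1] > tolerance) + increased_span

# interpolate linearly across the widened span
yhat[first:last] = np.interp(x[first:last], [x[first], x[last]], [y[first], y[last]])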
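
The snippet above relies on the question's data.csv, so it cannot be run on its own. Here is a self-contained sketch of the same idea on synthetic data. The function name bridge_noisy_span, the synthetic line, and the noise placement are assumptions made for illustration. Unlike the snippet, this version compares the raw y (not yhat) against the median filter, clamps the span to valid indices, and returns early when nothing exceeds the tolerance.

```python
import numpy as np
from scipy.signal import medfilt


def bridge_noisy_span(x, y, tolerance=0.2, increased_span=150, filter_size=11):
    """Replace the span flagged as noisy with a straight line between its ends.

    Hypothetical helper; returns the repaired copy and the (first, last) span,
    or (copy, None) when no sample exceeds the tolerance.
    """
    first_pass = medfilt(y, filter_size)
    noisy = (y - first_pass) ** 2 > tolerance
    out = y.copy()
    if not noisy.any():
        return out, None
    # First and last flagged index, widened and clamped to valid positions.
    first = max(int(np.argmax(noisy)) - increased_span, 0)
    last = min(len(y) - int(np.argmax(noisy[::-1])) + increased_span, len(y) - 1)
    # Linear interpolation between the two end points of the widened span.
    out[first:last] = np.interp(x[first:last], [x[first], x[last]], [y[first], y[last]])
    return out, (first, last)


# Synthetic stand-in for the question's data: a straight line with a noisy middle.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 2000)
clean = 2.0 * x
y = clean.copy()
y[900:1100] += rng.normal(0.0, 3.0, size=200)

bridged, span = bridge_noisy_span(x, y)
print(span, float(np.max(np.abs(bridged - clean))))
```

Because the synthetic curve is a straight line, bridging the span recovers it almost exactly. On a real curve the bridged section is only a linear approximation, which is why the asker's update smooths the result again afterwards.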

