This graph is generated by the following gnuplot script. The estimated.csv file can be found at this link: https://drive.google.com/open?id=0B2Iv8dfU4fTUaGRWMm9jWnBUbzg
# ###### GNU Plot
set style data lines
set terminal postscript eps enhanced color "Times" 20
set output "cubic33_cwd_estimated.eps"
set title "Estimated signal"
set style line 99 linetype 1 linecolor rgb "#999999" lw 2
#set border 1 back ls 11
set key right top
set key box linestyle 50
set key width -2
set xrange [0:10]
set key spacing 1.2
#set nokey
set grid xtics ytics mytics
#set size 2
#set size ratio 0.4
#show timestamp
set xlabel "Time [Seconds]"
set ylabel "Segments"
set style line 1 lc rgb "#ff0000" lt 1 pi 0 pt 4 lw 4 ps 0
# Congestion control send window
plot "estimated.csv" using ($1):2 with lines title "Estimated";
I want to extract the pattern of the estimated signal in the previous plot so that it comes close to the following plot, which shows my ground truth (the actual signal).
Here is my initial approach:
#!/usr/bin/env python
import sys
import numpy as np
from shapely.geometry import LineString
#-------------------------------------------------------------------------------
def load_data(fname):
    # Each CSV row is one (x, y) vertex of the curve.
    return LineString(np.genfromtxt(fname, delimiter = ','))
#-------------------------------------------------------------------------------
lines = list(map(load_data, sys.argv[1:]))
crossings = lines[0].intersection(lines[1])
# Multi-part results expose their members via .geoms (direct iteration
# fails in Shapely 2.x); a single geometry is wrapped in a list.
for g in getattr(crossings, 'geoms', [crossings]):
    if g.geom_type != 'Point':
        continue
    print('%f,%f' % (g.x, g.y))
Then I invoke this Python script (saved as filter.py) directly from gnuplot, as follows:
set terminal pngcairo
set output 'fig.png'
set datafile separator comma
set yr [0:700]
set xr [0:10]
set xtics 0,2,10
set ytics 0,100,700
set grid
set xlabel "Time [seconds]"
set ylabel "Segments"
plot \
'estimated.csv' w l lc rgb 'dark-blue' t 'Estimated', \
'actual.csv' w l lc rgb 'green' t 'Actual', \
'<python filter.py estimated.csv actual.csv' w p lc rgb 'red' ps 0.5 pt 7 t ''
This gives the following plot, but it does not show the right pattern, and gnuplot is probably not the best tool for this kind of task.
Is there a way to find the pattern of the first graph (estimated.csv) by forming the peaks into a plot using Python? Looking at the end of the trace, the pattern actually seems to be visible. Any help would be appreciated.
I think a rolling maximum is the right approach here. The original pandas.rolling_max() function has been deprecated and removed from pandas; the current spelling is Series.rolling(window).max(). We load the data into a DataFrame and calculate the rolling maximum over 8500 values. Afterwards the curves look similar. You may want to experiment with the window parameter a little to optimize the result.
To answer the question from the comments:
Since Series.rolling() only produces a value once a full window of data is available, the first window - 1 values of rolling().max() will be NaN. To replace these NaNs, I suggest reversing the whole Series and calculating the windows backwards. Afterwards, we can replace all the NaNs with the values from the backwards calculation. I adjusted the window length for the backwards calculation; otherwise we get erroneous data.
This code works:
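A minimal sketch of the rolling-maximum idea described above, not necessarily the exact code from this answer. It assumes estimated.csv has two comma-separated columns without a header; the column names time and segments, the helper names rolling_envelope and envelope_from_csv, and the backward window of 1000 are all hypothetical choices for illustration. Only the forward window of 8500 comes from the text.

```python
import pandas as pd


def rolling_envelope(values, window, back_window=None):
    """Upper envelope of a signal using a rolling maximum.

    The trailing rolling max leaves the first window - 1 entries as NaN.
    Those are filled from a rolling max computed over the reversed series,
    which is equivalent to a forward-looking window.
    """
    s = pd.Series(values, dtype=float).reset_index(drop=True)
    if back_window is None:
        back_window = window  # assumption: reuse the forward window size

    forward = s.rolling(window).max()

    # Reverse, take the rolling max, then reverse back to the original order.
    backward = s.iloc[::-1].rolling(back_window).max().iloc[::-1]

    # fillna aligns on the index, so each NaN is replaced by the value
    # computed for the same sample in the backward pass.
    return forward.fillna(backward)


def envelope_from_csv(fname, window=8500, back_window=1000):
    """Hypothetical helper: load a two-column CSV and add an envelope column."""
    df = pd.read_csv(fname, header=None, names=["time", "segments"])
    df["envelope"] = rolling_envelope(df["segments"], window, back_window)
    return df
```

Calling envelope_from_csv("estimated.csv") and plotting the envelope column against time would give the smoothed curve. The backward window is kept smaller than the forward one here so that the filled-in head of the series is not dominated by peaks far to its right; the best value depends on the data.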
And we get the following result: