Getting dense region boundary values from output o

Posted 2019-07-31 22:27

Question:

This question extends, and tries an alternate approach to, an earlier question (Extracting boundaries of dense regions of 1s in a huge list of 1s and 0s).

I have a huge list of 1s and 0s [Total length = 53820].

An example of what the list looks like: [0,1,1,1,1,1,1,1,1,0,0,0,1,1,0,0,0,0,0,0,1,1...........]

The visualization is given below.

x-axis: index of the element (from 0 to 53819)

y-axis: value at that index (i.e. 1 or 0)

The black lines around the peaks (where the value is 1) mark the dense regions whose boundary values (start and end values on the x-axis) I want to extract.

Input Plot--> (http://i67.tinypic.com/2h5jq5e.png)

I ran a kernel density estimate on the indexes (x-axis) and the list of 1s and 0s (y-axis), and obtained a plot like this:

KDE Plot--> (https://i.stack.imgur.com/psvp2.png)

I am interested in extracting the x-axis values (index numbers) where y = 1, i.e. I would like to extract the boundaries (start and end values) of the regions where the 1s are most dense (indicated by the black lines on the plot).

The code for this is as follows:

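    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    path = './' + csv_path + '.csv'  # csv_path is set elsewhere (file name without extension)
    df = pd.read_csv(path)
    mylist = df['Values']  # The huge list of 1s and 0s
    indexes = list(range(len(mylist)))  # The corresponding list of indexes of mylist
    # Joint KDE of index vs. value; jointplot returns a JointGrid, not an Axes
    ax = sns.jointplot(x=indexes, y=mylist, kind="kde")
    plt.show()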

I printed out the object that sns.jointplot returns for plotting, which gives me <seaborn.axisgrid.JointGrid object at 0x7f09d37d5cf8>

But I am unable to extract the boundary values from that.

I also looked at (https://matplotlib.org/api/collections_api.html) but could not figure out how to use it for my problem here.
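
To make the desired output concrete, here is a minimal sketch that works on the raw list instead of the KDE plot. It is not the KDE approach above: it swaps in a simple sliding-window density with a cutoff. The function name `dense_regions` and the `window` and `threshold` parameters are hypothetical choices for illustration, and sensible values would have to be tuned for the real data.

```python
from itertools import accumulate


def dense_regions(values, window=5, threshold=0.5):
    """Return inclusive (start, end) index pairs where the share of 1s
    in a window centered on each index is at least `threshold`.

    `window` and `threshold` are illustrative tuning knobs, not values
    taken from the original data.
    """
    n = len(values)
    # prefix[i] holds sum(values[:i]), so any window sum is one subtraction.
    prefix = [0, *accumulate(values)]
    half = window // 2

    regions = []
    start = None  # start index of the region currently being tracked
    for i in range(n):
        lo = max(0, i - half)          # window is clipped at the list edges
        hi = min(n, i + half + 1)
        density = (prefix[hi] - prefix[lo]) / (hi - lo)
        if density >= threshold:
            if start is None:
                start = i              # a new dense region begins here
        elif start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:              # region runs to the end of the list
        regions.append((start, n - 1))
    return regions


# The first 22 values from the example list in the question.
sample = [0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1]
print(dense_regions(sample, window=5, threshold=0.5))  # [(0, 8), (20, 21)]
```

With a small window the short run at indexes 12 and 13 is dropped because the zeros around it pull the local density under the cutoff. On the full list of 53820 values, a much larger `window` would presumably be needed to pick out only the broad dense regions marked in the plot.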

Any help would be appreciated :)