I am trying to plot a cumulative histogram using Plotly in Python, but make it look like "steps", i.e. bars with no fill color where only the top line is displayed. Something like this:
Basically, I'm trying to reproduce the behavior of the following matplotlib code:
import matplotlib.pyplot as plt
plt.hist(x, cumulative=True, histtype='step')
So far, the best I've been able to do is:
import plotly.graph_objs as go
from plotly.offline import iplot
h = go.Histogram(x=x,
                 cumulative=dict(enabled=True),
                 marker=dict(color="rgba(0,0,0,0)",
                             line=dict(color="red", width=1)))
iplot([h])
Which results in something like:
So what's the trick?
If you're willing to handle the binning and accumulation before you plot the data, you can use a go.Scatter object with the shape property of the line set to 'hvh'.
Plot:
Code: Setup for a Jupyter Notebook
# Imports
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
import numpy as np

# qtconsole for debugging
#%qtconsole -- style vim

# Notebook settings
init_notebook_mode(connected=True)

# Some sample data
x = np.random.normal(50, 5, 500)

# Bin the data; np.histogram returns (densities, bin_edges)
density, edges = np.histogram(x, bins=25, density=True)
# Density times bin width is the probability per bin, so the cumulative sum ends at 1
plot_y = np.cumsum(density * np.diff(edges))

# Line
trace1 = go.Scatter(
    x=edges[:-1],  # edges has one more entry than plot_y, so drop the last edge
    y=plot_y,
    mode='lines',
    name="X",
    hoverinfo='all',
    line=dict(color='rgb(255, 0, 0)', shape='hvh')
)
data = [trace1]

# Layout
layout = dict(
    title='Binned data from normal distribution',
    legend=dict(
        y=0.5,
        traceorder='reversed',
        font=dict(size=16)
    )
)

# Make figure
fig = dict(data=data, layout=layout)

# Plot
iplot(fig, filename='line-shapes')
I hope this is something you can use!
Don't hesitate to let me know if not.
Some details:
The data sample is made using np.random.normal(). x is a sampled normal distribution with mean = 50, sigma = 5 and 500 observations. x is then put in 25 bins using np.histogram(), which returns two arrays: the per-bin values and the bin edges. These are used as the data source for the plot.
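To make the shapes of those two arrays concrete, here is a minimal sketch, assuming only NumPy is available. The helper name cumulative_step_trace is hypothetical, and it swaps in shape='hv' with the bin edges on the x-axis (instead of 'hvh') so that each step spans exactly one bin, which also holds when bins are unequal. The trace is built as a plain dict, in the same way the figure above is a plain dict.

```python
import numpy as np

def cumulative_step_trace(x, bins=25, name="X"):
    """Hypothetical helper: build a scatter trace (as a plain dict) that
    draws a cumulative histogram of x as a step line."""
    counts, edges = np.histogram(x, bins=bins)  # len(edges) == len(counts) + 1
    cum = np.cumsum(counts)                     # running total per bin
    # Repeat the final value so the last bin also gets a horizontal segment.
    y = np.append(cum, cum[-1])
    return dict(
        type="scatter",
        x=edges.tolist(),
        y=y.tolist(),
        mode="lines",
        name=name,
        line=dict(color="rgb(255, 0, 0)", shape="hv"),
    )

rng = np.random.default_rng(0)
sample = rng.normal(50, 5, 500)
trace = cumulative_step_trace(sample, bins=25)
print(len(trace["x"]), len(trace["y"]), trace["y"][-1])  # 26 26 500
```

The resulting dict should be usable as iplot(dict(data=[trace])), mirroring how fig is assembled above.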
Possible alternative approaches:
I also tried using your snippet with some random sample data and including shape='hvh' in your line=dict(color="red", width=1). That did not seem to work, though. I also considered modifying the layout of your go.Histogram() so that only the top line of the bars was plotted, but I don't think it's possible.
The accepted solution works but may be limiting as the bins are all of equal width. One approach is to use matplotlib to calculate stats, then plot with plotly:
import numpy as np
import matplotlib.pyplot as plt
import plotly.graph_objs as go
from plotly.offline import iplot

# sample data
# I am not using a normal distribution on purpose so that the effect of varying bin widths is apparent.
x = np.random.rand(100)

# use matplotlib to get "n" and "bins"
# n_bins will affect the resolution of the cumulative histogram but not dictate the bin widths.
n_bins = 100
n, bins, patches = plt.hist(x, n_bins, density=True, histtype='step', cumulative=-1)

# use plotly (v3) to plot
# bins holds n_bins + 1 edges, so drop the last edge to match the length of n
trace = go.Scatter(
    x=bins[:-1],
    y=n,
    mode='lines',
    name="test",
    line=dict(shape='hvh')
)
fig = go.Figure(data=[trace])
iplot(fig)
The result should look something like this:
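As a side note, if drawing a matplotlib figure just to get the numbers is unwanted, the same kind of reversed, normalized cumulative values can be sketched with NumPy alone. This is an illustration under stated assumptions rather than the method used above: reverse_cumulative is a hypothetical name, and the explicit, unequal bin edges are made up to show that variable widths are handled.

```python
import numpy as np

def reverse_cumulative(x, edges):
    """Hypothetical helper: for each bin, the fraction of binned samples
    that fall in that bin or any later one."""
    counts, edges = np.histogram(x, bins=edges)
    # Reverse, accumulate, reverse back: entry i sums bins i..end.
    rev = np.cumsum(counts[::-1])[::-1]
    return rev / counts.sum(), edges

rng = np.random.default_rng(1)
x = rng.random(100)
edges = np.array([0.0, 0.1, 0.2, 0.4, 0.7, 1.0])  # illustrative unequal bins
frac, edges = reverse_cumulative(x, edges)
print(frac[0])  # 1.0, since every sample lies in the first bin or later
```

For plotting, one option is to pass x=edges and y=np.append(frac, frac[-1]) to go.Scatter with line=dict(shape='hv'), so that each step covers its own bin.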