Bubble plot or Heatmap in matplotlib

Posted 2019-05-22 02:53

Question:

I am trying to draw a bubble plot (a scatter plot whose marker sizes vary with the data). With random data the plot works fine, but when I parse my input file and plot from it, the plot fails.

Input:

Nos,Place,Way,Name,00:00:00,12:00:00
123,London,Air,Apollo,342,972
123,London,Rail,Beta,2352,342
123,Paris,Bus,Beta,545,353
345,Paris,Bus,Rava,652,974
345,Rome,Bus,Rava,2325,56
345,London,Air,Rava,2532,9853
567,Paris,Air,Apollo,545,544
567,Rome,Rail,Apollo,5454,5
876,Japan,Rail,Apollo,644,54
876,Japan,Bus,Beta,45,57

Program:

import pandas as pd
from pandas import DataFrame
import pandas.io.data
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns


df=pd.read_csv('text_2.csv')


#SIZE OF BUBBLES CHANGES

fig = plt.figure()

ax = fig.add_subplot(1,1,1)

ax.scatter(df['Place'],df['Name'], s=df['00:00:00']) # Added third variable income as size of the bubble


plt.show()

I am trying to put Place on the x axis and Name on the y axis, with the bubble size taken from the count in the 00:00:00 column. I could not find many examples of variable-size bubble plots. Any suggestions are appreciated. Thanks in advance. Why do I get an error for the 00:00:00 column, and how do I pass the values of that column?

Error:

Traceback (most recent call last):
  File "Bubble_plot.py", line 18, in <module>
    ax.scatter(df['Place'],df['Name'], s=df['00:00:00']) # Added third variable income as size of the bubble
  File "/usr/lib/pymodules/python2.7/matplotlib/axes.py", line 6266, in scatter
    x, y, s, c = cbook.delete_masked_points(x, y, s, c)
  File "/usr/lib/pymodules/python2.7/matplotlib/cbook.py", line 1774, in delete_masked_points
    raise ValueError("First argument must be a sequence")
ValueError: First argument must be a sequence

Answer 1:

I was hoping this would work by just changing 'Name' and 'Place' to categoricals, but no luck there (with either plot or seaborn). It basically works if you convert them to integer codes, but then you lose the labels that you'd have with strings or categoricals. FWIW:

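# Work on a copy so the original string columns stay intact
df2 = df.copy()
for c in ['Place','Name']:
    # Replace each string with its integer category code (0, 1, 2, ...)
    df2[c] = df2[c].astype('category').cat.codes

fig = plt.figure()
ax = fig.add_subplot(1,1,1)
# x and y are numeric now, so scatter accepts them; bubble size comes from the 00:00:00 column
ax.scatter(df2['Place'],df2['Name'], s=df2['00:00:00'])
plt.show()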
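
The integer-coded scatter loses the Place and Name labels, but they can probably be put back by hand. Here is a minimal sketch of that idea, assuming the sample rows from the question (inlined here instead of reading text_2.csv). SIZE_SCALE and the output filename bubble_plot.png are made-up values for illustration, not anything from the original post. The category codes serve as tick positions and the category names as tick labels.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() for a window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question, inlined so the sketch is self-contained.
CSV = """Nos,Place,Way,Name,00:00:00,12:00:00
123,London,Air,Apollo,342,972
123,London,Rail,Beta,2352,342
123,Paris,Bus,Beta,545,353
345,Paris,Bus,Rava,652,974
345,Rome,Bus,Rava,2325,56
345,London,Air,Rava,2532,9853
567,Paris,Air,Apollo,545,544
567,Rome,Rail,Apollo,5454,5
876,Japan,Rail,Apollo,644,54
876,Japan,Bus,Beta,45,57
"""
df = pd.read_csv(io.StringIO(CSV))

# Keep the categoricals around: .cat.codes gives positions, .cat.categories gives labels.
place = df["Place"].astype("category")
name = df["Name"].astype("category")

# Hypothetical scale factor so the largest bubbles do not swamp the axes.
SIZE_SCALE = 0.2

fig, ax = plt.subplots()
ax.scatter(place.cat.codes, name.cat.codes,
           s=df["00:00:00"] * SIZE_SCALE, alpha=0.5)

# Put the original strings back as tick labels at the integer code positions.
ax.set_xticks(range(len(place.cat.categories)))
ax.set_xticklabels(place.cat.categories)
ax.set_yticks(range(len(name.cat.categories)))
ax.set_yticklabels(name.cat.categories)
ax.set_xlabel("Place")
ax.set_ylabel("Name")

fig.savefig("bubble_plot.png")  # hypothetical output file
```

The size column was never the problem here. It is read as integers, while Place and Name are read as strings, and the scatter call in the traceback is what rejects them.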

Or maybe a heatmap would work better? It seems to accept categoricals, so you get the labeling for free.

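df3 = df.copy()
for c in ['Place','Name']:
    df3[c] = df3[c].astype('category')

# Pivot to a Place x Name grid of 00:00:00 values (mean per cell by default), then draw it
sns.heatmap( df3.pivot_table( index='Place', columns='Name', values='00:00:00' ) )
plt.show()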
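
To see what the heatmap is actually drawing, it may help to look at the pivoted table on its own. The sketch below is one way to do that, assuming the sample rows from the question (inlined again). It swaps in plain matplotlib `imshow` for `sns.heatmap` so it runs without seaborn, and it uses `aggfunc="sum"` instead of the default mean, which is my assumption about what a "count" column should do if a Place/Name pair ever repeats. The filename heatmap.png is hypothetical.

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() for a window
import matplotlib.pyplot as plt
import pandas as pd

# Sample rows from the question, inlined so the sketch is self-contained.
CSV = """Nos,Place,Way,Name,00:00:00,12:00:00
123,London,Air,Apollo,342,972
123,London,Rail,Beta,2352,342
123,Paris,Bus,Beta,545,353
345,Paris,Bus,Rava,652,974
345,Rome,Bus,Rava,2325,56
345,London,Air,Rava,2532,9853
567,Paris,Air,Apollo,545,544
567,Rome,Rail,Apollo,5454,5
876,Japan,Rail,Apollo,644,54
876,Japan,Bus,Beta,45,57
"""
df = pd.read_csv(io.StringIO(CSV))

# Rows are places, columns are names, each cell is the summed 00:00:00 value.
# Place/Name pairs that never occur in the data come out as NaN.
table = df.pivot_table(index="Place", columns="Name",
                       values="00:00:00", aggfunc="sum")
print(table)

fig, ax = plt.subplots()
image = ax.imshow(table.to_numpy(dtype=float), cmap="viridis")  # NaN cells stay blank

# Label the grid with the pivot table's own index and column names.
ax.set_xticks(range(len(table.columns)))
ax.set_xticklabels(table.columns)
ax.set_yticks(range(len(table.index)))
ax.set_yticklabels(table.index)
fig.colorbar(image, ax=ax, label="00:00:00")

fig.savefig("heatmap.png")  # hypothetical output file
```

With this sample, no Place/Name pair repeats, so sum and mean give the same cells. The two pairs missing from the input (Japan/Rava and Rome/Beta) show up as empty cells.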