I am trying to plot a seaborn swarmplot where col[2] is the freq and col[3] holds the classes to group by. The input and the code are given below.
Input:
tweetcricscore,51,high active
tweetcricscore,46,event based
tweetcricscore,12,event based
tweetcricscore,46,event based
tweetcricscore,1,viewers
tweetcricscore,178,viewers
tweetcricscore,46,situational
tweetcricscore,23,situational
tweetcricscore,1,situational
tweetcricscore,8,situational
tweetcricscore,56,situational
Code:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid", color_codes=True)
df = pd.read_csv('input.csv', header = None)
df.columns = ['keyword','freq','class']
ax = sns.swarmplot(x="class", y="freq", data=df)
plt.show()
The code doesn't plot anything, nor does it give any error. Any suggestions to optimize the code?
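Since the input above is comma-separated and has no header row, one thing worth checking before plotting is whether the three columns load as intended. Below is a minimal sketch, not a definitive fix: the helper name load_tweets is hypothetical, the sample rows are copied from the input above, and stripping whitespace from the class labels is an assumption about what might be wrong with the real file.

```python
import io

import pandas as pd

# Rows copied from the input above; in practice this would be 'input.csv'.
SAMPLE = """tweetcricscore,51,high active
tweetcricscore,46,event based
tweetcricscore,12,event based
tweetcricscore,46,event based
tweetcricscore,1,viewers
tweetcricscore,178,viewers
tweetcricscore,46,situational
tweetcricscore,23,situational
tweetcricscore,1,situational
tweetcricscore,8,situational
tweetcricscore,56,situational
"""


def load_tweets(source):
    """Hypothetical helper: read the header-less CSV into three named columns."""
    df = pd.read_csv(
        source,
        header=None,
        names=["keyword", "freq", "class"],
        skipinitialspace=True,  # drop spaces that directly follow a comma
    )
    # Assumption: stray whitespace around labels would split one class into several.
    df["class"] = df["class"].str.strip()
    # Make sure freq is numeric; unparseable values become NaN.
    df["freq"] = pd.to_numeric(df["freq"], errors="coerce")
    return df


df = load_tweets(io.StringIO(SAMPLE))
print(df.dtypes)
print(df["class"].value_counts())
```

If the dtypes and class counts look right, the same frame can be passed to sns.swarmplot(x="class", y="freq", data=df) as in the code above.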
I think you first need read_csv, then create the new column class by concatenating with fillna, and finally strip the whitespace:
Solution if the column class does not contain whitespace:
EDIT2:
If the separator is , use:
After several trials in plotting
swarmplot with a dataset of more than 8-10k rows, and with constant help and suggestions from jezreal, we came to the conclusion that seaborn's categorical swarmplot cannot scale to large data like the other seaborn plots, which is also mentioned in the tutorial documentation. Hence I changed the plotting style to a bokeh scatter plot, where I use the numeric values on the y axis and the grouped category names on the x axis, and this kind of solved my problem of plotting univariate data with a category.
This allows grouping by the class column, allocating colors and markers according to the groups. The freq is plotted along the y axis.
Note: This might have accidentally worked because the data is discrete in nature.
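The idea described above (category names on the x axis, freq on the y axis, one color and marker per class) can be sketched as follows. Note that this is not the bokeh code from the answer, which is not shown here. It is a stand-in using matplotlib, which the question already imports. The function name grouped_scatter, the marker list, and the small horizontal jitter are all assumptions added for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumed marker set; cycled if there are more classes than markers.
MARKERS = ["o", "s", "^", "D", "v", "P", "X", "*"]


def grouped_scatter(df, ax=None, jitter=0.15, seed=0):
    """Hypothetical helper: scatter freq (y) against class (x), styled per class."""
    if ax is None:
        _, ax = plt.subplots()
    rng = np.random.default_rng(seed)
    classes = sorted(df["class"].unique())
    for pos, name in enumerate(classes):
        freq = df.loc[df["class"] == name, "freq"].to_numpy()
        # Assumption: a little horizontal jitter so equal values do not fully overlap.
        x = pos + rng.uniform(-jitter, jitter, size=len(freq))
        # Each scatter call picks the next color from matplotlib's color cycle.
        ax.scatter(x, freq, marker=MARKERS[pos % len(MARKERS)], label=name, alpha=0.7)
    ax.set_xticks(range(len(classes)))
    ax.set_xticklabels(classes)
    ax.set_xlabel("class")
    ax.set_ylabel("freq")
    ax.legend(title="class")
    return ax


# Sample rows taken from the input in the question.
df = pd.DataFrame(
    {
        "freq": [51, 46, 12, 46, 1, 178, 46, 23, 1, 8, 56],
        "class": ["high active"] + ["event based"] * 3 + ["viewers"] * 2 + ["situational"] * 5,
    }
)
```

Calling ax = grouped_scatter(df) followed by plt.show() draws the plot. Unlike swarmplot, this does no overlap-avoiding layout, so the cost grows roughly linearly with the number of rows, at the price of some overplotting.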