Using pandas crosstab with seaborn stacked bar plots
I am trying to create a stacked bar plot in seaborn from my DataFrame.
I first created a crosstab table in pandas like so:
pd.crosstab(df['Period'], df['Mark'])
which returns:
Mark      False  True
Period
BASELINE    583   132
WEEK 12     721     0
WEEK 24     589   132
WEEK 4      721     0
I would like to use seaborn for this chart, since it is what I used for the rest of my graphs. I have struggled to do this because I am unable to index the crosstab.
I managed to make the plot I want in pandas using .plot.barh(stacked=True)
but had no luck with seaborn. Any ideas how I can do this?
Thanks
The creator of seaborn isn't fond of stacked bar charts (but this link has a hack that uses seaborn + matplotlib to do them anyway).
If you're willing to accept a grouped bar chart instead of a stacked one, here's one approach:
# first some sample data
import numpy as np
import pandas as pd
import seaborn as sns
N = 1000
mark = np.random.choice([True,False], N)
periods = np.random.choice(['BASELINE','WEEK 12', 'WEEK 24', 'WEEK 4'], N)
df = pd.DataFrame({'mark':mark,'period':periods})
ct = pd.crosstab(df.period, df.mark)
mark      False  True
period
BASELINE    118   111
WEEK 12     117   149
WEEK 24     117   130
WEEK 4      127   131
# now stack and reset
stacked = ct.stack().reset_index().rename(columns={0:'value'})
# plot grouped bar chart
sns.barplot(x=stacked.period, y=stacked.value, hue=stacked.mark)
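If you do want the stacked look from seaborn itself, one common workaround is to draw two layers: the row totals first, then the False counts on top of them, so the visible remainder of each bar represents True. The sketch below is only an illustration of that overlay idea and is not necessarily what the linked hack does; the `layers` frame, the seeded generator, and the hand-built legend are my own assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the figure
import matplotlib.pyplot as plt
from matplotlib.patches import Patch
import numpy as np
import pandas as pd
import seaborn as sns

# same kind of sample data as above, seeded so the sketch is repeatable
rng = np.random.default_rng(0)
N = 1000
df = pd.DataFrame({
    'mark': rng.choice([True, False], N),
    'period': rng.choice(['BASELINE', 'WEEK 12', 'WEEK 24', 'WEEK 4'], N),
})
ct = pd.crosstab(df.period, df.mark)

# one row per period: the False count and the row total (False + True)
# crosstab sorts its columns, so position 0 is the False column
layers = pd.DataFrame({
    'period': ct.index.to_numpy(),
    'false_count': ct.iloc[:, 0].to_numpy(),
    'total': ct.sum(axis=1).to_numpy(),
})

false_color, true_color = sns.color_palette()[:2]
fig, ax = plt.subplots()
# back layer: full bar height
sns.barplot(data=layers, x='period', y='total', color=true_color, ax=ax)
# front layer: False counts cover the bottom part of each total bar
sns.barplot(data=layers, x='period', y='false_count', color=false_color, ax=ax)
ax.set_ylabel('count')
# build the legend by hand, since the two layers carry no hue information
ax.legend(handles=[Patch(color=true_color, label='True'),
                   Patch(color=false_color, label='False')], title='mark')
```

The order of the two calls matters: the totals have to be drawn first so that the False layer covers only their lower part.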
As you said, you can use pandas to create the stacked bar plot. The argument that you want a "seaborn plot" doesn't really matter, since every seaborn plot and every pandas plot is, in the end, just a matplotlib object; the plotting tools of both libraries are merely wrappers around matplotlib.
So here's the complete solution (taking the data from @andrew_reece's answer).
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
n = 500
mark = np.random.choice([True,False], n)
periods = np.random.choice(['BASELINE','WEEK 12', 'WEEK 24', 'WEEK 4'], n)
df = pd.DataFrame({'mark':mark,'period':periods})
ct = pd.crosstab(df.period, df.mark)
ct.plot.bar(stacked=True)
plt.legend(title='mark')
plt.show()
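To make the "it is all matplotlib" point concrete, here is a minimal sketch that applies a seaborn theme before calling the pandas plotting method, so the stacked chart picks up the same look as other seaborn figures. The choice of `sns.set_theme(style='whitegrid')`, the horizontal `barh` variant from the question, and the seeded generator are assumptions for illustration, not part of the answer above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to show the figure
import matplotlib.axes
import numpy as np
import pandas as pd
import seaborn as sns

# seaborn styling is applied globally through matplotlib's rcParams,
# so it also affects plots drawn by pandas
sns.set_theme(style='whitegrid')

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    'mark': rng.choice([True, False], n),
    'period': rng.choice(['BASELINE', 'WEEK 12', 'WEEK 24', 'WEEK 4'], n),
})
ct = pd.crosstab(df.period, df.mark)

# pandas returns an ordinary matplotlib Axes, which can be customised further
ax = ct.plot.barh(stacked=True)
ax.set_xlabel('count')
ax.legend(title='mark')
sns.despine(ax=ax)  # seaborn helpers accept the same Axes object
```

Because `ax` is a regular matplotlib Axes, anything you would do to a seaborn figure (titles, labels, `sns.despine`, saving with `ax.figure.savefig`) works here as well.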