Python pandas DataFrame: group by and plot multiple lines
I have a large pandas DataFrame, shown below, with 2,923,922 rows in total. I want to plot multiple lines. GYEAR ranges from 1963 to 1999, COUNTRY is either US or Non-US, PATENT is a patent number, and CAT is a categorical value. I want GYEAR on the x-axis and the number of patents on the y-axis, with one line each for "US" / "Non-US" / Total, as well as lines for "Other" / "Mechanical" / "Drugs & Medical". How can I do this?
GYEAR COUNTRY PATENT CAT
0 1963 Non-US 3070801 Other
1 1963 US 3070802 Other
2 1963 US 3070803 Other
3 1966 US 3070804 Other
4 1966 US 3070805 Other
5 1967 US 3070806 Other
6 1970 US 3070807 Drugs & Medical
7 1970 US 3070808 Drugs & Medical
8 1963 US 3070809 Other
9 1965 US 3070810 Other
10 1965 US 3070811 Other
11 1964 US 3070812 Other
12 1964 US 3070813 Other
13 1964 US 3070814 Mechanical
14 1964 US 3070815 Mechanical
15 1998 US 3070816 Mechanical
16 1998 US 3070817 Mechanical
17 1998 US 3070818 Other
18 1999 US 3070819 Other
I tried the code below, but it didn't work. Any advice would be appreciated!
us = df1[(df1['COUNTRY'] == 'US')]
nonus = df1[(df1['COUNTRY'] != 'US')]
plt.plot(us['GYEAR'], us['PATENT'], linewidth='4', color ='k',label='US')
plt.plot(nonus['GYEAR'], nonus['PATENT'], linewidth='1', color ='b',label='Non-US')
I think you need crosstab followed by plot:
pd.crosstab(df['GYEAR'], df['CAT']).plot()
df2 = pd.crosstab(df['GYEAR'], df['COUNTRY'])
df2['Total'] = df2.sum(axis=1)
df2.plot()
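To make the crosstab step concrete, here is a minimal sketch on the first eight rows from the question (not the full data). The helper name plot_counts and the axis labels are assumptions added for illustration; they are not part of the original answer.

```python
import pandas as pd

# Small sample copied from the first rows shown in the question.
df = pd.DataFrame({
    "GYEAR": [1963, 1963, 1963, 1966, 1966, 1967, 1970, 1970],
    "COUNTRY": ["Non-US", "US", "US", "US", "US", "US", "US", "US"],
    "PATENT": list(range(3070801, 3070809)),
    "CAT": ["Other"] * 6 + ["Drugs & Medical"] * 2,
})

# Rows are years, columns are countries, cells are patent counts.
by_country = pd.crosstab(df["GYEAR"], df["COUNTRY"])
by_country["Total"] = by_country.sum(axis=1)
print(by_country)


def plot_counts(table, title):
    """Hypothetical helper: draw one line per column (needs matplotlib)."""
    ax = table.plot(title=title)
    ax.set_xlabel("GYEAR")
    ax.set_ylabel("Number of patents")
    return ax
```

Calling plot_counts(by_country, "Patents by country") should then draw one line per column, with the years on the x-axis.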
An alternative solution aggregates with size and reshapes with unstack:
df.groupby(['GYEAR','CAT']).size().unstack(fill_value=0).plot()
df2 = df.groupby(['GYEAR','COUNTRY']).size().unstack(fill_value=0)
df2['Total'] = df2.sum(axis=1)
df2.plot()
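The attempt in the question plots the raw PATENT numbers against GYEAR, so nothing is ever counted. Both approaches here count rows per year first and then plot the counts. The sketch below uses a few sample rows from the question to check that the two approaches give the same table. The variable names are chosen for illustration only.

```python
import pandas as pd

# Sample rows 11-18 from the question (GYEAR and CAT columns only).
df = pd.DataFrame({
    "GYEAR": [1964, 1964, 1964, 1964, 1998, 1998, 1998, 1999],
    "CAT": ["Other", "Other", "Mechanical", "Mechanical",
            "Mechanical", "Mechanical", "Other", "Other"],
})

# Approach 1: cross-tabulate year against category.
via_crosstab = pd.crosstab(df["GYEAR"], df["CAT"])

# Approach 2: count rows per (year, category) pair, then pivot CAT into columns.
# fill_value=0 replaces missing pairs (e.g. no Mechanical patents in 1999)
# with 0 instead of NaN.
via_groupby = df.groupby(["GYEAR", "CAT"]).size().unstack(fill_value=0)

# Compare the two count tables cell by cell.
same_counts = bool((via_crosstab.to_numpy() == via_groupby.to_numpy()).all())
print(via_groupby)
print("Same counts:", same_counts)
```

On 2.9 million rows either approach should be workable, since the result has only one row per year and one column per category or country.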