Python pandas DataFrame: group by and plot multiple line graphs

I have a large pandas DataFrame, shown below, with 2,923,922 rows in total. I want to draw multiple line plots from it. GYEAR ranges from 1963 to 1999, COUNTRY is either US or Non-US, PATENT is a patent code, and CAT is a categorical value. I want the x-axis to be GYEAR and the y-axis to be the number of patents, with one plot showing lines for "US" / "Non-US" / "Total" and another showing lines for "Other" / "Mechanical" / "Drugs & Medical". How can I do this?

    GYEAR  COUNTRY  PATENT   CAT
0   1963   Non-US   3070801  Other
1   1963   US       3070802  Other
2   1963   US       3070803  Other
3   1966   US       3070804  Other
4   1966   US       3070805  Other
5   1967   US       3070806  Other
6   1970   US       3070807  Drugs & Medical
7   1970   US       3070808  Drugs & Medical
8   1963   US       3070809  Other
9   1965   US       3070810  Other
10  1965   US       3070811  Other
11  1964   US       3070812  Other
12  1964   US       3070813  Other
13  1964   US       3070814  Mechanical
14  1964   US       3070815  Mechanical
15  1998   US       3070816  Mechanical
16  1998   US       3070817  Mechanical
17  1998   US       3070818  Other
18  1999   US       3070819  Other

      

[Figures: sample 1, sample 2]

I tried the code below, but it didn't work. Any advice would be appreciated!

us = df1[(df1['COUNTRY'] == 'US')]
nonus = df1[(df1['COUNTRY'] != 'US')]

plt.plot(us['GYEAR'], us['PATENT'], linewidth='4', color ='k',label='US')
plt.plot(nonus['GYEAR'], nonus['PATENT'], linewidth='1', color ='b',label='Non-US')

      



1 answer


I think you need pd.crosstab combined with plot:

# Count patents per (GYEAR, CAT) and draw one line per category
pd.crosstab(df['GYEAR'], df['CAT']).plot()

      




# Count patents per (GYEAR, COUNTRY), then add a Total column
df2 = pd.crosstab(df['GYEAR'], df['COUNTRY'])
df2['Total'] = df2.sum(axis=1)
df2.plot()
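
To make the crosstab approach concrete, here is a minimal, self-contained sketch. It uses a handful of rows copied from the question's sample rather than the full 2.9-million-row frame, and the helper names count_by and plot_counts are made up for illustration. The axis labels and titles are also my own choice, not something the question specifies.

```python
import pandas as pd
import matplotlib.pyplot as plt

# A few rows taken from the question's sample data (not the full data set).
df = pd.DataFrame({
    "GYEAR":   [1963, 1963, 1963, 1964, 1964, 1970],
    "COUNTRY": ["Non-US", "US", "US", "US", "US", "US"],
    "PATENT":  [3070801, 3070802, 3070803, 3070812, 3070814, 3070807],
    "CAT":     ["Other", "Other", "Other", "Other", "Mechanical",
                "Drugs & Medical"],
})


def count_by(frame, column, add_total=False):
    """Hypothetical helper: patent counts with GYEAR as rows, `column` as columns."""
    counts = pd.crosstab(frame["GYEAR"], frame[column])
    if add_total:
        # Row-wise sum across the existing columns gives the yearly total.
        counts["Total"] = counts.sum(axis=1)
    return counts


def plot_counts(by_country, by_cat):
    """Hypothetical helper: draw both count tables side by side."""
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
    by_country.plot(ax=ax1, title="Patents by country")
    by_cat.plot(ax=ax2, title="Patents by category")
    for ax in (ax1, ax2):
        ax.set_xlabel("GYEAR")
        ax.set_ylabel("Number of patents")
    fig.tight_layout()
    return fig


by_country = count_by(df, "COUNTRY", add_total=True)
by_cat = count_by(df, "CAT")
```

Calling plot_counts(by_country, by_cat) followed by plt.show() should display the two charts. The key difference from the attempt in the question is that the y-axis comes from counting rows per year, not from the raw PATENT values.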

      

An alternative solution aggregates with size and reshapes with unstack:

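# size() counts rows per (GYEAR, CAT); unstack() turns CAT into columns
df.groupby(['GYEAR','CAT']).size().unstack(fill_value=0).plot()

# Same idea per COUNTRY, plus a Total column
df2 = df.groupby(['GYEAR','COUNTRY']).size().unstack(fill_value=0)
df2['Total'] = df2.sum(axis=1)
df2.plot()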
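
One thing to watch with either approach: a year with no patents at all gets no row in the result, so the plotted line runs straight across the gap. If that matters for your data, a possible sketch is to reindex onto the full year range. The small frame below is invented from the question's sample, and the names by_cat_full and all_years are my own.

```python
import pandas as pd

# Small illustrative frame with gaps between 1963, 1966 and 1998.
df = pd.DataFrame({
    "GYEAR":   [1963, 1963, 1966, 1966, 1998],
    "COUNTRY": ["Non-US", "US", "US", "US", "US"],
    "CAT":     ["Other", "Other", "Other", "Other", "Mechanical"],
})

# Count rows per (GYEAR, CAT) and move CAT into the columns.
by_cat = df.groupby(["GYEAR", "CAT"]).size().unstack(fill_value=0)

# Add all-zero rows for every year in the range that has no patents.
all_years = range(df["GYEAR"].min(), df["GYEAR"].max() + 1)
by_cat_full = by_cat.reindex(all_years, fill_value=0)
```

Calling by_cat_full.plot() then draws lines that drop to zero in the empty years. With the full data set every year may already be present, in which case this step changes nothing.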

      
