Pandas data manipulation in Python
I have a data frame df with columns ID and N1. I would like to calculate a column N2 where, for each ID, the first value is N1 itself and every following value is the current N1 divided by the previous N1 (0.888 / 0.999, then 0.777 / 0.888, and so on). The same logic restarts for the next ID. Can we do this without using a for loop in pandas? The expected result is:
ID N1 N2
1111 0.999 0.999
1111 0.888 0.888888889
1111 0.777 0.875
1111 0.666 0.857142857
1111 0.555 0.833333333
1111 0.444 0.8
1111 0.333 0.75
2222 0.998 0.998
2222 0.887 0.888777555
2222 0.776 0.874859076
2222 0.665 0.856958763
2222 0.554 0.833082707
2222 0.443 0.799638989
2222 0.332 0.749435666
2222 0.221 0.665662651
Yes, you can use groupby(), transform() and shift(). shift(1) yields NaN for the first row of each group, so fillna(1) makes that row divide by 1 and keep its N1 value.
df['N2'] = df.groupby("ID")['N1'].transform(lambda x: x/x.shift(1).fillna(1))
df
ID N1 N2
0 1111 0.999 0.999000
1 1111 0.888 0.888889
2 1111 0.777 0.875000
3 1111 0.666 0.857143
4 1111 0.555 0.833333
5 1111 0.444 0.800000
6 1111 0.333 0.750000
7 2222 0.998 0.998000
8 2222 0.887 0.888778
9 2222 0.776 0.874859
10 2222 0.665 0.856959
11 2222 0.554 0.833083
12 2222 0.443 0.799639
13 2222 0.332 0.749436
14 2222 0.221 0.665663
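The same idea can be sketched without the lambda, since groupby() exposes shift() directly. This is a minimal, self-contained sketch that rebuilds the sample data from the question; the variable name prev is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1111] * 7 + [2222] * 8,
    "N1": [0.999, 0.888, 0.777, 0.666, 0.555, 0.444, 0.333,
           0.998, 0.887, 0.776, 0.665, 0.554, 0.443, 0.332, 0.221],
})

# Previous N1 within each ID; the first row of every group is NaN.
prev = df.groupby("ID")["N1"].shift(1)

# Replace the NaN with 1 so the first row of each group keeps N1 unchanged.
df["N2"] = df["N1"] / prev.fillna(1)

print(df)
```

Because the shift is computed per group and the division is a plain column operation, no Python-level function runs per group, which may matter on frames with many IDs.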
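This is 1 plus the percentage change within each ID. The first row of each group has no previous value, so pct_change() returns NaN there, and fillna(df.N1) fills it with N1.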
df.assign(N2=df.groupby('ID').N1.pct_change().add(1).fillna(df.N1))
ID N1 N2
0 1111 0.999 0.999000
1 1111 0.888 0.888889
2 1111 0.777 0.875000
3 1111 0.666 0.857143
4 1111 0.555 0.833333
5 1111 0.444 0.800000
6 1111 0.333 0.750000
7 2222 0.998 0.998000
8 2222 0.887 0.888778
9 2222 0.776 0.874859
10 2222 0.665 0.856959
11 2222 0.554 0.833083
12 2222 0.443 0.799639
13 2222 0.332 0.749436
14 2222 0.221 0.665663
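Why this works: the percentage change is (current - previous) / previous, which equals current / previous - 1, so adding 1 recovers the ratio. The sketch below checks that on the sample data from the question; the names ratio, via_pct and out are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1111] * 7 + [2222] * 8,
    "N1": [0.999, 0.888, 0.777, 0.666, 0.555, 0.444, 0.333,
           0.998, 0.887, 0.776, 0.665, 0.554, 0.443, 0.332, 0.221],
})

# Ratio of each N1 to the previous N1 within its ID (NaN on each first row).
ratio = df["N1"] / df.groupby("ID")["N1"].shift(1)

# 1 + percentage change within each ID (also NaN on each first row).
via_pct = df.groupby("ID")["N1"].pct_change().add(1)

# The two agree wherever a previous value exists.
print(np.allclose(ratio.dropna(), via_pct.dropna()))

# assign() returns a new DataFrame; df itself is left unchanged.
out = df.assign(N2=via_pct.fillna(df["N1"]))
print(out)
```

Note that assign() does not modify df in place, so the result has to be bound to a name (or assigned back to df) to be kept.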