Pandas Series - Recording Numeric Changes
I have a panel dataframe with many observations of people's location data over 10 years. It looks something like this:
personid location_1991 location_1992 location_1993 location_1994
0 111 1 1 2 2
1 233 3 3 4 999
2 332 1 3 3 3
3 454 2 2 2 2
4 567 2 1 1 1
I want to track each person's transitions by creating an indicator variable for each transition type. I would like each column to be set to 1 whenever a person moves to that type of location. Ideally it would look like this:
personid transition_to_1 transition_to_2 transition_to_3 transition_to_4
0 111 0 1 0 0
1 233 0 0 0 1
2 332 0 0 1 0
3 454 0 0 0 0
4 567 1 0 0 0
So far, I've tried iterating over each row and then looping through each element in the row to check whether it is the same as the previous one. That seems heavy-handed. Is there a better way to keep track of the changing values in each row of my frame?
1 answer
I combined two steps: first stacking the yearly columns into one long column, then pivoting on the stacked location values.
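import pandas as pd

# Read the sample table copied from the question.
df = pd.read_clipboard()
# Stack the yearly columns into a single 'location' column.
df2 = pd.DataFrame(df.set_index('personid').stack(), columns=['location'])
# The first reset turns the index levels into columns; the second adds a running 'index' column.
df2.reset_index(inplace=True)
df2.reset_index(inplace=True)
df3 = df2.pivot(index='index', columns='location', values='personid')
df3 = df3.fillna(0)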
So far it looks like this:
location 1 2 3 4 999
index
0 111 0 0 0 0
1 111 0 0 0 0
2 0 111 0 0 0
3 0 111 0 0 0
4 0 0 233 0 0
5 0 0 233 0 0
6 0 0 0 233 0
7 0 0 0 0 233
8 332 0 0 0 0
9 0 0 332 0 0
10 0 0 332 0 0
11 0 0 332 0 0
12 0 454 0 0 0
13 0 454 0 0 0
14 0 454 0 0 0
15 0 454 0 0 0
16 0 567 0 0 0
17 567 0 0 0 0
18 567 0 0 0 0
19 567 0 0 0 0
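# Each row holds one personid and zeros elsewhere, so the row-wise max recovers it.
df3['personid'] = df3.max(axis=1, skipna=True)
df3 = df3.set_index('personid', drop=True)
# Turn the personid entries into 0/1 indicators.
df3[df3 > 0] = 1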
And here it is:
location 1 2 3 4 999
personid
111 1 0 0 0 0
111 1 0 0 0 0
111 0 1 0 0 0
111 0 1 0 0 0
233 0 0 1 0 0
233 0 0 1 0 0
233 0 0 0 1 0
233 0 0 0 0 1
332 1 0 0 0 0
332 0 0 1 0 0
332 0 0 1 0 0
332 0 0 1 0 0
454 0 1 0 0 0
454 0 1 0 0 0
454 0 1 0 0 0
454 0 1 0 0 0
567 0 1 0 0 0
567 1 0 0 0 0
567 1 0 0 0 0
567 1 0 0 0 0
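Note that the table above flags the location of every observation, not the moves between locations. As a minimal sketch of one way to get the transition columns the question asks for, the following compares each year with the previous one and keeps only the years where the value changed. It assumes the sample data from the question and treats 999 as an ordinary location code; the names locs, changed, moves and transitions are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "personid": [111, 233, 332, 454, 567],
    "location_1991": [1, 3, 1, 2, 2],
    "location_1992": [1, 3, 3, 2, 1],
    "location_1993": [2, 4, 3, 2, 1],
    "location_1994": [2, 999, 3, 2, 1],
})

locs = df.set_index("personid")

# True where a year's location differs from the previous year's.
# The first year has no predecessor, so it is left out.
changed = locs.iloc[:, 1:].values != locs.iloc[:, :-1].values

# Keep the destination only in the years where a move happened.
moves = locs.iloc[:, 1:].where(changed)

# Long format: one entry per (person, year) with a move.
long = moves.stack().dropna().astype(int)

# One indicator column per destination, collapsed to one row per person.
# People who never moved are added back with all zeros.
transitions = (
    pd.get_dummies(long, dtype=int)
    .groupby(level="personid").max()
    .reindex(locs.index, fill_value=0)
    .add_prefix("transition_to_")
)
print(transitions)
```

With this sample the frame also gets a transition_to_999 column, because person 233 moves to 999 in 1994. If 999 is a missing-value code rather than a real location, that column can be dropped afterwards.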