How to fix rollover with python data

Question

Let's say I have this csv file that I want to import and plot in Python using pyplot and pandas.

1,2
2,4
3,3
4,4
5,6
6,3
7,5
8,6
1,3
2,5
3,7
4,4
5,3
6,5
7,4
8,5
1,3
2,2
3,5
4,4
5,3
6,5
7,6
8,7

As you can see, column 1 rolls over after it reaches 8. How can I get rid of this rollover so that it looks like this:

1,2
2,4
3,3
4,4
5,6
6,3
7,5
8,6
9,3
10,5
11,7
12,4
13,3
14,5
15,4
16,5
17,3
18,2
19,5
20,4
21,3
22,5
23,6
24,7

I tried a for loop that walks through the column and keeps track of every time it finds a number that is less than the previous one ... that should mean a rollover! I loop over the entire dataset (which is 95,000 items!). When the current item is larger than the previous one, I multiply by a counter; when it is not, the counter is incremented, and I add the result to the current entry until the next rollover is detected.

But I'm doing something wrong, and I'm not sure what ... my indexes get messed up at the ends. What is the Pythonic way to fix this mess?

performance python loops pandas

testname123 May 21 '17 @ 2:15 am

3 answers

Scott Boston · Answer 1 · 2017-05-21T03:37:49+0000

Let's not even import that first column, and let the DataFrame's default range index act as your x-axis with pandas df.plot.

import pandas as pd
from io import StringIO
csv_file = StringIO("""
1,2
2,4
3,3
4,4
5,6
6,3
7,5
8,6
1,3
2,5
3,7
4,4
5,3
6,5
7,4
8,5
1,3
2,2
3,5
4,4
5,3
6,5
7,6
8,7""")

df = pd.read_csv(csv_file, header=None, usecols=[1])

df.plot()

Output: (line plot of the second column against the default integer index)
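If the continuous counter is needed as an actual column rather than only as the x-axis, one option is to overwrite the first column with a 1-based range. This is a minimal sketch, assuming the file has no header row and no rows are missing; the column names `a` and `b` and the shortened sample (which rolls over at 4 instead of 8) are illustrative only.

```python
from io import StringIO

import pandas as pd

# Shortened sample with the same shape as the question's data (rolls over at 4 here)
csv_file = StringIO("1,2\n2,4\n3,3\n4,4\n1,6\n2,3\n3,5\n4,6\n")

# header=None because the file has no header row; "a" and "b" are assumed names
df = pd.read_csv(csv_file, header=None, names=["a", "b"])

# Replace the rolled-over counter with a continuous 1-based counter
df["a"] = range(1, len(df) + 1)

print(df["a"].tolist())
```

This only gives the right numbers when every row is exactly one step after the previous one; if rows can be missing, the counter has to be rebuilt from the rollover positions instead.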

piRSquared · Answer 2 · 2017-05-21T06:44:38+0000

I wanted to give a mathematical solution ...

Read the csv with the columns named a and b, then use groupby's cumcount. Multiply the cumcount by 8 and add it to the first column.

df.a += df.groupby('a').cumcount() * 8

df

     a  b
0    1  2
1    2  4
2    3  3
3    4  4
4    5  6
5    6  3
6    7  5
7    8  6
8    9  3
9   10  5
10  11  7
11  12  4
12  13  3
13  14  5
14  15  4
15  16  5
16  17  3
17  18  2
18  19  5
19  20  4
20  21  3
21  22  5
22  23  6
23  24  7
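For a version that runs end to end, here is a hedged sketch that reads a shortened sample and compares the groupby/cumcount approach with a variant closer to what the question describes: counting a rollover wherever the value drops below the previous one. The sample data, the variable names and the period of 4 are assumptions for illustration (the question's data uses 8).

```python
from io import StringIO

import pandas as pd

# Shortened sample: the counter runs 1..4 and then starts over
csv_file = StringIO("1,2\n2,4\n3,3\n4,4\n1,6\n2,3\n3,5\n4,6\n1,3\n2,5\n")

# header=None because the file has no header row; "a" and "b" are assumed names
df = pd.read_csv(csv_file, header=None, names=["a", "b"])
period = 4  # assumed rollover period for this sample (8 in the question's data)

# Variant 1: the groupby/cumcount approach from the answer above
by_cumcount = df["a"] + df.groupby("a").cumcount() * period

# Variant 2: count a rollover wherever the value drops below the previous one
rollovers = (df["a"].diff() < 0).cumsum()
by_diff = df["a"] + rollovers * period

print(by_cumcount.tolist())
print(by_diff.tolist())
```

Both variants agree on clean data. The cumcount version relies on every value appearing exactly once per cycle, while the diff version only needs the value to drop at each rollover, so it tolerates a skipped row inside a cycle. Both still need the period to be known.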

Jeff saltfist · Answer 3 · 2017-05-21T02:36:06+0000

Your DataFrame's index, created when importing the file via pandas, should already provide the contiguous list of integers you are looking for. Just plot the column against the index.

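import pandas as pd
import matplotlib.pyplot as plt

# The sample file has no header row, so name the columns explicitly
df = pd.read_csv("Filename.csv", header=None, names=["column_1", "column_2"])
# Plot the second column against the default 0..N-1 index
plt.plot(df.index, df["column_2"])
plt.show()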
