Python csv file cell values comparison
I have the following dataset in a CSV file:
[1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1, 1, 2]
Now I want to count how many times each value repeats consecutively and store those counts in an array; I don't want the overall frequency of each value. So my result should look like this:
[3, 4, 3, 2, 1]
My code looks like this:
import csv
with open("c:/Users/Niels/Desktop/test.csv", 'rb') as f:
    reader = csv.reader(f, delimiter=';')
    data = []
    for column in reader:
        data.append(column[0])
results = data
results = [int(i) for i in results]
print results
dataFiltered = []
for i in results:
    if i == (i+1):
        counter = counter + 1
        dataFiltered.append(counter)
        counter = 0
print dataFiltered
My idea was to compare cell values. I know there is something wrong in the results loop, but I cannot figure out where my error is.
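I won't go into the details of your loop, which has several problems. For starters, if i == (i+1): can never be True, because it compares a value with itself plus one instead of with the next element of the list.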
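For comparison, here is a minimal sketch of what the loop was presumably meant to do: remember the previous value and close the current run whenever the value changes. The function name run_lengths and the variable previous are made up for this illustration and are not part of the original code.

```python
def run_lengths(values):
    """Return the length of each run of consecutive equal values."""
    lengths = []
    counter = 0
    previous = None
    for position, value in enumerate(values):
        # A change of value closes the current run (never on the first item).
        if position > 0 and value != previous:
            lengths.append(counter)
            counter = 0
        counter += 1
        previous = value
    # The last run is still open when the loop ends, so store it here.
    if counter:
        lengths.append(counter)
    return lengths


results = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1, 1, 2]
print(run_lengths(results))  # [3, 4, 3, 2, 1]
```

Note that the counter has to be initialised before the loop and the final run has to be appended after it; both steps are missing from the code in the question.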
Next, you'd be better off using itertools.groupby and taking the length of each group:
import itertools
results = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1, 1, 2]
freq = [len(list(v)) for _,v in itertools.groupby(results)]
print(freq)
len(list(v)) uses list to force iteration over the grouped elements so that we can compute the length (sum(1 for x in v) might be more performant or appropriate; I didn't benchmark both approaches).
I get:
[3, 4, 3, 2, 1]
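As a small sketch of the sum(1 for x in v) alternative mentioned above, the two ways of measuring a group can be put side by side. The names freq_list and freq_sum are only illustrative, and this makes no claim about which variant is faster.

```python
import itertools

results = [1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 1, 1, 2]

# Variant 1: materialise each group as a list and take its length.
freq_list = [len(list(group)) for _, group in itertools.groupby(results)]

# Variant 2: count the items of each group without building a list.
freq_sum = [sum(1 for _ in group) for _, group in itertools.groupby(results)]

print(freq_list == freq_sum)  # True
```

The second variant avoids building a temporary list per group, which may matter for very long runs.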
Aside from that, reading the first column of the CSV file and converting the values to integers can be done simply with:
results = [int(row[0]) for row in reader]
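Putting both pieces together, a rough end-to-end sketch could look like the following. It assumes Python 3 and the semicolon delimiter from the question; the helper name first_column_run_lengths and the in-memory sample data are invented for the example so that it runs without the original file.

```python
import csv
import io
import itertools


def first_column_run_lengths(csv_file, delimiter=";"):
    """Read the first column as integers and return its run lengths."""
    reader = csv.reader(csv_file, delimiter=delimiter)
    # Skip empty rows, which csv.reader yields as empty lists.
    results = [int(row[0]) for row in reader if row]
    return [sum(1 for _ in group) for _, group in itertools.groupby(results)]


# In-memory stand-in for a CSV file (hypothetical sample data).
sample = io.StringIO("1;a\n1;b\n1;c\n2;d\n2;e\n3;f\n")
print(first_column_run_lengths(sample))  # [3, 2, 1]
```

With a real file on Python 3, the file would be opened in text mode, for example with open(path, newline=""), rather than with 'rb' as in the Python 2 code from the question.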