Finding the minimum of each column of a CSV file using Python
I created a program that finds the minimum of each row of a CSV file, and now I would like to do the same for each column, but I have been unable to do so. Any advice would be greatly appreciated.
#Import and convert csv
import csv
data = []
with open(file,"r") as f:
    reader = csv.reader(f, delimiter=',')
    #make sure csv uses "." not "," !!!!!
    nump = 0
    for row in reader:
        floatrow = []
        for val in row:
            floatrow.append(float(val))
        nump += len(floatrow)
        data.append(floatrow)
#Calculates minimum of each row, minimum and sum of row
minrr = []
sum1 = 0.0
for row in data:
    list2 = (min (filter(None, row)))
    minrr.append(list2)
    sum1 += sum(row)
+3
3 answers
The following should work:
import csv

with open("data.csv", "r") as f_input:
    lmin_col = []
    lmin_row = []
    for row in csv.reader(f_input):
        # Python 2: map() returns a list
        row = map(float, row)
        lmin_row.append(min(row))
        if lmin_col:
            # element-wise minimum of the running column minima and this row
            lmin_col = map(min, lmin_col, row)
        else:
            lmin_col = row
print "Min per row:", lmin_row
print "Min per col:", lmin_col
As input:
10.1, 15.6, 12.3, 13.2, 17.0
2.1, 5.3, 7.0, 11.4, 5.5
12.1, 7.0, 9.3, 28.7, 1.0
It gives the following output:
Min per row: [10.1, 2.1, 1.0]
Min per col: [2.1, 5.3, 7.0, 11.4, 1.0]
Tested using Python 2.7. An alternative version for Python 3 follows:
import csv

with open("data.csv", "r") as f_input:
    lmin_col = []
    lmin_row = []
    for row in csv.reader(f_input):
        row = [float(col) for col in row]
        lmin_row.append(min(row))
        if lmin_col:
            # element-wise minimum of the running column minima and this row
            lmin_col = [min(x,y) for x,y in zip(lmin_col, row)]
        else:
            lmin_col = row
print("Min per row:", lmin_row)
print("Min per col:", lmin_col)
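The running-minimum idea above can also be wrapped in a function, which makes it easy to try without a file on disk. This is only a sketch for Python 3: the name row_and_column_minima is made up for illustration, and io.StringIO stands in for an open file holding the sample input shown earlier.

```python
import csv
import io


def row_and_column_minima(lines):
    """Return (row_minima, column_minima) for an iterable of CSV lines.

    Hypothetical helper, not part of the csv module.
    """
    row_minima = []
    column_minima = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        values = [float(cell) for cell in row]
        row_minima.append(min(values))
        if column_minima:
            # Compare the running minima with the new row, element by element.
            column_minima = [min(a, b) for a, b in zip(column_minima, values)]
        else:
            column_minima = values
    return row_minima, column_minima


# io.StringIO stands in for an open CSV file here.
sample = io.StringIO(
    "10.1, 15.6, 12.3, 13.2, 17.0\n"
    "2.1, 5.3, 7.0, 11.4, 5.5\n"
    "12.1, 7.0, 9.3, 28.7, 1.0\n"
)
row_min, col_min = row_and_column_minima(sample)
print("Min per row:", row_min)
print("Min per col:", col_min)
```

Because only the current row and the running minima are kept, memory use does not grow with the number of rows in the file.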
+1
You can do this using only Python built-ins, transposing the rows with zip like this:
import csv
a = []
with open('path/to/file.csv',"r") as f:
    reader = csv.reader(f, delimiter=',')
    for row in reader:
        #turn all input to floats (a list, so it works on Python 2 and 3)
        row = [float(val) for val in row]
        #append the entire row to create list of lists
        a.append(row)
# Transpose a into b
b = zip (*a)
# Now min of row will be min of col in a
for line in b:
    print(min(line))
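To see what the transpose step does on its own, here is a small Python 3 sketch using in-memory rows instead of a file. The variable names are illustrative only.

```python
rows = [
    [10.1, 15.6, 12.3, 13.2, 17.0],
    [2.1, 5.3, 7.0, 11.4, 5.5],
    [12.1, 7.0, 9.3, 28.7, 1.0],
]

# zip(*rows) groups the i-th element of every row, so it yields the columns.
columns = list(zip(*rows))
column_minima = [min(column) for column in columns]

print(columns[0])      # (10.1, 2.1, 12.1)
print(column_minima)   # [2.1, 5.3, 7.0, 11.4, 1.0]
```

Note that zip stops at the shortest row, so if the rows have different lengths the extra columns are silently dropped. This approach also holds the whole file in memory, which is fine for small files.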
+2
I would suggest using np.loadtxt to read the file as an ndarray and then calling np.min with the given axis:
import numpy as np
# loadtxt splits on whitespace by default, so pass the comma delimiter
arr = np.loadtxt('your_file.csv', delimiter=',')
# for each column
minima_c = np.min(arr, axis=0)
# for each row
minima_r = np.min(arr, axis=1)
Here's a little illustration:
In [1]: import numpy as np
In [2]: arr = np.arange(9).reshape((3,3))
In [3]: arr
Out[3]:
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
In [4]: np.min(arr, 0)
Out[4]: array([0, 1, 2])
In [5]: np.min(arr, 1)
Out[5]: array([0, 3, 6])
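The same thing with actual comma-separated text might look like the sketch below. It assumes NumPy is installed; io.StringIO stands in for a real file, and the numbers are made-up sample data. Since np.loadtxt splits on whitespace unless told otherwise, delimiter="," is passed explicitly.

```python
import io

import numpy as np

# In-memory stand-in for a CSV file (illustrative data only).
csv_text = io.StringIO("10.1,15.6,12.3\n2.1,5.3,7.0\n12.1,7.0,9.3\n")

arr = np.loadtxt(csv_text, delimiter=",")

col_min = arr.min(axis=0)  # one minimum per column
row_min = arr.min(axis=1)  # one minimum per row

print("Min per col:", col_min)
print("Min per row:", row_min)
```

Here axis=0 collapses the rows, leaving one value per column, and axis=1 collapses the columns, leaving one value per row.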
+1