Parsing selectable columns and rows using Python
I have .xls files that each have 9 columns and a varying number of rows. I would like to use xlrd or another module to extract all values from the 1st and 2nd of the nine columns, and then use the extracted values separately. So far, my code looks like this:
import xlrd
import openpyxl
workbook = xlrd.open_workbook('C09.xls')
sheet_names = workbook.sheet_names()
sheet = workbook.sheet_by_name(sheet_names[0])
num_rows = sheet.nrows
num_cols = sheet.ncols
plist = [[0 for x in range(3)] for x in range(num_rows)]
for i in range(num_rows):
    for j in range(3):
        plist[i][j] = sheet.cell(i,j).value
Next, I want to use the values in [i] (for example, for multiplications and so on) and then pull the corresponding values from [j].
The code above gives, for example:
[['Col header 1', 'Col header 2', 'Col header 3'], [1.0, 1000, 2000], [2.0, 1001, 2001], ...]
Is there an easy way to do this? I'm new to Python, so I would appreciate it if you were more specific. Thank you very much!
Some feedback / improvements:
In your snippet, pre-filling plist with zeros is redundant, because every entry is overwritten in the loop. A nested list comprehension builds the list in a single pass:
plist = [[sheet.cell(i,j).value for j in range(3)] for i in range(num_rows)]
If some of the values are None or empty, they can be normalized to 0 with:
plist = [[sheet.cell(i,j).value or 0 for j in range(3)] for i in range(num_rows)]
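To see what `or 0` does to individual values, here is a small sketch that runs without a workbook. The `raw_rows` list is made-up sample data standing in for what `sheet.cell(i, j).value` might return; it is not taken from the asker's file.

```python
# Hypothetical sample rows standing in for sheet.cell(i, j).value results.
raw_rows = [
    ['Col header 1', 'Col header 2', 'Col header 3'],
    [1.0, '', 2000],      # an empty-string cell
    [2.0, None, 2001],    # a None cell
]

# Same idea as the comprehension above, but reading from raw_rows.
normalized = [[value or 0 for value in row] for row in raw_rows]

print(normalized[1])  # [1.0, 0, 2000]
print(normalized[2])  # [2.0, 0, 2001]
```

Keep in mind that `or 0` replaces every falsy value, so a genuine 0.0 in a cell also comes back as the integer 0.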
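Finally, here's a more Pythonic way to do the zero-initialization. The x[:] copy matters: [[0] * 3] * sheet.nrows on its own would repeat the same inner list object in every row.
plist = [x[:] for x in [[0] * 3] * sheet.nrows]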
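As for the original goal of using the first two columns separately, one possible approach is to split off the header row and transpose the remaining rows with zip. This is only a sketch: the `plist` below is sample data in the shape shown in the question, and the names `header`, `data_rows`, `first_col`, `second_col`, `third_col`, and `paired` are made up for illustration.

```python
# Sample data in the shape shown in the question; in practice plist
# would come from the xlrd loop or comprehension above.
plist = [
    ['Col header 1', 'Col header 2', 'Col header 3'],
    [1.0, 1000, 2000],
    [2.0, 1001, 2001],
]

# Separate the header row from the numeric rows.
header, data_rows = plist[0], plist[1:]

# zip(*rows) transposes rows into columns.
first_col, second_col, third_col = (list(col) for col in zip(*data_rows))

# Example use: multiply each first-column value by 10 and pair it
# with the corresponding second-column value.
paired = [(a * 10, b) for a, b in zip(first_col, second_col)]

print(first_col)   # [1.0, 2.0]
print(second_col)  # [1000, 1001]
print(paired)      # [(10.0, 1000), (20.0, 1001)]
```

If only whole columns are needed, xlrd's sheet.col_values(colx, start_rowx=1) should also return a single column as a list, skipping the header row, without building plist first.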