Convert CSV to txt and start a new line every 10 values using Python
I have a CSV file containing an array of values with 324 rows and 495 columns. All values for each row and column are the same.
I need to split this array so that every 10 values are placed on a new line. So each of the 324 rows becomes 49 full lines of 10 values plus 1 line of 5 values (495 values / 10 per line = 49 lines of 10 values, with 5 values left over for a final line). Then the same is done for the next row, and so on for all 324 rows.
The problems I am facing are listed below:
- line.split(",") doesn't seem to do anything
- Everything after line.split doesn't seem to do anything.
- I'm not sure if my "for newrow in range ..." loop is correct
- I haven't added writing to the text file yet; I think it should be outFile.write(something goes here, not sure what)
- I put "\n" after the print, but it just printed it
I am a beginner programmer.
Script:
import string
import sys

# open csv file...in read mode
inFile= open("CSVFile", 'r')
outFile= open("TextFile.txt", 'w')

for line in inFile:
    elmCellSize = line.split(",")
    for newrow in range(0, len(elmCellSize)):
        if (newrow/10) == int(newrow/10):
            print elmCellSize[0:10]

outFile.close()
inFile.close()
You should really be using the csv module, but I can give you some advice.
You have one problem: when you say print elmCellSize[0:10] you always take the first 10 elements, not the most recent 10 elements. Depending on how you want to do this, you could store the last 10 items as you go. I'll provide an example below, after mentioning a few things you can fix in your code.
First, notice that line.split(',') returns a list. Therefore, your choice of the variable name elmCellSize is a little misleading. If you said lineList = line.split(',') it could make more sense? Or if you say lineSize = len(line.split(',')) and use that?
Also (although I don't know much about Python 2.x) I think xrange is a function in Python 2.x that is more efficient than range even though it is used in exactly the same way in a for loop.
Instead of saying if (newrow/10) == int(newrow/10), you can say if index % 10 == 0 to check if an index is a multiple of 10. In Python 2, / between two integers already discards the fraction, so the original comparison is true for every newrow. % can be thought of as a "remainder", so index % 10 gives the remainder of index when divided by 10. (Example: 5 % 10 = 5; 17 % 10 = 7; 30 % 10 = 0)
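To make the remainder check concrete, here is a small sketch. It is written in Python 3 syntax (print as a function), and the helper name is_multiple_of_ten is made up for illustration only:

```python
def is_multiple_of_ten(index):
    # True when dividing index by 10 leaves no remainder
    return index % 10 == 0

# Collect the indices below 35 that pass the check
hits = [i for i in range(35) if is_multiple_of_ten(i)]
print(hits)  # [0, 10, 20, 30]
```

Note that 0 also passes the check, which is why the loop further down adds index != 0.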
Now, instead of printing [0:10], which will always print the first 10 elements, you want to print the 10 elements leading up to the current index. So you can say print lineList[index-10:index] to print the most recent 10 items.
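The difference between the two slices can be seen in a short sketch (Python 3 syntax; line_list here is a made-up stand-in for one split CSV row, not your real data):

```python
# Stand-in for one split CSV row: the strings '0' through '24'
line_list = [str(n) for n in range(25)]

index = 20
first_ten = line_list[0:10]             # always the same 10 values
last_ten = line_list[index - 10:index]  # the 10 values just before index

print(first_ten)  # ['0', '1', ..., '9']
print(last_ten)   # ['10', '11', ..., '19']
```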
You end up with something like
...
lineList = line.split(',') # Really, you should use csv reader
# Open the file to write to
with open('yourfile.ext', 'w') as f:
    # iterate through the line
    for index, value in enumerate(lineList):
        if index % 10 == 0 and index != 0:
            # Write the last 10 values to the file, separated by commas
            f.write(','.join(lineList[index-10:index]))
            # new line
            f.write('\n')
            # print
            print lineList[index-10:index]
I'm not an expert of course, but hope this helps!
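As for why the csv module is worth a look, here is a rough sketch in Python 3 syntax. It reads from an in-memory string through io.StringIO instead of a real file, and the sample text is made up:

```python
import csv
import io

text = "1,2,3\n4,5,6\n"

# Plain split keeps the newline on the last value of each line
split_rows = [line.split(",") for line in io.StringIO(text)]

# csv.reader returns clean lists of strings, one list per row
csv_rows = list(csv.reader(io.StringIO(text)))

print(split_rows)  # [['1', '2', '3\n'], ['4', '5', '6\n']]
print(csv_rows)    # [['1', '2', '3'], ['4', '5', '6']]
```

With a real file you would pass the open file object to csv.reader in place of the StringIO object.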
OK, this script almost works, I guess.
The problem is that it stops writing to outFile after the first 49 lines. It creates 49 rows of 10 columns, but there should be a 50th row with 5 columns, because each row of the CSV file has 495 columns. So the current script writes the last 10 values to a new line 49 times, but doesn't get the remaining 5. It also has to do this 323 more times, since the original CSV file has 324 rows.
So I think the problem is now in the final if; perhaps an else statement is required, but my elif statement did nothing. I wanted it to say that if the 6th value in the list is an end-of-line character ('\n'), then write the prior 5 values in the list to the end of the line ... it didn't work.
Thanks for all the help so far, I appreciate it!
Here is the script:
import string
#import sys
#import csv

# open csv file...in read mode
inFile= open("CSVFile.csv", 'r')
outFile= open("TextFile.txt", 'w')

for line in inFile:
    lineList = line.split(',') # Really, you should use csv reader
    # Open the file to write to
    with open('outFile', 'w') as outFile:
        # iterate through the line
        for index, value in enumerate(lineList):
            if index % 10 == 0 and index != 0:
                # Write the last 10 values to the file, separated by tabs
                outFile.write('\t'.join(lineList[index-10:index]))
                # new line
                outFile.write('\n')
                # print
                print lineList[index-10:index]
            elif lineList[6] == '\n':
                # Write the last 5 values to the file, separated by space
                outFile.write(' '.join(lineList[index-5:index]))
                # new line
                outFile.write('\n')
                # print
                print lineList[index-5:index]

outFile.close()
inFile.close()
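One possible way to pick up the trailing 5 values is to step through each row in slices of 10 rather than testing every index. The index only reaches 494, so the check at 490 writes values 480 to 489 and nothing ever writes 490 to 494. The sketch below is in Python 3 syntax and is only one way to do it; the function name rewrap_csv, its parameters, and the throwaway demo files are made up for illustration:

```python
import csv
import os
import tempfile

def rewrap_csv(in_path, out_path, width=10, sep="\t"):
    """Copy in_path to out_path, starting a new line every `width` values."""
    # Open the output once, outside the loop; reopening it with 'w'
    # for every input row would erase what was written before.
    with open(in_path, newline="") as src, open(out_path, "w") as dst:
        for row in csv.reader(src):
            # Stepping by `width` also reaches the short final slice
            for start in range(0, len(row), width):
                dst.write(sep.join(row[start:start + width]) + "\n")

# Small demonstration on a throwaway file: 3 rows of 25 values each
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "in.csv")
out_path = os.path.join(tmp_dir, "out.txt")
with open(in_path, "w", newline="") as f:
    csv.writer(f).writerows([[str(n)] * 25 for n in range(3)])

rewrap_csv(in_path, out_path)
with open(out_path) as f:
    out_lines = f.read().splitlines()
print(len(out_lines))  # 3 rows * (2 full lines + 1 short line) = 9
```

By the same logic, a 324 by 495 file should give 324 * 50 output lines, with every 50th line holding 5 values.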