Python: writing to CSV in a for-loop, conditionally appending a value on a specific column

Here is an example of the content of my CSV file:

Fruit, colour, ripe,

apple, green,,
banana, yellow,,
pineapple, green,,
plum, purple,,

      

I want to loop through the contents of the CSV file and, depending on a test (independent of the CSV data, using an input value passed to the enclosing function), end up with something like this:

Fruit, colour, ripe,

apple, green, true, 
banana, yellow,, 
pineapple, green,, 
plum, purple, true,

      

My current code looks like this:

import csv

csv_data = csv.reader(open('./data/fruit_data.csv', 'r'))
for row in csv_data:
    fruit = row[0]
    if fruit == input:
        # Here, write 'true' in the 'ripe' column.
        pass

      

It's easy enough to add new data in one go using the csv module or pandas, but here I need to add data iteratively. It seems that I can't change the CSV file in place (?), but if I write out another CSV file, it gets overwritten on every match in the loop, so it only reflects the latest value.

+3



4 answers


You basically have two approaches:

1- Open a second text file before your loop, then loop through each row of the source file and append the rows to the second file. When all rows are done, close the original file. Example: see the question "How do you append to a file?"

2- Read in everything from the initial csv. Then make changes to the created object (I highly recommend using Pandas for this). Then write to csv. Here's an example of this method:



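import pandas as pd
import numpy as np

# read in the csv; skipinitialspace drops the blank after each comma so the
# sample file's headers should read 'Fruit', 'colour', 'ripe'
csv_data = pd.read_csv('./data/fruit_data.csv', skipinitialspace=True)

# I'm partial to the numpy where logic when creating a new column based 
# on if/then logic on an existing column
# (`input` is the value being tested, as in the question)
csv_data['ripe'] = np.where(csv_data['Fruit'] == input, True, False)

# write out the csv, without the DataFrame index as an extra column
csv_data.to_csv('./data/outfile.csv', index=False)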

      

Choosing between 1 and 2 should really come down to scale. If your csv is so large that you cannot read the whole thing in and manipulate it the way you want, then you have to work through it row by row. If you can read it all in and then manipulate it with Pandas, your life will be much easier.
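Approach 1 has no code above, so here is a minimal sketch of it using only the standard csv module, assuming Python 3 and a file laid out like the sample in the question (Fruit, colour, ripe). The function name copy_with_ripe_flag and its parameters are made up for illustration. It takes a collection of fruit names, so several matches can be handled in a single pass:

```python
import csv

def copy_with_ripe_flag(src_path, dst_path, ripe_fruits):
    """Stream rows from src_path to dst_path, marking matching fruits as ripe."""
    ripe_fruits = set(ripe_fruits)
    with open(src_path, newline="") as src, \
            open(dst_path, "w", newline="") as dst:
        # skipinitialspace drops the blank that follows each comma in the sample
        reader = csv.reader(src, skipinitialspace=True)
        writer = csv.writer(dst)
        for row in reader:
            if not row:
                # csv.reader yields an empty list for a blank line; skip it
                continue
            # Assumption: the 'ripe' value lives in the third column
            if len(row) > 2 and row[0] in ripe_fruits:
                row[2] = "true"
            writer.writerow(row)
```

Only one row is held in memory at a time, which is why this route suits files too large to load whole.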

+3



To make changes, you need to collect your new data somewhere, such as in a list. This list will hold the results of your processing.

import csv

fruit_details = []
with open('./data/fruit_data.csv', 'r') as csv_file:
    for row in csv.reader(csv_file):
        fruit = row[0]
        # `input` is the value being tested, as in the question
        if fruit == input:
            fruit_details.append([row[0], row[1], 'true'])

      



As a result, the list fruit_details will contain the fruits with a true value in the "ripe" column. If you also want to keep the rows that do not match, add an else branch that appends either false or row[2], as needed.
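The else branch described above might look like the following sketch. The helper names flag_rows and write_rows are hypothetical, and the rows are assumed to have at least three columns. Keeping row[2] for non-matching rows means earlier matches survive when the function is called again with a different value:

```python
import csv

def flag_rows(rows, wanted):
    """Return a new list of rows with 'true' in the third column where row[0] == wanted."""
    fruit_details = []
    for row in rows:
        if not row:
            # Ignore blank lines (csv.reader yields [] for them)
            continue
        if row[0] == wanted:
            fruit_details.append([row[0], row[1], "true"])
        else:
            # Keep the existing value so earlier matches are not lost
            fruit_details.append([row[0], row[1], row[2]])
    return fruit_details

def write_rows(path, rows):
    """Write all collected rows to path in one go."""
    with open(path, "w", newline="") as csv_file:
        csv.writer(csv_file).writerows(rows)
```

One way to use it: read the file once, call flag_rows once per input value on the previous result, and call write_rows a single time at the end.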

+1



If you create a temporary file, you can write your rows as you read them. If you use os.rename on Unix, the rename will be an atomic operation:

import csv
import os

def update_fruit_data(input):
    csv_file_name = 'data/fruit_data.csv'
    tmp_file_name = "%s.tmp" % csv_file_name
    # Update fruit data
    with open(csv_file_name, 'r') as csv_input_file:
        csv_reader = csv.reader(csv_input_file)
        with open(tmp_file_name, 'w') as csv_output_file:
            csv_writer = csv.writer(csv_output_file)
            for row in csv_reader:
                fruit = row[0]
                if fruit == input:
                    row[2] = 'true'
                csv_writer.writerow(row)
    # Rename tmp file to csv file
    os.rename(tmp_file_name, csv_file_name)

while True:
    input = get_input()
    update_fruit_data(input)

      

get_input here is a stand-in for whatever you use to obtain the value of input.
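As a variation on the same idea, here is a sketch assuming Python 3.3 or later: os.replace overwrites an existing destination on Windows as well as on Unix, and tempfile.mkstemp picks a unique temporary name in the same directory. The function name update_fruit_data_atomically is made up for this example, and the 'ripe' value is again assumed to sit in the third column:

```python
import csv
import os
import tempfile

def update_fruit_data_atomically(csv_path, fruit_name):
    """Rewrite csv_path with 'true' in the third column of matching rows."""
    directory = os.path.dirname(os.path.abspath(csv_path))
    # Create the temp file next to the target so the final rename stays
    # on the same filesystem
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", newline="") as tmp_file, \
                open(csv_path, newline="") as src:
            writer = csv.writer(tmp_file)
            for row in csv.reader(src, skipinitialspace=True):
                if len(row) > 2 and row[0] == fruit_name:
                    row[2] = "true"
                writer.writerow(row)
        # Swap the finished temp file into place
        os.replace(tmp_path, csv_path)
    except BaseException:
        # Clean up the temp file if anything went wrong, then re-raise
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Because each call reads the current file and preserves existing values, calling it repeatedly with different inputs accumulates the true flags instead of overwriting them.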

+1



If you want to create a new CSV file:

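import csv

with open('./Desktop/fruit_data.csv', 'r') as infile, \
        open('./Desktop/fruit_new_data.csv', 'w') as outfile:
    csv_data = csv.reader(infile)
    csv_new = csv.writer(outfile)
    for row in csv_data:
        fruit = row[0]
        if fruit == input:
            # Put 'true' in the 'ripe' column (third field of the sample file)
            row[2] = 'true'
            csv_new.writerow(row)
        else:
            # Rows that do not match are written unchanged
            csv_new.writerow(row)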

      

Basically, the only thing missing from the code in your question is the final write statement; here it is also added in the else branch, for the case where the criterion doesn't match.

Another possibility might be to use line.startswith.
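A rough sketch of that last idea, working on raw text lines with str.startswith instead of the csv module. The helper name mark_line is hypothetical, and the plain split on commas assumes no quoted fields containing commas:

```python
def mark_line(line, fruit_name):
    """Return line with ' true' in the third field if it starts with fruit_name."""
    # Include the comma so that 'apple' does not match 'applesauce, ...'
    if not line.startswith(fruit_name + ","):
        return line
    fields = line.rstrip("\n").split(",")
    if len(fields) > 2:
        fields[2] = " true"
    # Restore the newline only if the original line had one
    return ",".join(fields) + ("\n" if line.endswith("\n") else "")
```

Each line read from the source file would be passed through mark_line before being written to the new file.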

+1

