Python script to split a column by flag values and save two files

I have a data file that contains four columns.

test.txt file:

id  | addr | cost | flag
:-- | :--- | :--- | :---
300 | 275  | 5    | 0
300 | 766  | 15   | 1
300 | 276  | 3    | 1
300 | 248  | 6    | 1
300 | 267  | 11   | 1
508 | 205  | 12   | 0
508 | 201  | 12   | 1
301 | 32   | 3    | 0
301 | 44   | 4    | 1
301 | 32   | 2    | 0

I need to split the values of the second column according to the flag in the fourth column and store them in two separate files.

Required output:

File 1:

id | adr (F = 0)

300 | 275
508 | 205
301 | 32

File 2:

id | adr (F = 1)

300 | 766
300 | 276
300 | 248
300 | 267
508 | 201
301 | 44

I am very new to Python, and so far I have done the following.

import sys

if len(sys.argv) < 2:
    sys.stderr.write("Usage: {0} filename\n".format(sys.argv[0]))
    sys.exit()

fn = sys.argv[1]
sys.stderr.write("reading " + fn + "...\n")

# Initialize dictionaries (or hash id)
list_id = {}

fin = open(fn,"r")
for line in fin:
    line = line.rstrip()
    f = line.split("|")
    id = f[0]
    addr = f[1]
    flag = f[3]

fin.close()

      

I need your suggestions to complete the program. Thanks in advance for your kind help.

Real glimpse of data:

[image]



2 answers


This is an option using the csv module:

from csv import reader, writer

with open('test.txt', 'r') as file:
    rows = reader(file, delimiter='|', skipinitialspace=True)
    with open('file1.txt', 'w') as file1, open('file2.txt', 'w') as file2:
        writer1 = writer(file1, delimiter='|')
        writer2 = writer(file2, delimiter='|')
        for row in rows:

            try:
                flag = int(row[3])
            except IndexError:
                # row has fewer than 4 elements, next row!
                print('row too short!', row)
                continue
            except ValueError:
                # if this is not an integer, next row!
                print('row[3] not an int!', row[3])
                continue

            if flag == 0:
                writer1.writerow(row[:2])  # write the first 2 entries only
            elif flag == 1:
                writer2.writerow(row[:2])
            else:
                print('flag not in (0, 1)!', flag)
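One detail worth knowing about the reader settings above: `skipinitialspace=True` only drops whitespace that follows a delimiter, so with input written as `300 | 275 | ...` each field still carries a trailing space. A minimal sketch of stripping the fields before writing them, assuming the pipe-separated layout from the question (the `sample` data and the names `raw_rows` and `clean_rows` are made up for illustration):

```python
from csv import reader
from io import StringIO

# Two sample lines in the question's pipe-separated layout (illustrative data).
sample = StringIO("300 | 275 | 5 | 0 |\n300 | 766 | 15 | 1\n")

raw_rows = list(reader(sample, delimiter='|', skipinitialspace=True))

# Leading spaces are already gone, but trailing ones remain,
# so strip every field before using or writing it.
clean_rows = [[field.strip() for field in row] for row in raw_rows]

for row in clean_rows:
    print(row[:2], 'flag =', row[3])
```

In the answer's loop this would amount to writing `[field.strip() for field in row[:2]]` instead of `row[:2]`; `int(row[3])` already tolerates the surrounding whitespace.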

      




For your updated input (which differs from the original), change the reader to

rows = reader(file, delimiter=' ', skipinitialspace=True)

      

That should work.
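As a rough illustration of the space-delimited variant: the exact layout of the real data is not shown here, so the sample below simply assumes columns separated by one or more spaces, with no leading or trailing spaces on a line. The names `sample`, `flag0` and `flag1` are illustrative only.

```python
from csv import reader
from io import StringIO

# Assumed layout: id, addr, cost, flag separated by runs of spaces.
sample = StringIO("300 275 5 0\n300   766  15 1\n508 205 12 0\n")

# With delimiter=' ', skipinitialspace=True collapses repeated spaces
# between fields instead of producing empty fields.
rows = list(reader(sample, delimiter=' ', skipinitialspace=True))

# Keep only id and addr, grouped by the flag in the fourth column.
flag0 = [row[:2] for row in rows if row[3] == '0']
flag1 = [row[:2] for row in rows if row[3] == '1']

print(flag0)
print(flag1)
```

If lines may end with stray spaces, the reader will yield an extra empty field at the end of the row, which does not affect `row[3]` or `row[:2]`.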



Now you only need the if-else.



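to_write = id.strip() + "\t" + addr.strip() + "\n"

# flag is read from the file as a string, so strip it and compare with "0" / "1"
flag = flag.strip()
if flag == "0":
    file1.write(to_write)  # file1: output file number 1, opened before the loop (name assumed)
elif flag == "1":
    file2.write(to_write)  # file2: output file number 2, opened before the loop (name assumed)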
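Putting the question's loop and this if-else together, a complete version might look like the sketch below. It assumes the pipe-separated layout from the question; the function name `split_by_flag` and its three path parameters are hypothetical, chosen only for this example.

```python
def split_by_flag(in_path, out0_path, out1_path):
    """Write 'id<TAB>addr' lines to out0_path (flag 0) or out1_path (flag 1)."""
    with open(in_path) as fin, \
         open(out0_path, "w") as out0, \
         open(out1_path, "w") as out1:
        for line in fin:
            # Split on "|" and trim the padding around each field.
            f = [field.strip() for field in line.split("|")]
            if len(f) < 4:
                continue  # blank or malformed line
            row_id, addr, flag = f[0], f[1], f[3]
            to_write = row_id + "\t" + addr + "\n"
            # The flag is still a string here, so compare with "0" and "1".
            if flag == "0":
                out0.write(to_write)
            elif flag == "1":
                out1.write(to_write)
            # Any other value (e.g. the header row) is skipped.
```

It could be called as `split_by_flag(sys.argv[1], "file1.txt", "file2.txt")` after the argument check in the question's script. Using `row_id` rather than `id` avoids shadowing Python's built-in `id` function.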

      
