How can I combine multiple sorted files in Python alphabetically?

How can I read multiple sorted input CSV files line by line, compare the current line from each file, write the alphabetically smallest line to the output file, and then advance the file pointer of the file that supplied it, repeating until the end of every input file is reached? Here is my rough plan towards a solution so far:

buffer = []

for inFile in inFiles:

    f = open(inFile, "r")
    line = f.next()
    buffer.append([line, inFile])

#find minimum value in buffer alphabetically...
#write it to an output file...

#how do I advance one line in the file with the min value?
#and then continue the line-by-line comparisons in input files?

      



1 answer


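You can use heapq.merge, which lazily merges inputs that are already sorted, so the files are read line by line rather than loaded into memory. In Python 2.x: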

import heapq
import contextlib

files = [open(fn) for fn in inFiles]
with contextlib.nested(*files):
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))

      



In Python 3.x (3.3+):

import heapq
import contextlib

with contextlib.ExitStack() as stack:
    files = [stack.enter_context(open(fn)) for fn in inFiles]
    with open('output', 'w') as f:
        f.writelines(heapq.merge(*files))
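
If you want to see how the "advance the file with the minimum line" step from the question could be done by hand, here is a minimal sketch of the same idea using a heap directly. The function name merge_sorted_files and the parameters in_paths and out_path are hypothetical, chosen only for illustration. It assumes every input file is already sorted and that every line, including the last, ends with a newline. Lines are compared as plain strings, so uppercase letters sort before lowercase ones.

```python
import heapq


def merge_sorted_files(in_paths, out_path):
    """Merge already-sorted text files into out_path, one line at a time.

    Hypothetical helper for illustration; assumes each input is sorted.
    """
    files = [open(p) for p in in_paths]
    try:
        heap = []
        # Prime the heap with the first line of each non-empty file.
        # The index remembers which file each line came from.
        for idx, f in enumerate(files):
            line = f.readline()
            if line:
                heap.append((line, idx))
        heapq.heapify(heap)

        with open(out_path, "w") as out:
            while heap:
                # Pop the smallest current line and write it out.
                line, idx = heapq.heappop(heap)
                out.write(line)
                # Advance only the file that supplied that line.
                next_line = files[idx].readline()
                if next_line:
                    heapq.heappush(heap, (next_line, idx))
    finally:
        for f in files:
            f.close()
```

In practice heapq.merge does this bookkeeping for you, so the shorter versions above are usually preferable; the sketch is only meant to show where the file pointer gets advanced.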

      
