Print the index of changed columns in a list row

The following loop matches rows in two lists (source and master) by ID (index 0). For each row where the ID matches, it looks at the changed columns and prints them:

        for row in source:
            identifier = row[0]
            new_cols = row[1:]
            for row in master:
                old_cols = row[1:]
                if identifier == row[0]:
                    print(row[0]) # ID that matched
                    changed_cols = [col for col in new_cols if col not in old_cols] 
                    print(changed_cols) # cols that differ

      

Each row has more than 20 columns, so I thought slicing with [1:] would be smart, but I'm not sure how to get the index of each modified column with this approach. Thanks for any help.

UPDATE:

source = [['1002', '', '', '', '13RA11', '', 'LO', '4302', '99111', '0', ''], 
['1076', '', '', '', '13RA11', '', 'LO', '4302', '999111', '0', ''], 
['1130', '', '', '', '11HOT1A', '', 'LO', '4302', '99111', '0', '']]

master = [['1002', '', '', '', '13RA11', '', 'LO', '4302', '99111', '0', ''], 
['1076', '', '', '', '13RA11', '', 'LO', '4302', '999111', '1', ''], 
['1130', '', '', '', '13RA11', '', 'LO', '4302', '99111', '1', '']]

      



3 answers


Have you considered using enumerate? Your list comprehension would change to the following:

changed_cols = [(ID,col) for ID,col in enumerate(new_cols) if col not in old_cols]

      

This seems to be the easiest solution to me.

Let me know if I misunderstood your question and I will work to tweak my solution :)



EDIT: I think you might want something like what Gary suggested:

changed_cols = [(ID,col) for ID,col in enumerate(new_cols) if col != old_cols[ID]]

      

This will only compare the corresponding old column for each new column. I would guess that this is the functionality you would really like. Let me know if you are not sure about the difference :)
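
To make the difference concrete, here is a minimal sketch using two hypothetical rows (not taken from the question) in which the first two values trade places. The names by_membership and by_position are made up for illustration:

```python
# Hypothetical rows (not from the question): the first two values trade places.
new_cols = ['a', 'b', 'c']
old_cols = ['b', 'a', 'c']

# Membership test: "does this value appear anywhere in the old row?"
by_membership = [(i, col) for i, col in enumerate(new_cols) if col not in old_cols]

# Positional test: "does this value differ from the old value in the same column?"
by_position = [(i, col) for i, col in enumerate(new_cols) if col != old_cols[i]]

print(by_membership)  # []
print(by_position)    # [(0, 'a'), (1, 'b')]
```

The membership test misses the swap because both values still occur somewhere in the old row, while the positional test reports both columns. Note that old_cols[i] assumes the old row is at least as long as the new one.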



Try creating a filter and using zip to pair up each matching row (i.e., combine the rows in source and master that have matching IDs).

# A generator that yields True/False depending on whether each pair of items matches.
def bool_gen(zipped):
    for tup in zipped:
        yield tup[0] == tup[1]

# Use enumerate to track the column index as you iterate over the generator.
for enum, item in enumerate(bool_gen(zip(source_row1, master_row1))):
    if item:
        # Print the index of each matching column.
        print(enum)

      

For source_row1 = [1, 6, 3, 8] and master_row1 = [5, 6, 7, 8], this prints the indices 1 and 3, which are the columns that match. To get the columns that changed instead, negate the test. You can also put the whole thing in a list comprehension, like this:

changed_cols = [enum for enum, item in enumerate(bool_gen(zip(source_row1, master_row1))) if not item]
# changed_cols is [0, 2]

      




Putting this suggestion to work for your code:

for row in source:
    identifier = row[0]
    new_cols = row[1:]
    for master_row in master:
        old_cols = master_row[1:]
        if identifier == master_row[0]:
            print(identifier) # ID that matched
            # bool_gen yields True for matching columns, so keep the indices where it is False.
            changed_cols = [enum for enum, item in enumerate(bool_gen(zip(new_cols, old_cols))) if not item]
            print(changed_cols) # cols that differ

      

However, as you can see, this does not reduce the amount of code required or make it more readable. I'm not sure which code would be more efficient.

Let us know if our answers don't fit what you need. If so, please add some more details to your question.



You must compare by column position. If you don't, you will not detect a swap between two columns. You can do:

for row in source:
    identifier = row[0]
    new_cols = row[1:]
    for row in master:
        if identifier == row[0]:
            old_cols = row[1:]
            print(row[0]) # ID that matched
            n = min(len(new_cols), len(old_cols))
            changed_cols = [(i, old_cols[i], new_cols[i]) for i in range(n) if new_cols[i] != old_cols[i]]
            print(changed_cols) # cols that differ
            if len(new_cols) < len(old_cols): print(len(old_cols)-len(new_cols), " cols missing")
            if len(new_cols) > len(old_cols): print(len(new_cols)-len(old_cols), " cols added")
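
For reference, here is a minimal sketch of the positional comparison run against the sample data from the question's UPDATE. The helper name changed_columns and the dictionary lookup by ID are my own assumptions, not something from the question; the lookup simply replaces the inner loop over master. Like the min() above, zip stops at the shorter row, and the reported indices are relative to row[1:], so add 1 to get the position in the full row.

```python
source = [['1002', '', '', '', '13RA11', '', 'LO', '4302', '99111', '0', ''],
          ['1076', '', '', '', '13RA11', '', 'LO', '4302', '999111', '0', ''],
          ['1130', '', '', '', '11HOT1A', '', 'LO', '4302', '99111', '0', '']]

master = [['1002', '', '', '', '13RA11', '', 'LO', '4302', '99111', '0', ''],
          ['1076', '', '', '', '13RA11', '', 'LO', '4302', '999111', '1', ''],
          ['1130', '', '', '', '13RA11', '', 'LO', '4302', '99111', '1', '']]


def changed_columns(source, master):
    """Map each ID found in both lists to a list of (index, old, new) tuples."""
    # Index master rows by ID once (assumes IDs are unique within master).
    master_by_id = {row[0]: row[1:] for row in master}
    result = {}
    for row in source:
        identifier, new_cols = row[0], row[1:]
        old_cols = master_by_id.get(identifier)
        if old_cols is None:
            continue  # ID not present in master, nothing to compare
        # Compare column by column; zip stops at the shorter of the two rows.
        result[identifier] = [(i, old, new)
                              for i, (old, new) in enumerate(zip(old_cols, new_cols))
                              if old != new]
    return result


for identifier, changes in changed_columns(source, master).items():
    print(identifier, changes)
# 1002 []
# 1076 [(8, '1', '0')]
# 1130 [(3, '13RA11', '11HOT1A'), (8, '1', '0')]
```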

      
