Creating a new list when a condition with a pair of values and a list is met

The very first question ever here. I've tried to find a solution for about a week now, but I finally have to ask. I am also open to suggestions on the title of this question.

I am using python3

I have a csv file (legend.csv) that contains 2 headers (keys), one for numbers and one for abbreviations.

Each abbr has a corresponding number and this is represented in the csv file.

I also have a list of names (list.txt), the first part of the names is usually an abbreviation of some type.

Program idea: I want to parse a csv file and add a number corresponding to abbr in the names from list.txt. The output should be a new text file if possible.

example of list.txt:
    addg-stuff
    cbdd-stuff
    abc-stuff
    add-stuff

example of legend.csv:
    number,abbr
    0001,addg
    0002,cbdd
    0003,abc
    0004,add


example of desired output:
    0003-abc-stuff
    0001-addg-stuff
    0004-add-stuff
    0002-cbdd-stuff

      

The following finds the abbr, but I'm stuck on how to add the corresponding number to the name. I started from this question: "Easiest way to cross-reference a CSV file with a text file for common strings".

That question is where I found how to pull the relevant lines, but I'm not sure where to go from here.

    import csv
    with open("legend.csv") as csvfile:
        reader = csv.reader(csvfile)
        searchstrings = {row[1] for row in reader}
        num = {row[0] for row in reader}
    with open("list.txt") as txtfile:
        for name in txtfile:
            for i in searchstrings:
                if i in name:
                    matching = (name) #not sure where to go from here. If matching is printed, the names are found that contain the abbr.

      

Definitely new to this, just started messing around with python for a month or so. Any help would be much appreciated, especially if you have good resources for situations like this or python in general.


2 answers


You can try this:

import csv

# Read every row of the csv; csv.reader splits each line at the commas
with open('legend.csv', newline='') as f:
    f1 = list(csv.reader(f))

# Read every line of the txt file, without the trailing newlines
with open('list.txt') as f:
    f2 = f.read().splitlines()

for i in f2:
    for b in f1[1:]:  # skip the header row
        if i.split("-")[0] == b[1]:
            print(b[0] + "-" + i)

      

Output:



0001-addg-stuff
0002-cbdd-stuff
0003-abc-stuff
0004-add-stuff

      

In the nested for loop, the algorithm takes a line from the txt file and compares it with each row of the csv file. Note that f1[1:] slices the list, so we start after the header row of the csv file, which is of no use in solving the problem. From there, the algorithm checks whether the abbreviation equals the first part of the line, where the line is stored as i. If so, the number and the line are printed in the style of the desired output.



Sets do not have any implicit ordering. When you create two separate sets, you lose the correspondence between each abbreviation and its number. Assuming your abbreviations are unique, you can create a dictionary of <name : number> pairs instead.

lookup = {row[1] : row[0] for row in reader }

      

This also has the added benefit of only reading your csv once.
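To illustrate why reading the csv once matters, here is a small sketch. It uses an in-memory string as a stand-in for legend.csv (an assumption, so the snippet runs without any files) and skips the header with next(), which the one-liner above does not do. A csv.reader is an iterator, so a second comprehension over the same reader finds nothing left to read:

```python
import csv
import io

# In-memory stand-in for legend.csv (assumption, so no file is needed)
legend_text = "number,abbr\n0001,addg\n0002,cbdd\n0003,abc\n0004,add\n"

# Two set comprehensions over the same reader, as in the question
reader = csv.reader(io.StringIO(legend_text))
abbrs = {row[1] for row in reader}    # consumes every row, header included
numbers = {row[0] for row in reader}  # the reader is already exhausted
print(numbers)  # set()

# One dictionary comprehension keeps each abbreviation paired with its number
reader = csv.reader(io.StringIO(legend_text))
next(reader)  # skip the "number,abbr" header row
lookup = {row[1]: row[0] for row in reader}
print(lookup["abc"])  # 0003
```
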

You can easily check the membership of the dictionary using the keyword in. Your code for looking up names just becomes:

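matches = []
with open("list.txt") as txtfile:
    for line in txtfile:
        name = line.rstrip()       # drop the trailing newline
        abbr = name.split('-')[0]  # the part before the first hyphen
        if abbr in lookup:
            matches.append((name, lookup[abbr])) # this will append (name, num) pairs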

      



If you want to condense your code even more, you can use a list comprehension, for example:

with open("list.txt") as txtfile:
    matches = [(name.rstrip(), lookup[name.split('-')[0]]) for name in txtfile if name.split('-')[0] in lookup]

      

Pretty-printing it gives:

[('addg-stuff', '0001'),
 ('cbdd-stuff', '0002'),
 ('abc-stuff', '0003'),
 ('add-stuff', '0004')]
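Putting the pieces together, here is a minimal sketch of going from the lookup to lines in the "number-name" style the question asks for. The helper names build_lookup and prefix_names are made up for illustration, and the in-memory data stands in for legend.csv and list.txt so the snippet runs on its own:

```python
import csv
import io


def build_lookup(csv_file):
    """Map each abbreviation to its number, skipping the header row."""
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the "number,abbr" header
    # Ignore blank or short rows so indexing never fails
    return {row[1]: row[0] for row in reader if len(row) >= 2}


def prefix_names(names, lookup):
    """Return 'number-name' for every name whose leading abbreviation is known."""
    result = []
    for raw in names:
        name = raw.strip()
        abbr = name.split("-")[0]  # the part before the first hyphen
        if abbr in lookup:
            result.append(f"{lookup[abbr]}-{name}")
    return result


# In-memory stand-ins for legend.csv and list.txt (assumption)
legend = io.StringIO("number,abbr\n0001,addg\n0002,cbdd\n0003,abc\n0004,add\n")
names = ["addg-stuff\n", "cbdd-stuff\n", "abc-stuff\n", "add-stuff\n"]

lookup = build_lookup(legend)
lines = prefix_names(names, lookup)
print("\n".join(lines))
```

With real files, you would pass open("legend.csv", newline="") and open("list.txt") to these helpers instead, and write "\n".join(lines) to a new file opened in "w" mode (for example a hypothetical output.txt). The lines come out in the order they appear in list.txt.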

      
