A pythonic way to intersect and add list items at the same time

I have 3 lists: a, b and c.

Each of these lists contains tuples of 3 numbers.

Here's an example of input:

a = [(1,2,4),(1,7,8),(1,5,4),(3,6,7)]
b = [(1,2,5),(1,9,3),(1,0,3),(3,6,8)]
c = [(2,6,3),(2,4,9),(2,8,5),(1,2,7)]

      

I am looking for a way to build a list from the elements of these 3 lists: whenever the first two values of a tuple are equal across all three lists, keep those two values and sum the third values.

In the data I have given, there is only one set of tuples sharing the same first two values: (1,2,4), (1,2,5) and (1,2,7).

If I add the third values I get 4+5+7 = 16, so with this data I should end up with [(1,2,16)].

The first two values are unique within each list, so something like [(1,2,7),(1,2,15)] will never occur.

The problem isn't finding the tuples whose first two values are equal; that is easy to do with a list comprehension. But I am stuck looking for a Pythonic way to add the third values at the same time.

I can do it like this:

elem_list = []
for elem in a:
    b_elem = [i for i in b if i[:-1] == elem[:-1]]
    c_elem = [i for i in c if i[:-1] == elem[:-1]]
    if len(b_elem) != 0 and len(c_elem) != 0:
        elem_list.append((elem[0],elem[1], elem[2]+b_elem[0][2]+c_elem[0][2]))        

      

This gives me the output I want, but it is very long, so I'm pretty sure there is a more Pythonic way to do it. I just can't figure it out.

+3




5 answers


Here's one way to do it:

from itertools import product, starmap

def solve(*tups):
    key = tups[0][:2]
    if all(x[:2] == key for x in tups):
        return key + (sum(x[2] for x in tups), )

for p in product(a, b, c):
    out = solve(*p)
    if out:
        print(out)
        # (1, 2, 16)

      



Or a one-liner using the above function:

print(list(filter(None, starmap(solve, product(a, b, c)))))
# [(1, 2, 16)]

      

+3




Not very efficient, but it finds the matching tuples (the third values still have to be summed afterwards):



a = [(1,2,4),(1,7,8),(1,5,4),(3,6,7)]
b = [(1,2,5),(1,9,3),(1,0,3),(3,6,8)]
c = [(2,6,3),(2,4,9),(2,8,5),(1,2,7)]
from itertools import product

print(list(filter(lambda x: x[0][:2] == x[1][:2] == x[2][:2], product(a, b, c))))
# [((1, 2, 4), (1, 2, 5), (1, 2, 7))]
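
The filter above yields the matching triples themselves rather than the summed tuple the question asks for. A possible follow-up step, sketched here for Python 3 (the names matches and summed are illustrative, not part of the original answer), collapses each match into the shared key plus the sum of the third values:

```python
from itertools import product

a = [(1, 2, 4), (1, 7, 8), (1, 5, 4), (3, 6, 7)]
b = [(1, 2, 5), (1, 9, 3), (1, 0, 3), (3, 6, 8)]
c = [(2, 6, 3), (2, 4, 9), (2, 8, 5), (1, 2, 7)]

# Keep only the triples whose first two values agree across all three tuples.
matches = [m for m in product(a, b, c) if m[0][:2] == m[1][:2] == m[2][:2]]

# Collapse each matching triple: shared key plus the sum of the third values.
summed = [m[0][:2] + (sum(t[2] for t in m),) for m in matches]
print(summed)  # [(1, 2, 16)]
```

This still walks the full Cartesian product, so it shares the efficiency caveat mentioned above.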

      

+3




Here's one way, without considering any efficiency (it loops i * j * k times, assuming i, j and k are the lengths of your lists a, b, c).

from operator import itemgetter
f = itemgetter(0,1)    
print([(x[0],x[1],x[2]+y[2]+z[2]) for x in a for y in b for z in c if f(x)==f(y)==f(z)])

      

output:

[(1, 2, 16)]
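
If the i * j * k cost matters, one alternative is to index two of the lists by their first two values and make a single pass over the third. This is only a sketch: b_index and c_index are made-up names, and it relies on the question's statement that the first two values are unique within each list.

```python
a = [(1, 2, 4), (1, 7, 8), (1, 5, 4), (3, 6, 7)]
b = [(1, 2, 5), (1, 9, 3), (1, 0, 3), (3, 6, 8)]
c = [(2, 6, 3), (2, 4, 9), (2, 8, 5), (1, 2, 7)]

# Map (first, second) -> third for b and c (keys assumed unique per list).
b_index = {t[:2]: t[2] for t in b}
c_index = {t[:2]: t[2] for t in c}

# One pass over a, keeping only keys that also occur in b and c.
result = [
    (x, y, z + b_index[(x, y)] + c_index[(x, y)])
    for x, y, z in a
    if (x, y) in b_index and (x, y) in c_index
]
print(result)  # [(1, 2, 16)]
```

Building the two dictionaries and scanning a takes time roughly proportional to the total number of tuples, instead of the product of the list lengths.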

      

+1




Use the first two elements of each triplet as a dictionary key, and accumulate the sum of the third elements as the dictionary value.

d = {}
# group the tuples and sum 
for x,y,z in a+b+c:
    d[(x,y)] = d.get((x,y), 0) + z

results = []
# sort the keys and convert to a list of tuples
for k in sorted(d.keys()):
    x,y = k
    results.append((x,y,d[(x,y)]))

print(results)
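
Note that the loop above sums every key that appears in any of the lists, so with the sample data it also emits entries such as (1, 7, 8). If only keys present in all three lists are wanted, as the question describes, one possible variation is to count how many tuples contributed to each key. This is a sketch with illustrative names (totals, seen), and it relies on the question's statement that the first two values are unique within each list:

```python
from collections import defaultdict

a = [(1, 2, 4), (1, 7, 8), (1, 5, 4), (3, 6, 7)]
b = [(1, 2, 5), (1, 9, 3), (1, 0, 3), (3, 6, 8)]
c = [(2, 6, 3), (2, 4, 9), (2, 8, 5), (1, 2, 7)]
lists = (a, b, c)

totals = defaultdict(int)  # (x, y) -> running sum of the third values
seen = defaultdict(int)    # (x, y) -> number of tuples carrying that key

for x, y, z in a + b + c:
    totals[(x, y)] += z
    seen[(x, y)] += 1

# Given per-list uniqueness, a key seen len(lists) times occurs in every list.
results = sorted(
    (x, y, totals[(x, y)]) for (x, y), n in seen.items() if n == len(lists)
)
print(results)  # [(1, 2, 16)]
```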

      

0




For good measure, here's a nice boring way. It separates your match logic (comparing the first two components of the tuples) and your transformation logic (summing the third component) into plain old helper functions, and then makes a simple recursive call with a boring loop (shortening the argument list each time and filtering out the mismatches). This is one way to avoid the wasteful calls to itertools.product or starmap.

from functools import partial
from operator import eq, is_not, itemgetter

a = [(1,2,4),(1,7,8),(1,5,4),(3,6,7)]
b = [(1,2,5),(1,9,3),(1,0,3),(3,6,8)]
c = [(2,6,3),(2,4,9),(2,8,5),(1,2,7)]

is_not_none = partial(is_not, None)

def my_match_criterion(t1, t2):
    return eq(*map(itemgetter(0,1), (t1, t2)))

def my_transformation(t1, t2):
    return t1[0:2] + (t1[2] + t2[2],)

def collapse_matches_with_transformation(tups, *args):
    if not args:
        # Copy so the caller's list is not mutated by the loop below.
        return list(tups)
    else:
        collapsed = collapse_matches_with_transformation(*args)
        for i, c in enumerate(collapsed):
            include = False
            for t in tups:
                if my_match_criterion(t, c):
                    collapsed[i], include = my_transformation(t, c), True
            if not include:
                collapsed[i] = None
        # list(...) keeps the result indexable under Python 3.
        return list(filter(is_not_none, collapsed))

print(collapse_matches_with_transformation(a, b, c))

      

I am probably in the vocal minority here: I believe this is at least as Pythonic as any comprehension-based solution. It is becoming too fashionable to use the term "Pythonic" to mean "concise syntax at all costs." This is perpetuated by many people who are so used to looking at one-line comprehensions or inline lambda functions serving as key arguments. The artificial ease with which they can read these things, an artifact of mere familiarity, clouds their thinking about whether this approach is really more "readable" and, of course, whether it is good from the standpoint of encapsulation.

Of course, if your problem only has to be solved once, on one small instance, like when you are playing around in the interpreter, then do whatever works...

But if you might come back to this code, if there is even a small chance of it, why not write the different parts of your requirements as different, separate functions?

There are several things at work in this task: (1) How many lists of tuples need to be processed? Is it always only 3? (2) How likely is the match condition to change? If you suddenly need to add a new piece of data, so that your tuples become 4-tuples that match on the first three elements, how much code do you need to change, and in how many places? (3) What if you need to change the transformation? Instead of summing the third element, what if you need to multiply, or to sum additional elements?

These considerations should be budgeted for in just about any real problem (read: any place where you use this code more than once).

In any of these cases, all of the clutter that comes from using lots of things like lambda x: x[0:2] ... blah, or from simply inlining the logic x[2] + y[2] + z[2] into a comprehension that returns the result, etc., only gives false brevity. The resulting system is very brittle with respect to the assumptions that there are only ever 3 lists of 3-tuples, whose third components only ever need to be summed, under the single matching condition that the first two components match.
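
To make those three axes of change concrete, here is one possible shape for a helper that takes the number of lists, the match key and the combining operation as parameters. It is only a sketch: combine_matching and its key, value and combine parameters are hypothetical names, and it assumes at least one list is passed and that keys are unique within each list.

```python
from functools import reduce
from operator import add, mul

def combine_matching(lists, key=lambda t: t[:2], value=lambda t: t[2], combine=add):
    """Return key + (combined value,) for every key present in all lists."""
    # One dict per list, mapping key -> value (keys assumed unique per list).
    indexes = [{key(t): value(t) for t in tups} for tups in lists]
    # Keys common to every list; iterating a dict yields its keys.
    common = set(indexes[0]).intersection(*indexes[1:])
    return sorted(
        k + (reduce(combine, (idx[k] for idx in indexes)),) for k in common
    )

a = [(1, 2, 4), (1, 7, 8), (1, 5, 4), (3, 6, 7)]
b = [(1, 2, 5), (1, 9, 3), (1, 0, 3), (3, 6, 8)]
c = [(2, 6, 3), (2, 4, 9), (2, 8, 5), (1, 2, 7)]

print(combine_matching([a, b, c]))               # [(1, 2, 16)]
print(combine_matching([a, b, c], combine=mul))  # [(1, 2, 140)]
print(combine_matching([a, b]))                  # [(1, 2, 9), (3, 6, 15)]
```

Changing the number of lists, the match condition or the transformation then means changing an argument rather than rewriting a comprehension.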

Finally, even if you know these things will stay fixed, the best way to keep things concise is to change the data structure. For example, if you first convert your lists of tuples into Counters, with the sub-tuple of the first two items as keys, it is very quick to do this:

from collections import Counter

def countify(tups):
    return Counter({t[0:2]:t[2] for t in tups})

a, b, c = map(countify, (a,b,c))
common_keys = set(a.keys()).intersection(b, c)
totals = a + b + c

print([k + (totals[k],) for k in common_keys])

      

Most people would no doubt say that this second approach is more "Pythonic", but that is really only true if you don't mind that you are always summing the values keyed by the first two components of the original tuples. The code does not generalize cleanly to more than 3 lists of tuples, and it is not robust to small changes in the transformation or in the representation of the data.
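
One concrete instance of that fragility, offered as an aside: adding Counter objects with + keeps only the keys whose total is positive, so a legitimate total of zero (or below) would silently disappear from the result. A minimal illustration with made-up values:

```python
from collections import Counter

x = Counter({(1, 2): 4})
y = Counter({(1, 2): -4})

# Counter addition drops keys whose summed count is not positive.
print(x + y)  # Counter()

# Summing through plain indexing keeps the zero total.
print(x[(1, 2)] + y[(1, 2)])  # 0
```

With the all-positive sample data this never shows up, which is exactly why such an assumption is easy to overlook.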

This is why I think that this kind of code-golf brevity should not be synonymous with "Pythonic", which should be much more about what is pragmatic and straightforward. "Simple" does not necessarily mean "short", and "complex" is not perfectly correlated with "many lines of code". "Readability" is extremely subjective and varies widely with experience.

Do the boring thing! Write that extra helper function! Write that for-loop! You will be glad later that you did!

0

