Removing values from a list of tuples

I have a list of tuples, and I would like to return only the second column of data, with only unique values.

mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]

      

Desired output:

['Andrew@gmail.com','Jim@gmail.com','Sarah@gmail.com']

      

My idea was to iterate over the list, append the item from the second column to a new list, and then use the following code. Before I go too far down this path, I know there must be a better way to do it.

from collections import Counter
cnt = Counter(mytuple_new)
unique_mytuple_new = [k for k, v in cnt.iteritems() if v > 1]

      

+3




6 answers


You can use the zip function:

>>> set(zip(*mytuple)[1])
set(['Sarah@gmail.com', 'Jim@gmail.com', 'Andrew@gmail.com'])
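To make the transposition explicit, here is a minimal sketch of what zip(*mytuple) does step by step. It assumes Python 3, where zip returns an iterator and has to be materialized with list() before it can be indexed (the snippet above relies on Python 2, where zip returns a list). The names columns, emails and unique_emails are illustrative only.

```python
mytuple = [('Andrew', 'Andrew@gmail.com', '20'), ('Jim', 'Jim@gmail.com', '12'),
           ('Sarah', 'Sarah@gmail.com', '43'), ('Jim', 'Jim@gmail.com', '15'),
           ('Andrew', 'Andrew@gmail.com', '56')]

# zip(*mytuple) regroups the rows into columns: names, emails, numbers.
# In Python 3 zip is lazy, so wrap it in list() before indexing.
columns = list(zip(*mytuple))

# columns[1] is the whole second column, duplicates included.
emails = columns[1]

# set() drops the duplicates; the iteration order of a set is not guaranteed.
unique_emails = set(emails)
print(sorted(unique_emails))
```

Note that the whole list is transposed even though only one column is needed, which is worth keeping in mind for wide tuples.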

      

Or, less efficiently, you can use map with operator.itemgetter and then set to get a tuple of unique values:

>>> from operator import itemgetter
>>> tuple(set(map(itemgetter(1), mytuple)))
('Sarah@gmail.com', 'Jim@gmail.com', 'Andrew@gmail.com')

      

Timing comparison of some of the answers:

My answer:

import timeit

s = """\
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
set(zip(*mytuple)[1])
"""
print timeit.timeit(stmt=s, number=100000)
0.0740020275116

      




iCodez's answer:

s = """\
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
seen = set()
[x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
"""
print timeit.timeit(stmt=s, number=100000)
0.0938332080841

      




Hasan's answer:

s = """\
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
set([k[1] for k in mytuple])
"""
print timeit.timeit(stmt=s, number=100000)
0.0699651241302

      




Adem's answer:

s = """
from itertools import izip
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
set(map(lambda x: x[1], mytuple))
"""
print timeit.timeit(stmt=s, number=100000)
0.237300872803 !!!
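Each timed statement above also rebuilds mytuple on every run (and the last one includes an unused import), so the numbers measure more than the extraction itself. As a rough sketch of a more like-for-like comparison, the data can be moved into timeit's setup argument. This assumes Python 3 (hence list(zip(...)) and print as a function); the candidates dictionary and its labels are made up for illustration, and no timings are quoted because they depend on the machine and interpreter.

```python
import timeit

# Build the data once in `setup` so that only the extraction is timed.
setup = """
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim','Jim@gmail.com','12'),
           ('Sarah','Sarah@gmail.com','43'),('Jim','Jim@gmail.com','15'),
           ('Andrew','Andrew@gmail.com','56')]
"""

# Hypothetical labels mapped to the statements being compared.
candidates = {
    "zip": "set(list(zip(*mytuple))[1])",
    "seen-set comprehension": (
        "seen = set()\n"
        "[x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]"
    ),
    "set of list comprehension": "set([k[1] for k in mytuple])",
    "map with lambda": "set(map(lambda x: x[1], mytuple))",
}

timings = {}
for label, stmt in candidates.items():
    # timeit runs `setup` once, then `stmt` `number` times.
    timings[label] = timeit.timeit(stmt=stmt, setup=setup, number=100000)
    print("%-26s %.4f s" % (label, timings[label]))
```

With only five tuples the differences are likely to be small either way, so a larger input would give a more meaningful picture.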

      

+3




Try:



>>> unique_mytuple_new = set([k[1] for k in mytuple])
>>> unique_mytuple_new
set(['Sarah@gmail.com', 'Jim@gmail.com', 'Andrew@gmail.com'])

      

+1




unique_emails = set(item[1] for item in mytuple)

      

The comprehension (written here as a generator expression) produces only the data from the second column, and passing it to set() removes duplicate values.

+1




You can use a list comprehension and a set to keep track of the values seen:

>>> mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
>>> seen = set()
>>> [x[1] for x in mytuple if x[1] not in seen and not seen.add(x[1])]
['Andrew@gmail.com', 'Jim@gmail.com', 'Sarah@gmail.com']
>>>

      

The most important part of this solution is that the order is preserved, as in your example. Using just set(x[1] for x in mytuple) or something similar will give you unique items, but their order will be lost.

Also, the condition if x[1] not in seen and not seen.add(x[1]) might look a little odd, but it's actually a neat trick that lets you add items to a set inside a list comprehension (otherwise we would need to use a for-loop).

Since and performs short-circuit evaluation in Python, not seen.add(x[1]) is only evaluated if x[1] not in seen returns True. So the condition checks whether x[1] is in the set and adds it if not.

The not operator is placed before seen.add(x[1]) so that the condition evaluates to True whenever x[1] had to be added to the set (set.add returns None, which is treated as False, and not False is True).
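For comparison, here is a minimal sketch of the same logic unrolled into an explicit for-loop, which may make the short-circuit trick easier to follow. It is written for Python 3 (print as a function); the names result and email are illustrative.

```python
mytuple = [('Andrew', 'Andrew@gmail.com', '20'), ('Jim', 'Jim@gmail.com', '12'),
           ('Sarah', 'Sarah@gmail.com', '43'), ('Jim', 'Jim@gmail.com', '15'),
           ('Andrew', 'Andrew@gmail.com', '56')]

seen = set()
result = []
for x in mytuple:
    email = x[1]
    if email not in seen:     # the left-hand side of the `and`
        seen.add(email)       # the side effect hidden in `not seen.add(...)`
        result.append(email)  # what the comprehension collects

print(result)  # ['Andrew@gmail.com', 'Jim@gmail.com', 'Sarah@gmail.com']

# set.add always returns None, so `not seen.add(...)` is always True.
assert set().add('anything') is None
```

The loop is longer, but it keeps the membership test and the mutation on separate lines, which some readers find clearer than a side effect inside a condition.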

+1




How about an obvious and simple loop? There is no need to create a list and then convert it to a set; just don't add duplicates.

mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]
result = []
for item in mytuple:
    if item[1] not in result:
        result.append(item[1]) 
print result

      

Output:

['Andrew@gmail.com', 'Jim@gmail.com', 'Sarah@gmail.com']

      

0




Is the order of the items important? Many of the suggested answers use set to make the list unique. That is good, correct, and efficient if the order is not important. If order matters, you can use OrderedDict to perform set-like deduplication while maintaining order.

# test data
mytuple = [('Andrew','Andrew@gmail.com','20'),('Jim',"Jim@gmail.com",'12'),("Sarah","Sarah@gmail.com",'43'),("Jim","Jim@gmail.com",'15'),("Andrew","Andrew@gmail.com",'56')]

from collections import OrderedDict
emails = list(OrderedDict((t[1], 1) for t in mytuple).keys())
print emails

      

Yielding:

['Andrew@gmail.com', 'Jim@gmail.com', 'Sarah@gmail.com']

      

Update

Based on a suggestion from iCodez, I have revised the answer to:

from collections import OrderedDict
emails = list(OrderedDict.fromkeys(t[1] for t in mytuple).keys())
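As a side note, on newer interpreters the same idea can be sketched without the import. This assumes Python 3.7 or later, where the built-in dict is guaranteed to preserve insertion order; on older versions OrderedDict remains the safe choice.

```python
mytuple = [('Andrew', 'Andrew@gmail.com', '20'), ('Jim', 'Jim@gmail.com', '12'),
           ('Sarah', 'Sarah@gmail.com', '43'), ('Jim', 'Jim@gmail.com', '15'),
           ('Andrew', 'Andrew@gmail.com', '56')]

# Assumption: Python 3.7+, where plain dicts keep insertion order.
# dict.fromkeys keeps the first occurrence of each key and ignores repeats.
emails = list(dict.fromkeys(t[1] for t in mytuple))
print(emails)  # ['Andrew@gmail.com', 'Jim@gmail.com', 'Sarah@gmail.com']
```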

      

0








