A pythonic way to manipulate the same dictionary

A very naive question. I have the following function:

def vectorize(pos, neg):
    vec = {item_id:1 for item_id in pos}
    for item_id in neg:
        vec[item_id] = 0
    return vec

      

Example:

>>> print(vectorize([1, 2], [3, 200, 201, 202]))
{1: 1, 2: 1, 3: 0, 200: 0, 201: 0, 202: 0}

      

I feel like this is too verbose for Python. Is there a more Pythonic way to do this? Basically, I want to return a dictionary whose value is 1 if the key is in pos (a list) and 0 otherwise.

4 answers


I'm not sure whether this is more Pythonic. Maybe it is a little more efficient? I don't really know.

pos = [1, 2, 3, 4]
neg = [5, 6, 7, 8]

def vectorize(pos, neg):
    vec = dict.fromkeys(pos, 1)
    vec.update(dict.fromkeys(neg, 0))
    return vec

print(vectorize(pos, neg))

      

Outputs:



{1: 1, 2: 1, 3: 1, 4: 1, 5: 0, 6: 0, 7: 0, 8: 0}

      

But I like your way too. I'm just offering another idea here.



I would just do:

def vectorize(pos, neg):
    vec = {}
    vec.update((item, 1) for item in pos)
    vec.update((item, 0) for item in neg)
    return vec

      



But your code is fine too.



You can use:

vec = {item_id : 0 if item_id in neg else 1 for item_id in pos}

      

Note that the membership test item_id in neg will not be efficient if neg is a list (as opposed to a set), because each test scans the list.
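To illustrate the lookup cost noted above, here is a minimal sketch that converts neg to a set once before the comprehension runs. The function name vectorize_pos_only is hypothetical, chosen only for this example, and like the one-liner it still covers only the items in pos.

```python
def vectorize_pos_only(pos, neg):
    # Build the set once so each "in" test is an average O(1) hash lookup
    # instead of an O(len(neg)) scan of the list.
    neg_set = set(neg)
    return {item_id: 0 if item_id in neg_set else 1 for item_id in pos}

result = vectorize_pos_only([1, 2, 3], [3, 200])
print(result)
```

For short lists the difference is negligible; the set only pays off when neg is large.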

Update, after seeing the expected result: note that the above does not insert 0s for elements that appear only in neg. If you want that too, you can use the following one-liner.

vec = dict([(item_id, 1) for item_id in pos] + [(item_id, 0) for item_id in neg])

      

If you want to avoid creating two temporary lists, itertools.chain might help.

from itertools import chain
vec = dict(chain(((item_id, 1) for item_id in pos), ((item_id, 0) for item_id in neg)))
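As a usage sketch, the chain version can be wrapped in a function and checked against the example from the question. The name vectorize_chain is hypothetical. Because dict() keeps the last value it sees for a repeated key, the pairs from neg are chained second so that they override pos, as in the original loop.

```python
from itertools import chain

def vectorize_chain(pos, neg):
    # Lazily yield (key, 1) pairs for pos, then (key, 0) pairs for neg;
    # dict() consumes them in order, so a later pair wins on duplicate keys.
    pairs = chain(((item_id, 1) for item_id in pos),
                  ((item_id, 0) for item_id in neg))
    return dict(pairs)

vec = vectorize_chain([1, 2], [3, 200, 201, 202])
print(vec)
```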

      



This is Pythonic in the sense of being relatively short and making the most of the language's features:

def vectorize(pos, neg):
    pos_set = set(pos)
    return {item_id: int(item_id in pos_set) for item_id in set(pos+neg)}

print(vectorize([1, 2], [3, 200, 201, 202]))
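One difference worth checking before swapping this in: when an item appears in both lists, the set-based version gives it 1, whereas the loop from the question lets neg override and gives it 0. The sketch below compares the two on overlapping input; the names vectorize_loop and vectorize_set are hypothetical labels for the question's function and this answer's function.

```python
def vectorize_loop(pos, neg):
    # The question's version: neg is applied last, so it wins on overlap.
    vec = {item_id: 1 for item_id in pos}
    for item_id in neg:
        vec[item_id] = 0
    return vec

def vectorize_set(pos, neg):
    # This answer's version: membership in pos decides the value.
    pos_set = set(pos)
    return {item_id: int(item_id in pos_set) for item_id in set(pos + neg)}

overlap_loop = vectorize_loop([1, 2], [2, 3])
overlap_set = vectorize_set([1, 2], [2, 3])
print(overlap_loop[2], overlap_set[2])
```

If pos and neg never overlap, the two functions return equal dictionaries. Note also that pos + neg assumes both arguments are lists.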

      
