Python 3 iterate over a dictionary of a list efficiently

Question

I have working code that classifies data based on rules stored in a dictionary of lists. I want to know whether the code can be made more efficient by getting rid of the nested loops, using list/dictionary comprehensions or .values().

import pandas as pd


df=pd.DataFrame({'Animals': [ 'Python', 'Anaconda', 'Viper', 'Cardinal',
                 'Trout', 'Robin', 'Bass', 'Salmon', 'Turkey', 'Chicken'],
                 'Noise': ['Hiss','SSS','Hisss','Chirp','Splash','Chirp', 
                 'Gulp','Splash','Gobble','Cluck'],
                 })


snakenoise =['Hiss','SSS','Hisss', 'Wissss', 'tseee']
birdnoise =['Chirp', 'squeak', 'Cluck', 'Gobble']
fishnoise =['Splash', 'Gulp', 'Swim']


AnimalDex = {'Snake':['0', 'slither',snakenoise],
              'Bird':['2','fly', birdnoise],
              'Fish':['0','swim',fishnoise],
              }

df['movement'] = ''

for key, value in AnimalDex.items():
    for i in range(len(AnimalDex[key][2])):
        df.loc[df.Noise.str.contains(AnimalDex[key][2][i]),'movement'] = AnimalDex[key][1]

print (df)

Here is the output:

    Animals   Noise movement
0    Python    Hiss  slither
1  Anaconda     SSS  slither
2     Viper   Hisss  slither
3  Cardinal   Chirp      fly
4     Trout  Splash     swim
5     Robin   Chirp      fly
6      Bass    Gulp     swim
7    Salmon  Splash     swim
8    Turkey  Gobble      fly
9   Chicken   Cluck      fly

python dictionary iteration

ccsv Apr 28 15 at 10:05

3 answers

user124757 · Answer 1 · 2015-04-28T10:44:10+0000

Since you only use the values, not the keys or the indices, you can simplify your loop:

for animal in AnimalDex.values():
    for value in animal[2]:
        df.loc[df.Noise.str.contains(value),'movement'] = animal[1]

200_success · Answer 2 · 2015-04-28T12:22:58+0000

Efficiency won't come from rewriting the loop as a comprehension, since comprehensions are essentially just nicer syntax for loops. What matters is the efficiency of the data structure lookup. The problem is that df.Noise.str.contains(AnimalDex[key][2][i]) does brute-force matching.

If your goal is to join the movements defined in AnimalDex into df, keyed on the noise, then it pays to build a dictionary that maps each noise to its movement:

noise_to_movement = {}
for order in AnimalDex.values():
    for noise in order[2]:
        noise_to_movement[noise] = order[1]

For comparison, here's another way to build noise_to_movement, using obscure comprehensions:

import itertools

noise_to_movement = dict(itertools.chain(*[list(
    itertools.product(order[2], [order[1]])) for order in AnimalDex.values()
]))
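
Since the question asks about comprehensions specifically, the same mapping can also be written as a single dictionary comprehension. This is only a sketch: it assumes the AnimalDex layout from the question (movement at index 1, list of noises at index 2) and, like the explicit loop, lets a later entry overwrite an earlier one if two animal groups share a noise.

```python
# Data copied from the question.
snakenoise = ['Hiss', 'SSS', 'Hisss', 'Wissss', 'tseee']
birdnoise = ['Chirp', 'squeak', 'Cluck', 'Gobble']
fishnoise = ['Splash', 'Gulp', 'Swim']

AnimalDex = {'Snake': ['0', 'slither', snakenoise],
             'Bird': ['2', 'fly', birdnoise],
             'Fish': ['0', 'swim', fishnoise]}

# One entry per noise, mapping it to the movement of its animal group.
noise_to_movement = {
    noise: order[1]
    for order in AnimalDex.values()
    for noise in order[2]
}

print(noise_to_movement['Gulp'])  # swim
```

This does the same amount of work as the nested loop; the gain is readability, not speed.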

In any case, once the dictionary is built, setting the 'movement' column becomes a trivial lookup:

df['movement'] = list(noise_to_movement[n] for n in df.Noise)
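
As a variation on the line above, pandas can perform the lookup itself with Series.map, which accepts a dictionary. This is a sketch rather than part of the original answer; it uses a shortened copy of the question's data plus a hypothetical 'Cow' row to show what happens to an unknown noise. Two differences from the question's code are worth noting: the match is on the exact noise string rather than a substring test with str.contains, and a noise missing from the dictionary yields NaN instead of raising KeyError.

```python
import pandas as pd

# Shortened data; the 'Cow' / 'Moo' row is hypothetical and has no rule.
df = pd.DataFrame({
    'Animals': ['Python', 'Cardinal', 'Trout', 'Cow'],
    'Noise': ['Hiss', 'Chirp', 'Splash', 'Moo'],
})

# Shortened version of the noise-to-movement mapping.
noise_to_movement = {'Hiss': 'slither', 'Chirp': 'fly', 'Splash': 'swim'}

# Series.map looks each noise up in the dictionary; unknown noises become NaN.
df['movement'] = df['Noise'].map(noise_to_movement)

print(df)
```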

paulo.filip3 · Answer 3 · 2015-04-28T12:29:28+0000

To really improve performance, you shouldn't be iterating through a dictionary at all. Instead, build a pandas.DataFrame from this data and join the two DataFrames.

import pandas as pd

df = pd.DataFrame({'Animals': [ 'Python', 'Anaconda', 'Viper',   'Cardinal',
                   'Trout', 'Robin', 'Bass', 'Salmon', 'Turkey', 'Chicken'],
                   'Noise': ['Hiss','SSS','Hisss','Chirp','Splash','Chirp', 
                   'Gulp','Splash','Gobble','Cluck']})

snakenoise =['Hiss','SSS','Hisss', 'Wissss', 'tseee']
birdnoise =['Chirp', 'squeak', 'Cluck', 'Gobble']
fishnoise =['Splash', 'Gulp', 'Swim']

noises = [(snakenoise, 'Snake', '0', 'slither'),
          (birdnoise, 'Bird', '2', 'fly'),
          (fishnoise, 'Fish', '0', 'swim')]

animal_dex = {'Animal Type': [],
              'Whatever': [],
              'Movement': [],
              'Noise': []}

for noise in noises:
    animal_dex['Noise'] += noise[0]
    animal_dex['Animal Type'] += map(lambda x: noise[1], noise[0])
    animal_dex['Whatever'] += map(lambda x: noise[2], noise[0])
    animal_dex['Movement'] += map(lambda x: noise[3], noise[0])

df1 = pd.DataFrame(animal_dex)

df = df.merge(df1, on='Noise')
df
    Animals   Noise Animal Type Movement Whatever
0    Python    Hiss       Snake  slither        0
1  Anaconda     SSS       Snake  slither        0
2     Viper   Hisss       Snake  slither        0
3  Cardinal   Chirp        Bird      fly        2
4     Robin   Chirp        Bird      fly        2
5     Trout  Splash        Fish     swim        0
6    Salmon  Splash        Fish     swim        0
7      Bass    Gulp        Fish     swim        0
8    Turkey  Gobble        Bird      fly        2
9   Chicken   Cluck        Bird      fly        2
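
Note that merge performs an inner join by default, so any row of df whose noise is missing from df1 is dropped. If every row should survive, a left join is one option. The sketch below is not part of the original answer: it assumes a reduced lookup table and a hypothetical 'Cow' row with no matching noise.

```python
import pandas as pd

# Reduced data; the 'Cow' / 'Moo' row is hypothetical and has no match.
df = pd.DataFrame({
    'Animals': ['Python', 'Cow', 'Trout'],
    'Noise': ['Hiss', 'Moo', 'Splash'],
})

# Reduced lookup table: one row per noise.
df1 = pd.DataFrame({
    'Noise': ['Hiss', 'Splash'],
    'Movement': ['slither', 'swim'],
})

# how='left' keeps every row of df; unmatched noises get NaN in 'Movement'.
merged = df.merge(df1, on='Noise', how='left')

print(merged)
```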
