Expanding a list while building a list in Python

Here is the code:

import itertools, time
x = {1:[1, 2], 2:[2, 3], 3:[3, 4]}
l = [1, 2, 3] * 10000000
start = time.time()
y = [x[i] for i in l]
y = list(itertools.chain.from_iterable(y))
print(y[:6])
print(time.time() - start)

start = time.time()
y = []
[y.extend(x[i]) for i in l]
print(y[:6])
print(time.time() - start)


The first approach lets the inner lists stay nested and then flattens them once the comprehension is complete. The second approach builds a flat list during the comprehension.

The first approach seems to be slightly faster:

3.8479559421539307
4.469805955886841


I am wondering if there are better (faster) ways to do this?



2 answers


The answer is actually hidden in your question. A faster way to achieve what you want:



l = [1, 2, 3]
x = {1:[1, 2], 2:[2, 3], 3:[3, 4]}

# Flatten the three-element pattern once...
y = []
for k in l:
    y.extend(x[k])

# ...then repeat the flattened result instead of flattening 30 million elements.
y *= 10000000
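Note that this trick only works because l is an exact repetition of [1, 2, 3]. A quick sketch (the names small, fast and slow are mine, and the repeat count is reduced so the check runs quickly) confirms the multiplied list matches flattening element by element:

```python
import itertools

x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}
base = [1, 2, 3]
n = 1000  # smaller repeat count for a quick check

# Flatten the small pattern once, then repeat the flattened result.
small = []
for k in base:
    small.extend(x[k])
fast = small * n

# Reference: flatten the full repeated list element by element.
slow = list(itertools.chain.from_iterable(x[i] for i in base * n))

print(fast == slow)  # prints True
```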




Timings on my machine are closer:

>>> x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}
>>> l = [1, 2, 3] * 10000000
>>> import timeit
>>> import itertools

>>> def by_chaining():
...     y = [x[i] for i in l]
...     return list(itertools.chain.from_iterable(y))
...
>>> timeit.timeit(by_chaining, number=1)
5.491670315985385

>>> def by_extension():
...     y = []
...     [y.extend(x[i]) for i in l]
...     return y
...
>>> timeit.timeit(by_extension, number=1)
5.656423713273028


The point is that .extend is really meant to be used in an explicit loop, because it is called for its side effect. By running it inside a list comprehension you also build a separate list full of None return values that is then thrown away, which adds quite a lot of overhead. See:

>>> def by_extension_loop():
...     y = []
...     for i in l: y.extend(x[i])
...     return y
...
>>> timeit.timeit(by_extension_loop, number=1)
4.62852763674681
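You can see the throwaway list directly: the comprehension collects the None that every .extend call returns (a small illustration; side_effects is my own name for it):

```python
x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}
y = []
side_effects = [y.extend(x[i]) for i in [1, 2, 3]]
print(y)             # [1, 2, 2, 3, 3, 4] -- the list we wanted
print(side_effects)  # [None, None, None] -- built and discarded for nothing
```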


Let's try other approaches:

>>> def by_nested_comprehension():
...     return [e for i in l for e in x[i]]
...
>>> timeit.timeit(by_nested_comprehension, number=1)
5.102275806385393


Do not do this. In theory it gets the right answer, but it will take forever and use excessive memory, because summing lists copies the growing result on every addition. (This is probably part of the reason sum explicitly refuses to sum strings, though I wish it pointed you at a more efficient technique instead.)



>>> def by_sum():
...     return sum((x[i] for i in l), [])


What about giving from_iterable a generator to work with?

>>> def by_chaining_generator():
...     return list(itertools.chain.from_iterable(x[i] for i in l))
...
>>> timeit.timeit(by_chaining_generator, number=1)
5.420730297100135


What about using map instead of a comprehension?

>>> def by_chaining_map():
...     return list(itertools.chain.from_iterable(map(x.get, l)))
...
>>> timeit.timeit(by_chaining_map, number=1)
4.707590696974194

>>> def by_chaining_map_2():
...     return list(itertools.chain.from_iterable(map(x.__getitem__, l)))
...
>>> timeit.timeit(by_chaining_map_2, number=1)
4.576851915207953
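One caveat with the map variants, not covered by the timings above: x.get and x.__getitem__ behave differently when a key is missing from x, so the x.get version can silently feed None into chain, which only blows up with a TypeError when iterated. A small illustration:

```python
x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}

print(list(map(x.__getitem__, [1, 2])))  # [[1, 2], [2, 3]]
print(x.get(99))                         # None -- a missing key is silently swallowed

try:
    x.__getitem__(99)
except KeyError as e:
    print("KeyError:", e)                # __getitem__ fails fast instead
```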


It looks like that might be the winner, but it's close. It may also be possible to beat these by preallocating the output list and assigning slices into it, IIRC. But my creativity seems to be exhausted at the moment :)
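For completeness, here is a sketch of that preallocate-and-slice-assignment idea. It only works when every value in x has the same length (2 here), and by_preallocation is my own name for the helper:

```python
def by_preallocation(l, x, width=2):
    # Allocate the output once, then fill it with slice assignment
    # instead of growing it incrementally with extend/append.
    y = [None] * (len(l) * width)
    for pos, i in enumerate(l):
        y[pos * width:(pos + 1) * width] = x[i]
    return y

x = {1: [1, 2], 2: [2, 3], 3: [3, 4]}
print(by_preallocation([1, 2, 3], x))  # [1, 2, 2, 3, 3, 4]
```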







