Saving loop results to a list, but skipping some items

Basically, I want a fancy one-liner that still processes every line of the files I scan, but doesn't keep everything in memory and only stores a good sample of the results.

The one-liner I would like to write is:

def foo(findex):
    return [bar(line) for line in findex] # but skip every nth term

      

But I would like to be able to not store every nth result. That is, I still want bar to run on every line (for byte position purposes), but I don't want to save every result because there is not enough memory for that.

So if the outputs of bar(line) are 1,2,3,4,5,6,..., I would like it to still execute on 1,2,3,4,5,6,..., but I would like the return value to be [1,3,5,7,9,...] or something like that.



2 answers


Use enumerate to get the index, and a modulo filter to take every other row:

return [bar(line) for i,line in enumerate(findex) if i%2]

      

To generalize, use if i%n: every time the index is divisible by n, i%n==0 and bar(line) does not appear in the list comprehension.

enumerate works for every iterable (file object, generator, ...), so it's better than using range(len(findex)).
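As a rough illustration of the filter, here is a minimal sketch. The bar below is a hypothetical stand-in (the real one is not shown in the question), and the calls list and the n parameter are added only to make visible which lines bar actually ran on.

```python
calls = []  # records every argument bar() was called with


def bar(line):
    # Hypothetical stand-in for the real bar(): logs the call, returns the stripped line.
    calls.append(line)
    return line.strip()


def foo(findex, n=2):
    # Keep results whose index is NOT a multiple of n.
    # The filter runs before bar(), so bar() is skipped for the dropped items.
    return [bar(line) for i, line in enumerate(findex) if i % n]


lines = ["a\n", "b\n", "c\n", "d\n", "e\n"]
kept = foo(iter(lines))  # iter() shows this works on any iterable, not just lists
print(kept)   # ['b', 'd']
print(calls)  # ['b\n', 'd\n'] : bar() never ran on indices 0, 2, 4
```

Note that bar is not called at all on the filtered-out lines here.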

Now the above is not correct if you want to call bar on all values (because you want the side effect that bar produces), because the filter prevents the call. So you have to do it in two steps: use map to apply your function to all elements of findex, then select only the results that interest you with the same modulo filter, applied after execution (this ensures that all rows are processed):

l = [x for i,x in enumerate(map(bar,findex)) if i%n]
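A small sketch of how this might look in practice, assuming Python 3, where map is lazy and so no intermediate list of all results is built. The bar here is again a hypothetical stand-in, and foo_keep_nth is an illustrative name for the opposite filter, which matches the [1,3,5,...] output mentioned in the question.

```python
calls = []  # records every line bar() was called on


def bar(line):
    # Hypothetical stand-in for the real bar(): has a side effect and returns a value.
    calls.append(line)
    return len(line)


def foo(findex, n=2):
    # bar() runs on every line; results at indices divisible by n are dropped.
    return [x for i, x in enumerate(map(bar, findex)) if i % n]


def foo_keep_nth(findex, n=2):
    # Opposite filter: keep only the results at indices 0, n, 2n, ...
    return [x for i, x in enumerate(map(bar, findex)) if i % n == 0]


lines = ["a", "bb", "ccc", "dddd", "eeeee"]
result = foo(iter(lines))
print(result)      # [2, 4]
print(len(calls))  # 5 : bar() ran on every line
print(foo_keep_nth(iter(lines)))  # [1, 3, 5]
```

Only the kept results end up in the returned list, while bar still sees every line in order.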

      



If findex is indexable (supports the [] operator with indices), you can try it like this:



def foo(findex):
    return [bar(findex[i]) for i in range(0, len(findex), 2)]
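For comparison, a possible variant that does not need indexing uses itertools.islice with a step of 2. This is a swapped-in technique, not part of the answer above, and bar is a hypothetical stand-in. As with the indexed version, bar is only called on the selected items, so it does not help if bar must run on every line.

```python
from itertools import islice


def bar(line):
    # Hypothetical stand-in for the real bar().
    return line.upper()


def foo(findex):
    # islice(findex, 0, None, 2) yields the items at positions 0, 2, 4, ...
    # and works on any iterable, without copying it.
    return [bar(line) for line in islice(findex, 0, None, 2)]


sample = foo(["a", "b", "c", "d", "e"])
print(sample)  # ['A', 'C', 'E']
```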

      
