Split list by sequential common item
I have the following list, which contains only two characters, "N" and "C":
ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
I want to extract each run of consecutive "C"s together with the indices of its items in the list,
yielding something like:
chunk1 = [('C', 'C', 'C', 'C'), [3,4,5,6]]
chunk2 = [('C', 'C'), [8,9]]
# and when there is no C, it returns an empty list.
How can I achieve this in Python?
I tried this, but it didn't do what I hoped:
from itertools import groupby
from operator import itemgetter
tmp = (list(g) for k, g in groupby(enumerate(ls), itemgetter(1)) if k == 'C')
zip(*tmp)
Move zip(*...) inside the list comprehension:
import itertools as IT
import operator
ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
[list(zip(*g))[::-1]
 for k, g in IT.groupby(enumerate(ls), operator.itemgetter(1))
 if k == 'C']
gives
[[('C', 'C', 'C', 'C'), (3, 4, 5, 6)], [('C', 'C'), (8, 9)]]
In Python 2, list(zip(...)) can be replaced with zip(...), but since zip returns an iterator in Python 3, we need list(zip(...)) there. To make the solution compatible with both Python 2 and Python 3, use list(zip(...)) here.
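To see why the position of zip(*...) matters, the following sketch contrasts the two placements. The names groups, across and per_run are made up for this illustration and do not come from the question.

```python
from itertools import groupby
from operator import itemgetter

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

# Each group is a list of (index, item) pairs for one run of 'C'.
groups = [list(g) for k, g in groupby(enumerate(ls), itemgetter(1)) if k == 'C']

# zip(*groups) pairs the runs up with each other and stops at the shortest run,
# so it mixes items from different runs and drops indices 5 and 6.
across = list(zip(*groups))

# zip(*g) applied per run transposes that run alone into (indices, items).
per_run = [list(zip(*g)) for g in groups]

print(across)
print(per_run)
```

Applying [::-1] to each per-run result then swaps the order to (items, indices), which is the layout asked for in the question.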
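Use a generator function. All you have to do is unpack each group into zip and reverse the result, so use yield list(zip(*group))[::-1] (in Python 3 zip returns an iterator, so it must be wrapped in list before slicing):
from itertools import groupby
from operator import itemgetter

def solve(ls):
    # Group the (index, item) pairs by item; each group is one consecutive run.
    for key, group in groupby(enumerate(ls), itemgetter(1)):
        if key == 'C':
            # Transpose the pairs to (indices, items), then reverse the order.
            yield list(zip(*group))[::-1]

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
print(list(solve(ls)))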
[[('C', 'C', 'C', 'C'), (3, 4, 5, 6)], [('C', 'C'), (8, 9)]]
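The question asks for the indices as a list and for an empty list when there is no "C". A minimal sketch of that exact shape, assuming a plain function is acceptable; find_runs and its target parameter are hypothetical names, not part of either answer:

```python
from itertools import groupby

def find_runs(items, target='C'):
    """Return [[(target, ...), [indices...]], ...], one entry per run of target."""
    runs = []
    # Group (index, item) pairs by item so each group is one consecutive run.
    for key, group in groupby(enumerate(items), key=lambda pair: pair[1]):
        if key == target:
            indices = [i for i, _ in group]
            runs.append([(target,) * len(indices), indices])
    return runs

ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']
chunks = find_runs(ls)
print(chunks)
print(find_runs(['N', 'N']))  # no 'C' present, so the result is an empty list
```

Because the result is an ordinary list, a caller can check `if not chunks` to detect the no-match case.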
ls = ['N', 'N', 'N', 'C', 'C', 'C', 'C', 'N', 'C', 'C']

def whereMyCharsAt(haystack, needle):
    start = None  # index where the current run began, or None outside a run
    for ii, char in enumerate(haystack):
        if char == needle:
            if start is None:
                start = ii
        else:
            if start is not None:
                # The run ended just before ii.
                yield [needle] * (ii - start), list(range(start, ii))
                start = None
    if start is not None:
        # The list ended while still inside a run.
        yield [needle] * (len(haystack) - start), list(range(start, len(haystack)))

for indexes in whereMyCharsAt(ls, 'C'):
    print(indexes)
Prints:
(['C', 'C', 'C', 'C'], [3, 4, 5, 6])
(['C', 'C'], [8, 9])
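The manual scan has two places where a run can end: at the next non-matching item, or at the end of the input. A small sketch that exercises both, using a hypothetical snake_case variant (runs_by_scan) with the nested else/if folded into elif; it is intended to behave the same as the function above on Python 3:

```python
def runs_by_scan(haystack, needle):
    """Yield (items, indices) for each consecutive run of needle."""
    start = None
    for i, item in enumerate(haystack):
        if item == needle:
            if start is None:
                start = i
        elif start is not None:
            # Run closed by a non-matching item.
            yield [needle] * (i - start), list(range(start, i))
            start = None
    if start is not None:
        # Run closed by the end of the input.
        yield [needle] * (len(haystack) - start), list(range(start, len(haystack)))

mixed = list(runs_by_scan(['C', 'C', 'N', 'C'], 'C'))
empty = list(runs_by_scan(['N', 'N'], 'C'))
print(mixed)
print(empty)
```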