Split a string into a list of selections based selectively on certain commas within the string

Question

I have a long Python string of the form:

string='Black<5,4>, Black<9,4>'

How can I split this string, and any other string of arbitrary length with the same form (i.e. ArbitraryString1<ArbitraryListOfIntegers1>, ArbitraryString2<ArbitraryListOfIntegers2>, ...), into a list of tuples?

For example, this is the desired output for string:

list_of_tuples=[('Black',[5,4]),('Black',[9,4])]

I usually use string.split on commas to create a list and then a regex to separate the word from the <> part, but since commas also delimit my indices (the content of <>), this doesn't work.

+3

python string regex

CiaranWelsh May 22 '17 at 12:46


3 answers

You can split manually on the commas that are not enclosed in <...> and then process the details later:

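string = 'Black<5,4>, Black<9,4>'

chunks = []
s = string + ','  # trailing comma so the last chunk is flushed too
pos, level = 0, 0
for i, ch in enumerate(s):
    if ch == '<':
        level += 1  # entering <...>
    elif ch == '>':
        level -= 1  # leaving <...>
    elif ch == ',' and level == 0:
        # top-level comma: everything since pos is one chunk
        chunks.append(s[pos:i])
        pos = i + 1

print(chunks)  # ['Black<5,4>', ' Black<9,4>']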
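
The "process the details later" step is left open above. A minimal sketch of one way to do it, assuming each chunk has the form name<int,int,...>; the helper name parse_chunk is made up for illustration and is not part of the original answer:

```python
def parse_chunk(chunk):
    """Turn a chunk such as 'Black<5,4>' into ('Black', [5, 4])."""
    # partition splits on the first '<' only; strip removes the
    # leading space left over from splitting on ','
    name, _, rest = chunk.strip().partition('<')
    # drop the closing '>' and convert each comma-separated item to int
    numbers = [int(n) for n in rest.rstrip('>').split(',')]
    return (name, numbers)

# chunks as produced by the loop above
chunks = ['Black<5,4>', ' Black<9,4>']
list_of_tuples = [parse_chunk(c) for c in chunks]
print(list_of_tuples)  # [('Black', [5, 4]), ('Black', [9, 4])]
```

This version raises ValueError on anything that is not an integer inside the brackets, which may or may not be what you want for malformed input.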

+2

ewcz May 22 '17 at 12:52


You can split by ", " (note the space) and then process the data.

Sample code:

string='Black<5,4>, Black<9,4>'

splitted_string = string.split(', ')

list_of_tuples = []
for s in splitted_string:
  # separate the name from the <...> part
  d = s.replace("<", " <").split()

  color = d[0]
  # strip the angle brackets and split the numbers (kept as strings)
  numbers = d[1].replace("<", "").replace(">", "").split(",")

  t = (color, numbers)
  list_of_tuples.append(t)

print(list_of_tuples)

Output:

[('Black', ['5', '4']), ('Black', ['9', '4'])]
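
Note that the numbers in this output are strings, while the question asks for integers. A small sketch of the extra conversion step, assuming every item is a valid integer literal (the variable name converted is only illustrative):

```python
# Result of the split-based approach: numbers are still strings
list_of_tuples = [('Black', ['5', '4']), ('Black', ['9', '4'])]

# Convert every number string to int, keeping the name unchanged
converted = [(name, [int(n) for n in numbers]) for name, numbers in list_of_tuples]
print(converted)  # [('Black', [5, 4]), ('Black', [9, 4])]
```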

+2

dot.Py May 22 '17 at 12:56


Wiktor Stribiżew · Accepted Answer · 2017-05-22T12:51:26+0000

You can use a regex to capture 1+ word characters before < and capture everything inside <...> into another group, then split the content of group 2 on , and cast the values to int:

import re
s = 'Black<5,4>, Black<9,4>'
# list(...) is needed on Python 3, where map returns a lazy iterator
print([(x, list(map(int, y.split(',')))) for x, y in re.findall(r'(\w+)<([^<>]+)>', s)])
# => [('Black', [5, 4]), ('Black', [9, 4])]


Pattern details:

(\w+) - Group 1 (assigned to x): 1 or more word characters
< - a literal <
([^<>]+) - Group 2 (assigned to y): 1+ characters other than < and >
> - a literal >
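
The pattern described above can also be wrapped in a reusable function. This is only a sketch: the named groups, the function name parse_selections, and the 'Red' and 'Blue' inputs are made up for illustration, and it assumes the bracket contents are always comma-separated integers.

```python
import re

# Same pattern as above, with named groups for readability
SELECTION = re.compile(r'(?P<name>\w+)<(?P<items>[^<>]+)>')

def parse_selections(text):
    """Return a list of (name, [int, ...]) tuples found in text."""
    return [
        (m.group('name'), [int(n) for n in m.group('items').split(',')])
        for m in SELECTION.finditer(text)
    ]

# Lists of any length work, since group 2 takes everything up to '>'
print(parse_selections('Black<5,4>, Red<1,2,3>, Blue<7>'))
# [('Black', [5, 4]), ('Red', [1, 2, 3]), ('Blue', [7])]
```

Because the regex only looks for name<...> pairs, the separators between them (commas, spaces) are ignored, so no separate split step is needed.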
