Find multiple lines between characters
I have a long string of data like this:
category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;
And I would like to create a list from it that looks like this:
new_list = [33,54,60]
Basically, I want the values between category: and ; in the string, while maintaining the original order.
I managed to write something that seems to work, but I suspect there are cases where it won't work correctly. I'm new to Python and don't really know what's possible, so I would really appreciate it if someone could show me how this should be done properly.
This is my current version:
s = "category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;"
c = s.count("category")
z = 0
number_list = []
for x in range(z,c):
    val = s.split('category:')[x+1]
    number = val.split(' ;')[0]
    print (number)
    number_list.append(number.strip())
print ("All Values:", number_list)
Just use a regular expression with a capture group for the digits:
import re
rgx = re.compile(r'category:\s*(\d+)\s*;')
number_list = rgx.findall('category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;')
This gives:
>>> rgx.findall('category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;')
['33', '54', '60']
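To see why the pattern copes with irregular spacing, here is a small sketch. The string messy is a made-up input for illustration, not data from the question: \s* matches zero or more whitespace characters, and findall returns only what the (\d+) group captured, in order of appearance.

```python
import re

rgx = re.compile(r'category:\s*(\d+)\s*;')

# Hypothetical input with inconsistent spacing (not from the question).
messy = "category:33; id: AF45DA; category:   54   ; id: KF65YA; category: 60 ;"

# With a single capture group, findall returns a list of the captured strings.
values = rgx.findall(messy)
print(values)
```

Note that a category whose value is not purely digits would simply be skipped by this pattern, so it is worth checking that this matches what your data looks like.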
If you want the results to be ints, you can use map:
import re
rgx = re.compile(r'category:\s*(\d+)\s*;')
number_list = list(map(int,rgx.findall('category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;')))
This gives:
>>> number_list
[33, 54, 60]
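As an alternative sketch, the same idea can be wrapped in a small function that uses a list comprehension instead of map. The names CATEGORY_RE and extract_categories are made up for this example.

```python
import re

# Compile once so the pattern can be reused across calls.
CATEGORY_RE = re.compile(r'category:\s*(\d+)\s*;')

def extract_categories(text):
    """Return the category numbers found in text as ints, in original order."""
    return [int(value) for value in CATEGORY_RE.findall(text)]

s = "category: 33 ; id: AF45DA; category: 54 ; id: KF65YA; category: 60 ; id: XC36IA;"
print(extract_categories(s))

# A string without any category entries gives an empty list rather than an error.
print(extract_categories("id: AF45DA;"))
```

Because the pattern only captures digits, the int conversion should not fail on anything findall returns.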