Split string between characters using python regex

I am trying to split the line:

> s = "Ladegårdsvej 8B7100 Vejle"

      

with a regex into:

[street,zip,city] = ["Ladegårdsvej 8B", "7100", "Vejle"]

      

s varies a lot; the only definite part is that the zip always has 4 digits followed by a space. So my idea is to "match right" on 4 digits and a space, to indicate that the string should be split at that point.

Currently I can get street and city like this:

> print(re.split(r"[0-9]{4}\s", s))
["Ladegårdsvej 8B", "Vejle"]

      

How would I go about splitting s at will? In particular, how do I split in the middle of the line, between the number in street and the zip?

+3




3 answers


You can use re.split, but make the four digits a capture group:

>>> s = "Ladegårdsvej 8B7100 Vejle"
>>> re.split(r"(\d{4}) ", s)
['Ladegårdsvej 8B', '7100', 'Vejle']

      



From the documentation for re.split:

Split string by the occurrences of pattern. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.
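To make the quoted behaviour concrete, here is a small sketch comparing a split without a group, a split with a capturing group, and the same split limited by maxsplit. The variable names are only illustrative, and it assumes Python 3.

```python
import re

s = "Ladegårdsvej 8B7100 Vejle"

# Without a group, the matched delimiter (zip plus space) is discarded.
without_group = re.split(r"\d{4} ", s)

# With a capturing group, the captured digits are kept in the result.
with_group = re.split(r"(\d{4}) ", s)

# maxsplit=1 stops after the first match; the rest of the string
# is returned as the final element.
street, zip_code, city = re.split(r"(\d{4}) ", s, maxsplit=1)

print(without_group)
print(with_group)
print(street, zip_code, city)
```

Note that the trailing space sits outside the parentheses, so it is consumed by the split but not returned with the zip.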

+8




Once you have a street, getting the zip is trivial:



zip = s[len(street):len(street)+4]
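Put together with the split from the question, the slicing idea might look like the sketch below. It assumes Python 3 and that the pattern matches exactly once in the line; zip_code is used here instead of zip only to avoid shadowing the built-in.

```python
import re

s = "Ladegårdsvej 8B7100 Vejle"

# Split without a capture group, as in the question: the zip is dropped.
street, city = re.split(r"[0-9]{4}\s", s)

# The zip is the four characters that follow the street in the original string.
zip_code = s[len(street):len(street) + 4]

print([street, zip_code, city])
```

If the pattern could match more than once, the two-name unpacking would fail, so this only suits input known to contain a single zip.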

      

+1




Here is the solution to the problem.

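# -*- coding: utf-8 -*-
import re

st = "Ladegårdsvej 8B7100 Vejle"
# The capture group keeps the four-digit zip in the result.
reg = r'([0-9]{4})'
rep = re.split(reg, st)
# The space is not part of the pattern, so the city keeps its leading space.
print(rep)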

      

Solution for the other test case provided by RasmusP_963:

# -*- coding: utf-8 -*-
import re

st = "Birkevej 8371900 Roskilde"
# Only the four digits directly before the space match, so the zip is "1900".
print(re.split(r"([0-9]{4}) ", st))
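For reuse across many lines, the same split could be wrapped in a small helper. This is only a sketch: split_address and ZIP_PATTERN are hypothetical names, and it assumes Python 3 and that the first run of four digits followed by whitespace is the zip.

```python
import re

# Hypothetical helper names; the pattern is the one used in the answers above.
ZIP_PATTERN = re.compile(r"(\d{4})\s")


def split_address(line):
    """Return (street, zip_code, city), or raise ValueError if no zip is found."""
    # maxsplit=1 splits only at the first "four digits + whitespace" match.
    parts = ZIP_PATTERN.split(line, maxsplit=1)
    if len(parts) != 3:
        raise ValueError("no four-digit zip followed by whitespace in: %r" % line)
    street, zip_code, city = parts
    return street, zip_code, city


print(split_address("Ladegårdsvej 8B7100 Vejle"))
print(split_address("Birkevej 8371900 Roskilde"))
```

Because the whitespace must follow the four digits directly, a house number that runs into the zip (as in "8371900") is still separated at the last four digits before the space.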

      

0








