Python 3: split by 3rd delimiter
I need to split a string at the third occurrence of a delimiter and keep that delimiter in the output.
Code
text = 'sitting on a couch sitting on a chair sitting on a bench'
splitText = text.split('sitting')[1]
print(splitText)
Result
on a couch sitting on a chair sitting on a bench
Desired result
sitting on a bench
Notes
- The split function does not include "sitting" in the result when it is used as the delimiter.
- "sitting" should be included in the result.
'sitting' + text.split('sitting')[3]
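The one-liner above hard-codes the index 3 and raises an IndexError when the delimiter occurs fewer than three times. A minimal sketch of a reusable version follows; the helper name tail_from_nth is made up for illustration and is not part of the original answer. It relies on the maxsplit argument of str.split, so anything after the n-th delimiter (including further delimiters) is kept in one piece.

```python
def tail_from_nth(text, delim, n):
    """Return text from the n-th occurrence of delim (1-based), delim included.

    Hypothetical helper for illustration; returns '' when delim occurs
    fewer than n times.
    """
    parts = text.split(delim, n)  # at most n splits, so at most n + 1 parts
    if len(parts) <= n:
        return ""
    return delim + parts[n]


text = 'sitting on a couch sitting on a chair sitting on a bench'
print(tail_from_nth(text, 'sitting', 3))  # sitting on a bench
```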
You can simply split on the space before the last "sitting".
import re
x="sitting on a couch sitting on a chair sitting on a bench"
print(re.split(r"\s(?=\bsitting\b(?:(?!\bsitting\b).)*$)",x)[1])
Or split on a zero-width assertion, which re.split did not support before Python 3.7, but which the third-party regex module supports.
import regex
x="sitting on a couch sitting on a chair sitting on a bench"
print(regex.split(r"(?=sitting)",x,flags=regex.VERSION1)[3])
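Since Python 3.7, the standard re.split also splits on empty matches, so the same lookahead may work without the third-party module. A short sketch, assuming Python 3.7 or later; the variable names parts and third are only illustrative.

```python
import re

x = "sitting on a couch sitting on a chair sitting on a bench"

# The lookahead matches the empty string just before each "sitting",
# so every piece after the first starts with the delimiter itself.
parts = re.split(r"(?=sitting)", x)

# parts[0] is the empty string in front of the first match,
# which is why the third segment sits at index 3.
third = parts[3]
print(third)
```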
Or use findall.
import re
x="sitting on a couch sitting on a chair sitting on a bench"
print(re.findall(r"(sitting.*?(?=sitting|$))",x)[2])
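You can use the following regex:
r'(sitting.*){2}'
This regex matches two consecutive runs that each start with the word sitting. Because .* is greedy, the first run extends up to the last sitting, so the capture group ends up holding the final segment. You can then split text with re.split():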
>>> text = 'sitting on a couch sitting on a chair sitting on a bench'
>>> import re
>>> re.split(r'(sitting.*){2}',text)
['', 'sitting on a bench', '']
You can get the result with a generator expression and next:
>>> next(i for i in re.split(r'(sitting.*){2}',text) if i)
'sitting on a bench'
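Note that the capture group above holds the last "sitting" segment, which coincides with the third one only when the word occurs exactly three times. One way to target the n-th occurrence explicitly is to locate the n-th match with re.finditer and slice from its start. A minimal sketch; the helper name from_nth_match and the four-occurrence sample string are assumptions made for illustration.

```python
import re
from itertools import islice


def from_nth_match(text, pattern, n):
    """Return text starting at the n-th match of pattern (1-based), or ''."""
    # islice skips the first n - 1 matches; next() yields None if there
    # are fewer than n matches.
    nth = next(islice(re.finditer(pattern, text), n - 1, None), None)
    return text[nth.start():] if nth is not None else ""


text = 'sitting on a couch sitting on a chair sitting on a bench sitting on a stool'
print(from_nth_match(text, r'sitting', 3))
```

With four occurrences this returns everything from the third "sitting" onward, whereas the greedy (sitting.*){2} pattern would capture only the fourth segment.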
import re
text = 'sitting on a couch sitting on a chair sitting on a bench'
splitText = re.findall('sitting.*?(?= sitting|$)', text)
if len(splitText) >= 3:
    print(splitText[2])
Eric commented: "This is split by regex, not split by word by index." The question was tagged "regex", but if you want to use only indices, you can use one of the following approaches:
text = 'sitting on a couch sitting on a chair sitting on a bench'
delim = 'sitting'
# Nested find calls: each search starts one character past the previous hit.
print(text[text.find(delim, text.find(delim, text.find(delim)+1)+1):])
or
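def X(text, delim, n, pos=0):
    # Recursively skip past n occurrences of delim; once n reaches 0,
    # pos is one past the start of the n-th occurrence.
    idx = text.find(delim, pos)
    if idx >= 0 and n > 0:
        return X(text, delim, n-1, idx+1)
    if n > 0:
        # Fewer than n occurrences were found.
        return ""
    if idx > 0:
        # Stop just before the next occurrence of delim.
        return text[pos-1:idx]
    return text[pos-1:]

text = 'sitting on a couch sitting on a chair sitting on a bench'
print(X(text, 'sitting', 3))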
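The same index-based search can also be written as a loop, which avoids the pos-1 bookkeeping. A minimal sketch, assuming a hypothetical function name from_nth_find; unlike the recursive version, it returns everything from the n-th delimiter to the end of the string rather than stopping at the next delimiter.

```python
def from_nth_find(text, delim, n):
    """Return text from the n-th occurrence of delim (1-based), or ''."""
    if n < 1:
        raise ValueError("n must be at least 1")
    idx = -1
    for _ in range(n):
        # Resume the search one character past the previous hit.
        idx = text.find(delim, idx + 1)
        if idx < 0:
            return ""  # fewer than n occurrences
    return text[idx:]


text = 'sitting on a couch sitting on a chair sitting on a bench'
print(from_nth_find(text, 'sitting', 3))  # sitting on a bench
```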