Splitting a string into substrings in Lua
I am trying to split a string into substrings using Lua. Using the pattern in the for loop below, I would expect 4 matches, but I only get 2.
print(words[1])
displays
"###Lorem ipsum dolor sit amet, Gruß consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam \n"
and print(words[2])
displays
"###At vero eos et accusam et justo duo dolores et ea rebum. Stet clita \nkasd gubergren, no sea takimata Gruß sanctus est \n"
Can anyone explain this behavior to me?
i=0
content = "###Lorem ipsum dolor sit amet, Gruß consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam \n ###voluptua. ###At vero eos et accusam et justo duo dolores et ea rebum. Stet clita \nkasd gubergren, no sea takimata Gruß sanctus est \n###XLorem ipsum dolor sit amet. Lorem ipsum \ndolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor \ninvidunt ut labore et Gruß dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.sdl"
for word in string.gmatch(content, '###')
do i = i+1 end
if(i>1) then
content = content .. '###'
else end
words= {}
for y in string.gmatch(content,"(###.-)###")
do
table.insert(words, y)
end
print(words[3])
This is a simplified version of your second loop:
content = '###aa###bb###cc###dd###'
words= {}
for y in string.gmatch(content,"(###.-)###") do
print(y)
table.insert(words, y)
end
Output:
###aa
###cc
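To see where the missing pieces go, it can help to record the start and end position of each match. This is only a sketch for inspection, not part of the original loop: it uses string.find with an explicit start position, and the names spans and pos are illustrative.

```lua
local content = '###aa###bb###cc###dd###'
local spans = {}   -- illustrative: one {start, stop, capture} entry per match
local pos = 1      -- where the next search begins

while true do
  -- string.find returns start index, end index, then the captures
  local s, e, m = string.find(content, "(###.-)###", pos)
  if not s then break end
  spans[#spans + 1] = { s, e, m }
  print(s, e, m)
  pos = e + 1      -- continue after the whole match, as gmatch does
end
```

The second match starts at position 11 rather than 6, because the first match already ended at position 8.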
The problem is that with the pattern (###.-)### the second ### is also consumed, so the next match cannot start on it. What you need is a lookahead, as in the regex (###.+?)(?=###). Unfortunately, Lua patterns do not support lookahead. This is one possible workaround:
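local left = content
while true do
  -- index is the position of the last character of the closing "###"
  local start, index, match = string.find(left, "(###.-)###")
  if not start then break end
  print(match)
  -- keep the closing "###" (3 characters) so it can open the next match
  left = left:sub(index - 2)
end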
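Another option, sketched here under the assumption that each piece should simply run from one ### up to the next, is to avoid patterns and locate the delimiters with a plain-text string.find. The helper name split_keep_delim is hypothetical and not from the question.

```lua
-- Hypothetical helper: split `s` into pieces that each begin with `delim`.
-- Any text before the first delimiter is ignored.
local function split_keep_delim(s, delim)
  local pieces = {}
  -- the fourth argument `true` makes find treat delim as plain text
  local first = string.find(s, delim, 1, true)
  while first do
    -- look for the next delimiter after the current one
    local next_start = string.find(s, delim, first + #delim, true)
    -- take everything up to just before the next delimiter, or to the end
    pieces[#pieces + 1] = string.sub(s, first, next_start and next_start - 1 or -1)
    first = next_start
  end
  return pieces
end

local words = split_keep_delim('###aa###bb###cc###dd', '###')
for _, w in ipairs(words) do
  print(w)
end
```

With this approach there is no need to append a trailing ### to content first. If the input does end with the delimiter, the last piece is the bare delimiter.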