Why doesn't the non-greedy match give me only the name of the image?

local s = "http://example.com/image.jpg"
print(string.match(s, "/(.-)%.jpg"))

      

It gives me

--> /example.com/image

      

But I would like to receive

--> image

      



3 answers


Since the pattern engine processes the string from left to right, your pattern found the first / and then .- matched any characters (.), as few as possible (-), up to the first literal . (matched by %.) that is followed by the substring jpg.
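To see where the match actually starts, string.find can be used instead of string.match, since it also returns the start and end positions. This is only a small sketch using the string from the question:

```lua
local s = "http://example.com/image.jpg"

-- string.find returns the start index, the end index, and then any captures.
local first, last, capture = string.find(s, "/(.-)%.jpg")

-- The match begins at index 6, the first / in "http://", so the
-- non-greedy capture has to grow until it reaches ".jpg".
print(first, last, capture)
-- => 6    28    /example.com/image
```

The non-greedy quantifier only controls how far the capture extends to the right; it does not move the starting point of the match.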


You need to use a negated character class, [^/] (which matches any character except /), instead of . which matches any character:

local s = "http://example.com/image.jpg"
print(string.match(s, "/([^/]+)%.jpg"))
-- => image

      




[^/] matches any character except /, so the capture can never span a slash. The leading / in the pattern "/([^/]+)%.jpg" therefore ends up matching the last / in your string, and the capture holds only the file name.

Removing the first / from the pattern is not a good idea, as it forces the engine to take more redundant steps while trying to find a match. The / "anchors" the quantified subpattern to that character: it is easier for the engine to find a / than to start searching for an unknown number of characters other than /.

If you are sure that this match appears at the end of the string, add $ to the end of the pattern (it's not really clear whether you need it, but it is probably safer in general).
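As a rough illustration of what the $ anchor changes, the sketch below runs the anchored pattern against the original string and against a second URL with a query string. The second URL is a made-up input, not something from the question:

```lua
-- Same pattern as above, with $ so that ".jpg" must end the string.
local anchored = "/([^/]+)%.jpg$"

local plain = string.match("http://example.com/image.jpg", anchored)
-- Hypothetical URL where ".jpg" is not at the very end.
local with_query = string.match("http://example.com/image.jpg?size=large", anchored)

print(plain)       -- => image
print(with_query)  -- => nil
```

Without the $, the second call would return image as well, so whether to add the anchor depends on which inputs you want to accept.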



If you are sure there is a / in the string just before the file name, this works:

local s = "http://example.com/image.jpg"
print(string.match(s, ".*/(.-)%.jpg"))
-- => image

      



The greedy match .*/ will stop at the last /, as required.
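A quick way to check this is to try a URL with a deeper path. The URL below is a hypothetical example, and the second pattern is the negated-class one from the other answer, shown only for comparison:

```lua
-- Hypothetical URL with several path segments.
local deep = "http://example.com/albums/2016/photo.jpg"

-- Greedy .* first runs to the end of the string, then backs off
-- until the following / can match, which is the last slash.
local via_greedy = string.match(deep, ".*/(.-)%.jpg")

-- Negated class: the capture cannot contain a slash at all.
local via_class = string.match(deep, "/([^/]+)%.jpg")

print(via_greedy, via_class)
-- => photo    photo
```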



Why doesn't the non-greedy match give me only the name of the image?

To answer the question directly: .- does not guarantee the shortest match, since the left side of the match is still anchored at the current position, and if something matches at that position it is returned as the result. Non-greedy means that the quantifier consumes the fewest characters matched by its pattern that still let the rest of the pattern match. This is why using [^/]- fixes the pattern, as it finds the shortest run of characters that are not slashes, and why using .*/.- works, since in that case .* greedily consumes everything and then backtracks until the rest of the pattern is satisfied (which gives the same result in this case).
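The point about the left side being fixed can be seen on a tiny string. The sketch below uses a made-up input, "aaab", alongside the URL from the question:

```lua
-- Non-greedy does not mean "shortest substring anywhere in the string":
-- the match starts at the first position where it can succeed.
local whole = string.match("aaab", "a-b")
print(whole)
-- => aaab   (not "ab" or "b")

local s = "http://example.com/image.jpg"
-- With [^/]- the capture cannot cross a slash, so attempts that start
-- at the earlier slashes fail and only the last segment matches.
local name = string.match(s, "/([^/]-)%.jpg")
print(name)
-- => image
```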
