Python regex with *?
* means "match the previous element as many times as possible (zero or more times)".
*? means "match the previous element as few times as possible (zero or more times)".
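To make the two definitions concrete, here is a minimal sketch. The sample string and variable names are made up for illustration and are not part of the question.

```python
import re

text = "aaa"  # illustrative input, not from the question

# Greedy: a* takes as many 'a' characters as it can.
greedy = re.match(r"a*", text).group()

# Lazy: a*? takes as few as it can, which here is zero.
lazy = re.match(r"a*?", text).group()

# Lazy quantifiers still expand when the rest of the pattern requires it:
# $ can only match at the end, so a*? is forced to consume all three.
lazy_anchored = re.match(r"a*?$", text).group()

print(repr(greedy))         # 'aaa'
print(repr(lazy))           # ''
print(repr(lazy_anchored))  # 'aaa'
```

The last case is worth keeping in mind: "as few as possible" means the fewest that still let the whole pattern match.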
The other answers have already covered this, but what they don't point out is how it changes this particular regex. If the re.DOTALL flag is provided, it makes a huge difference, because the dot (.) will also match line break characters when that flag is enabled. In that case .*[^\\]\n will match from the beginning of the string all the way to the last newline that is not preceded by a backslash (so it will span several lines).
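A small sketch of the re.DOTALL case, using an invented three-line string whose last line ends with a backslash (the string and variable names are assumptions for illustration only):

```python
import re

# Three lines; only the last one ends with a backslash before its newline.
s = "one\ntwo\nthree\\\n"

# With DOTALL, the greedy .* runs across newlines and backtracks to the
# LAST newline that is not preceded by a backslash.
greedy_dotall = re.findall(r'.*[^\\]\n', s, re.DOTALL)

# The lazy version stops at the FIRST such newline each time.
lazy_dotall = re.findall(r'.*?[^\\]\n', s, re.DOTALL)

# Without DOTALL, the greedy .* cannot cross a newline on its own.
greedy_plain = re.findall(r'.*[^\\]\n', s)

print(greedy_dotall)  # ['one\ntwo\n']
print(lazy_dotall)    # ['one\n', 'two\n']
print(greedy_plain)   # ['one\n', 'two\n']
```

In all three cases the line ending in a backslash is left out. What changes is whether the preceding lines come back as one match or as separate matches.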
If the re.DOTALL flag is not specified, the difference is more subtle: [^\\] will match any character other than a backslash, including line break characters. Consider the following example:
>>> import re
>>> s = "foo\n\nbar"
>>> re.findall(r'.*?[^\\]\n', s)
['foo\n']
>>> re.findall(r'.*[^\\]\n', s)
['foo\n\n']
So the purpose of this regex is to find non-empty lines that don't end with a backslash, but if you use .* instead of .*?, you will match an extra \n whenever an empty line follows a non-empty line.
This happens because .*? matches only fo, [^\\] matches the second o, and \n matches at the end of the first line. With .*, however, .* matches foo, [^\\] matches the \n that ends the first line, and the final \n in the pattern matches the next \n, because the second line is empty.
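One way to check this breakdown yourself is to wrap each part of the pattern in a capture group. The groups are added here only for inspection and are not part of the original regex:

```python
import re

s = "foo\n\nbar"

# Same patterns as above, with each piece wrapped in a group
# so we can see what it consumed.
lazy = re.match(r'(.*?)([^\\])(\n)', s)
greedy = re.match(r'(.*)([^\\])(\n)', s)

print(lazy.groups())    # ('fo', 'o', '\n')
print(greedy.groups())  # ('foo', '\n', '\n')
```

In the greedy case the character class [^\\] is the piece that swallows the first newline, which is why the match runs over the empty line.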
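From the Python documentation on *?:
*?, +?, ??
The *, + and ? qualifiers are all greedy; they match as much text as possible. Sometimes this behaviour isn't desired; if the RE <.*> is matched against <H1>title</H1>, it will match the entire string, and not just <H1>. Adding ? after the qualifier makes it perform the match in non-greedy or minimal fashion; as few characters as possible will be matched. Using .*? in the previous expression will match only <H1>.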
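The <.*> pattern and the <H1>title</H1> string from this answer can be tried directly. A minimal sketch (the variable names are mine):

```python
import re

html = "<H1>title</H1>"

# Greedy: .* runs to the end and backtracks to the LAST '>'.
greedy = re.match(r"<.*>", html).group()

# Non-greedy: .*? stops at the FIRST '>'.
lazy = re.match(r"<.*?>", html).group()

# With findall, the non-greedy form picks out each tag separately.
tags = re.findall(r"<.*?>", html)

print(greedy)  # <H1>title</H1>
print(lazy)    # <H1>
print(tags)    # ['<H1>', '</H1>']
```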