re.sub only replaces the first two occurrences
I found this interesting problem with re.sub:
import re
s = "This: is: a: string:"
print(re.sub(r'\:', r'_', s, re.IGNORECASE))
>>>> This_ is_ a: string:
Note that only the first two occurrences were replaced. Passing the flags argument explicitly by name seems to fix the problem.
import re
s = "This: is: a: string:"
print(re.sub(r'\:', r'_', s, flags=re.IGNORECASE))
>>>> This_ is_ a_ string_
I was wondering if someone could explain this or if this is actually a bug.
I have run into a similar problem before when not naming the string argument, but never with flags, and with string it usually raises an error.
The fourth argument of re.sub is not flags, but count:
>>> import re
>>> help(re.sub)
Help on function sub in module re:
sub(pattern, repl, string, count=0, flags=0)
Return the string obtained by replacing the leftmost
non-overlapping occurrences of the pattern in string by the
replacement repl. repl can be either a string or a callable;
if a string, backslash escapes in it are processed. If it is
a callable, it's passed the match object and must return
a replacement string to be used.
>>>
This means you need to pass flags=re.IGNORECASE explicitly by keyword, or else re.IGNORECASE will be treated as the count argument.
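As a minimal sketch of the keyword form, assuming the same sample string as in the question (the variable names all_replaced and first_only are only illustrative), naming both optional arguments keeps either one from landing in the wrong slot:

```python
import re

s = "This: is: a: string:"

# Name the optional arguments so a flag can never be read as a count.
all_replaced = re.sub(r':', '_', s, flags=re.IGNORECASE)

# count and flags can be combined safely when both are named.
first_only = re.sub(r':', '_', s, count=1, flags=re.IGNORECASE)

print(all_replaced)  # This_ is_ a_ string_
print(first_only)    # This_ is: a: string:
```

With count left at its default of 0, every occurrence is replaced.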
Also, the flag re.IGNORECASE is equal to 2:
>>> re.IGNORECASE
2
>>>
So, by effectively passing count=re.IGNORECASE in the first example, you told re.sub to replace only 2 occurrences of : in the string, which is exactly what happened.
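The equivalence can be checked directly. This is a small sketch, assuming the question's sample string; flag_value and limited are illustrative names, and int() is used only so the value prints as a plain number on newer Python versions where the flag is an enum member:

```python
import re

s = "This: is: a: string:"

# The flag is an integer-valued constant, so it is a valid count.
flag_value = int(re.IGNORECASE)

# Same effect as the first example: the flag's value is used as count.
limited = re.sub(r':', '_', s, count=flag_value)

print(flag_value)  # 2
print(limited)     # This_ is_ a: string:
```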