Python regex doesn't detect square brackets

I have a script that should remove all special characters except spaces from some content. I am working with Python and used this regex:

re.sub(r"[^a-zA-z0-9 ]+","",content)

      

It removed all special characters except the square brackets [ ], and I don't know why this is happening.

After that I tried this regex:

content = re.sub(r"[^a-zA-z0-9 ]+|\[|\]","",content)

      

It works flawlessly in IDLE and removes all special characters, but when I run it on large files such as a Wikipedia page, it no longer removes the closing square brackets ].

I just don't understand why Python shows this strange behavior.



1 answer


You have a lowercase z where it should be uppercase. Change:

re.sub(r"[^a-zA-z0-9 ]+","",content)

      



to

re.sub(r"[^a-zA-Z0-9 ]+","",content)

      




The range 'A-z' expands to the characters A...Z, [, \, ], ^, _, `, a...z, which is why your regex was stripping everything but those characters.

In the ASCII table, the characters [ \ ] ^ _ ` (codes 91 to 96) sit between Z (90) and a (97).
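To see this for yourself, here is a minimal sketch that lists the characters falling between Z and a, then compares the original pattern with the corrected one. The `sample` string and the variable names are made up for illustration and are not from the question.

```python
import re

# Characters that sit strictly between 'Z' and 'a' in ASCII (codes 91 to 96).
between = "".join(chr(code) for code in range(ord("Z") + 1, ord("a")))
print(between)  # [\]^_`

# Hypothetical sample input, not taken from the question.
sample = "Hello [world]_^ 42!"

# Original pattern: the range A-z also covers [ \ ] ^ _ `, so they are kept.
buggy = re.sub(r"[^a-zA-z0-9 ]+", "", sample)

# Corrected pattern: A-Z covers only the uppercase letters.
fixed = re.sub(r"[^a-zA-Z0-9 ]+", "", sample)

print(buggy)  # Hello [world]_^ 42
print(fixed)  # Hello world 42
```

With the typo, only the `!` is stripped from the sample; with `A-Z`, the brackets, underscore and caret are removed as well.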
