How can I remove all non-letter (all languages) and non-numeric characters from a string?
I've searched for quite a while, but I can't find any explanation on this matter.
If I have a string, say u'àaeëß35+{}"´', I want all non-alphanumeric characters to be removed (however, I want to keep à, ë, ß, etc.).
I'm new to Python and I couldn't find a regular expression to do this task. The only other solution I can think of is to build a list of the characters I want to remove and loop over the string, replacing them.
What's the correct Pythonic solution here?
Thanks.
+3
Phil
2 answers
In [63]: s = u'àaeëß35+{}"´'
In [64]: print ''.join(c for c in s if c.isalnum())
àaeëß35
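The session above is Python 2 (print statement, u'' literal). For anyone on Python 3, here is a minimal sketch of the same idea; the function name keep_alnum is just made up for illustration. It relies on str.isalnum() being Unicode-aware, which is why à, ë and ß survive.

```python
def keep_alnum(s):
    """Return s with every character that is not a letter or digit removed."""
    # str.isalnum() is Unicode-aware in Python 3, so accented letters are kept.
    return "".join(c for c in s if c.isalnum())


print(keep_alnum('àaeëß35+{}"´'))  # àaeëß35
```

One caveat: if the input uses decomposed characters (a base letter followed by a separate combining accent), the combining mark is not alphanumeric and gets dropped, so it may be worth calling unicodedata.normalize('NFC', s) first.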
+9
root
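What about:
def StripNonAlpha(s):
    # Note: isalpha() keeps letters only, so digits are dropped as well;
    # use c.isalnum() instead if the digits should stay.
    return "".join(c for c in s if c.isalpha())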
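Since the question asks about a regular expression, here is a rough sketch of that route for Python 3, where \w is Unicode-aware for str patterns. The names NON_ALNUM and strip_non_alnum are hypothetical. On Python 2 the pattern would presumably need the re.UNICODE flag and a unicode string to behave the same way.

```python
import re

# \W matches anything that is not a word character. \w also accepts "_",
# so "_" is listed explicitly to have underscores removed as well.
NON_ALNUM = re.compile(r"[\W_]+")


def strip_non_alnum(s):
    """Remove every run of characters that are not letters or numbers."""
    return NON_ALNUM.sub("", s)


print(strip_non_alnum('àaeëß35+{}"´'))  # àaeëß35
```

For short strings the generator version is probably easier to read; the compiled pattern is mostly handy if the same cleanup is applied many times.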
+2
rodrigo