How can I remove all non-letter (all languages) and non-numeric characters from a string?

I've searched for quite a while, but I can't find any explanation on this matter.

If I have a string, say u'àaeëß35+{}"´', I want all non-alphanumeric characters removed, but I want to keep à, ë, ß, etc.

I'm new to Python and I couldn't find a regular expression that does this. The only other solution I can think of is to keep a list of the characters I want to remove and iterate through the string, replacing each of them.

What's the correct Pythonic solution here?

Thanks.



2 answers


In [63]: s = u'àaeëß35+{}"´'

In [64]: print ''.join(c for c in s if c.isalnum())
àaeëß35
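Since the question asks for a regular expression, here is a minimal sketch of one, assuming Python 3, where str patterns are Unicode-aware by default. On Python 2, as used in the session above, the pattern and input would need to be unicode strings and the re.UNICODE flag would have to be passed. The names NON_ALNUM and strip_non_alnum_re are made up for this illustration.

```python
import re

# [\W_] matches any character that is not a Unicode letter or digit.
# The underscore is listed explicitly because \w would otherwise keep it.
NON_ALNUM = re.compile(r'[\W_]+')


def strip_non_alnum_re(text):
    """Return text with every non-alphanumeric character removed."""
    return NON_ALNUM.sub('', text)


sample = 'àaeëß35+{}"´'
print(strip_non_alnum_re(sample))  # àaeëß35
```

For a one-off call the generator expression with isalnum() is probably simpler. A compiled pattern may be more convenient when the same cleanup runs over many strings.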

      





What about:



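def StripNonAlpha(s):
    # Note: isalpha() keeps letters only, so the digits are removed as well;
    # use isalnum() instead if the digits should be kept, as in the question.
    return "".join(c for c in s if c.isalpha())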
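To make the difference between the two predicates concrete, here is a small sketch that runs both on the string from the question. It assumes Python 3 (every str is Unicode), and the snake_case function names are illustrative, not taken from the answers.

```python
def strip_non_alpha(s):
    """Keep letters only; digits are dropped along with punctuation."""
    return "".join(c for c in s if c.isalpha())


def strip_non_alnum(s):
    """Keep letters and digits, which is what the question asks for."""
    return "".join(c for c in s if c.isalnum())


sample = 'àaeëß35+{}"´'
print(strip_non_alpha(sample))  # àaeëß
print(strip_non_alnum(sample))  # àaeëß35
```

Both predicates rely on Unicode character properties, so accented letters such as à, ë and ß pass without any explicit list of allowed characters.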

      
