RegEx to match end of line

I am writing a regex to match email addresses in a text document. For starters, I came up with this:

((?:[a-zA-Z]+[\w+\.\-]+[\-a-zA-Z]+))[ ]*((?:@|at))[ ]*(?:[a-zA-Z\.]+)

      

I want to make sure the email address ends in "edu" or "com". How should I do that? I am using Python.

Some examples of email addresses from my text document:

alice @ so.edu
alice at sm.so.edu
alice @ sm.com

      

Edit:

I only want to modify this regex; it already matches the other examples in my data.



2 answers


((?:[a-zA-Z]+[\w+\.\-]+[\-a-zA-Z]+))[ ]*((?:@|at))[ ]*(?:[a-zA-Z\.]+)\.(com|edu)
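A minimal sketch of how this pattern could be tried against the sample lines from the question. The names `EMAIL_RE` and `find_tld` and the `.org` line are illustrative assumptions, not part of the original answer.

```python
import re

# Pattern from the answer above, unchanged. The raw string keeps the
# backslashes literal so the regex engine sees them as written.
EMAIL_RE = re.compile(
    r"((?:[a-zA-Z]+[\w+\.\-]+[\-a-zA-Z]+))[ ]*((?:@|at))[ ]*(?:[a-zA-Z\.]+)\.(com|edu)"
)

samples = [
    "alice @ so.edu",
    "alice at sm.so.edu",
    "alice @ sm.com",
    "alice @ sm.org",  # hypothetical negative case, not from the question
]


def find_tld(text):
    """Return the captured suffix ("com" or "edu"), or None if nothing matches."""
    match = EMAIL_RE.search(text)
    # Group 3 is the (com|edu) group; the domain group before it is non-capturing.
    return match.group(3) if match else None


for line in samples:
    print(line, "->", find_tld(line))
```

Note that nothing in the pattern anchors the end of the address, so a longer suffix that merely starts with "edu" or "com" would still produce a match on its prefix.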

      

EDIT: to accept "dot" as well as ".":



((?:[a-zA-Z]+[\w+\.\-]+[\-a-zA-Z]+))[ ]*((?:@|at))[ ]*(?:[a-zA-Z\.]+) *(\.|dot) *(com|edu)
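If the suffix really has to be the end of the address, one possible option is to append a negative lookahead to the pattern above. The lookahead `(?!\.?\w)` and the names `DOT_RE` and `GUARDED_RE` are assumptions added here for illustration; they are not part of the original answer.

```python
import re

# The "dot" pattern from the answer above, unchanged.
BASE = (
    r"((?:[a-zA-Z]+[\w+\.\-]+[\-a-zA-Z]+))[ ]*((?:@|at))[ ]*"
    r"(?:[a-zA-Z\.]+) *(\.|dot) *(com|edu)"
)

DOT_RE = re.compile(BASE)

# Assumption: reject a match when the suffix is followed by more word
# characters ("education") or by another dotted label (".edu.au").
# A plain sentence-ending period is still allowed.
GUARDED_RE = re.compile(BASE + r"(?!\.?\w)")


def tld(pattern, text):
    """Return the suffix captured by group 4, or None if there is no match."""
    match = pattern.search(text)
    return match.group(4) if match else None


print(tld(DOT_RE, "alice @ so dot edu"))
print(tld(DOT_RE, "alice @ so.education"))
print(tld(GUARDED_RE, "alice @ so.education"))
```

The unguarded pattern accepts "alice @ so.education" because it stops after "edu"; the guarded variant refuses it.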

      



First of all, see this answer for an explanation of how to match all valid email addresses as per RFC822.

Personally, I would not change the regexp. Instead, I would run email.utils.parseaddr() (spelled email.Utils in older Python 2 releases) on each regexp match and check that the resulting address .endswith("edu") or .endswith("com"). For example:



>>> import email.utils
>>> email.utils.parseaddr("kimvais@mailinator.com")[1].endswith(".com")
True
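The same check can be wrapped in a small helper. This is only a sketch using the Python 3 spelling `email.utils`; the names `ALLOWED_SUFFIXES` and `has_allowed_suffix` are hypothetical, and it assumes the candidate is already in ordinary `user@host` form rather than the spaced "alice @ so.edu" style from the question.

```python
from email.utils import parseaddr

# Assumption: only these two suffixes are wanted, as stated in the question.
ALLOWED_SUFFIXES = (".edu", ".com")


def has_allowed_suffix(candidate):
    """Parse the candidate and test the suffix of the address part."""
    _name, address = parseaddr(candidate)
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return address.lower().endswith(ALLOWED_SUFFIXES)


for candidate in ["kimvais@mailinator.com", "someone@example.org"]:
    print(candidate, has_allowed_suffix(candidate))
```

Keeping the suffix test outside the regex means the list of accepted endings can change without touching the pattern.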

      
