Regex for email matching

I need a regex that matches a local@domain email address with the following requirements:

The local part can contain a-z, A-Z, 0-9, period, underscore, and dash.

The domain can contain a-z, A-Z, 0-9, period, and dash. The domain must contain at least one dot.

How can I make sure that the domain cannot start or end with a period or dash, and that the domain contains at least one dot?

These are the two things that are really causing me problems.

I have tried the following:

Regex.IsMatch(email, @"(?:[^.-])([\w.-])@([\w.-])(?:[.-]$)");

      


4 answers


Correct answer:

For general use, ONLY check for the presence of @, that the local part is <= 64 characters, and that the whole address is <= 254 characters.

Yes, it is good to exclude completely illegal characters. And make sure you are using the last @ symbol and not the first.

You can, if you like, also check if the domain can be found in DNS.
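Those few checks might look roughly like this in C#. This is only a sketch: the names EmailSanity and LooksLikeEmail are made up for illustration, and rejecting an empty local part or an empty domain is an extra assumption that goes slightly beyond the list above.

```csharp
using System;

public static class EmailSanity
{
    // Hypothetical helper: applies only the minimal checks described above.
    public static bool LooksLikeEmail(string address)
    {
        // Whole address must be present and at most 254 characters.
        if (string.IsNullOrEmpty(address) || address.Length > 254)
            return false;

        // Split on the LAST '@': a quoted local part may itself contain '@'.
        int at = address.LastIndexOf('@');

        // Assumption: also reject a missing '@', an empty local part, or an empty domain.
        if (at < 1 || at == address.Length - 1)
            return false;

        // 'at' is the index of the last '@', which equals the local part's length.
        return at <= 64;
    }
}
```

With this sketch, "first+tag@example.com" passes, while "no-at-sign" and a 65-character local part do not.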

Why am I shouting this in giant type? Because so, so many sites get this completely and horribly wrong.

Read the following first: "I Knew How to Validate an Email Address Until I Read the RFC".

Then reason with me for a moment. Consider, as the article says:

  • Aside from completely forbidden characters that cause routing problems, the ONLY determiner of whether an email address is valid is the domain owner who issued that email address.
  • The ONLY determiner of whether an email address will actually reach someone is to send an email to the address and see if they receive it.

Think about postal mail. Suppose someone gives you a funny address, say this one:

AAB!129 Thor Circle 1/2 atomized Pile$
Armelioborrigenduliamo, GRICKL, θ-niner *
18957382:90347342;21017900~19127734.6
THE MOON

      

Since you've probably never sent mail to the moon before, are you sure you want to judge lunar mailing addresses by the standards of the region you're familiar with? How do you know the address is invalid? What if these people just do it weird? If you were a company planning to do business with these customers and make tons of money, why would it bother you that their address is strange, as long as the address works?

In fact, this reality, that you cannot validate an address issued by another authority, is borne out by standard business practice for cleaning postal addresses in the United States: when someone gives you a postal address, you make an API call to the U.S. Postal Service asking whether it is a valid address and, further, asking for its canonical form. This is because only the post office can tell you if the address is valid. And even then, you don't know if your letter will reach someone until you try to send it!



Why would you make someone give up a perfectly valid email address, one known by their email provider to be valid (like sending mail to another country or even another planet), just because its format is one you are not accustomed to, or one you ignorantly assume is wrong?

If you are just trying to avoid incorrect email addresses due to typos, you can still do that. Show the user a message: "Hey, something about your address doesn't look quite right. Are you sure it should contain the characters you entered: ?!#$%^&*()"{}[]`~ Remember, if we can't send you an email, you can't create an account." Then people get a warning, but if they really want to, they can still submit it. (Okay, yes, exclude completely prohibited characters; not everything in that list is necessarily valid. Look it up. Really, you should not take the word of some random internet user. Know for yourself.)

Go ahead and even make it a little painful: make them submit twice. Or make them check a box and submit a second time. Just don't stop them from using what they want.
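A warn-but-allow check could be sketched like this. The class and method names are hypothetical, and the character list is simply the one from the example message above; it is used only to produce a warning, never to reject the address.

```csharp
using System;
using System.Linq;

public static class EmailWarnings
{
    // Characters unusual enough to deserve a second look (taken from the message above).
    // Assumption: this list drives a warning only; it never blocks submission.
    private const string Suspicious = "?!#$%^&*()\"{}[]`~";

    // Hypothetical helper: returns a warning to show the user, or null if nothing looks odd.
    public static string GetWarning(string address)
    {
        char[] found = address
            .Where(c => Suspicious.IndexOf(c) >= 0)
            .Distinct()
            .ToArray();

        if (found.Length == 0)
            return null;

        return "Something about your address doesn't look quite right. " +
               "Are you sure it should contain: " + new string(found) + " ?";
    }
}
```

The caller would show the message and then accept the same address if the user submits it a second time.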

I personally have sometimes decided NOT to use websites or services that could not accept email addresses with a plus sign in the local part (before the @). And if I just have to have an account, I'll grit my teeth, get a little annoyed, and then submit a different address than the one I really want to use.

If you really want to reduce the number of customers you can do business with, then go ahead and be too strict...

Okay, by now you hate me

You think I'm overreacting. You just want to check your email addresses! Is it really that hard? In fact, you're going to ignore me and go ahead and write a regex that does the job well enough.

Fine. If you won't listen to reason, then here is a regex for you that does it right. (Only, I don't actually know if it does it right, but I bet it is a darn sight closer than anything anyone here is going to come up with in less than days and days of work.)

Magic regex for email validation

(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:
\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(
?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ 
\t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\0
31]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\
](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+
(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:
(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)
?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\
r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[
 \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)
?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t]
)*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[
 \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*
)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t]
)+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)
*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+
|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r
\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:
\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t
]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031
]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](
?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?
:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?
:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)|(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?
:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?
[ \t]))*"(?:(?:\r\n)?[ \t])*)*:(?:(?:\r\n)?[ \t])*(?:(?:(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|
\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>
@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"
(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?
:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[
\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:[^()<>@,;:\\".\[\] \000-
\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(
?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)?[ \t])*(?:@(?:[^()<>@,;
:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([
^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\"
.\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\
]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\
[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\
r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] 
\000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]
|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?(?:[^()<>@,;:\\".\[\] \0
00-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\
.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,
;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|"(?
:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*))*@(?:(?:\r\n)?[ \t])*
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t])*(?:[
^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\]
]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(?:\r\n)?[ \t])*)(?:,\s*(
?:(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(
?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[
\["()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t
])*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t
])+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?
:\.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|
\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*|(?:
[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".\[\
]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)*\<(?:(?:\r\n)
?[ \t])*(?:@(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["
()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)
?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>
@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*(?:,@(?:(?:\r\n)?[
 \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,
;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\.(?:(?:\r\n)?[ \t]
)*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\
".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*)*:(?:(?:\r\n)?[ \t])*)?
(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\["()<>@,;:\\".
\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])*)(?:\.(?:(?:
\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z|(?=[\[
"()<>@,;:\\".\[\]]))|"(?:[^\"\r\\]|\\.|(?:(?:\r\n)?[ \t]))*"(?:(?:\r\n)?[ \t])
*))*@(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])
+|\Z|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*)(?:\
.(?:(?:\r\n)?[ \t])*(?:[^()<>@,;:\\".\[\] \000-\031]+(?:(?:(?:\r\n)?[ \t])+|\Z
|(?=[\["()<>@,;:\\".\[\]]))|\[([^\[\]\r\\]|\\.)*\](?:(?:\r\n)?[ \t])*))*\>(?:(
?:\r\n)?[ \t])*))*)?;\s*)

      

Of course, it has been split across lines here. Remove the line breaks before using it.
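If the pattern is pasted into code as wrapped text, one way to rejoin it is sketched below. The helper name is made up for illustration. It deliberately removes only the line-break characters, because several of the wrapped lines end in a space that is part of the pattern. Whether the rejoined Perl-style pattern then behaves identically under .NET is not something this sketch checks.

```csharp
using System;

public static class PatternText
{
    // Hypothetical helper: strips only CR and LF characters.
    // Trailing spaces on each line are kept because they belong to the pattern.
    public static string JoinLines(string wrapped)
    {
        return wrapped.Replace("\r", "").Replace("\n", "");
    }
}
```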


In a fit of dark and self-destructive madness, I decided to answer your question.

Your exact requirements:

  • The local part can contain a-z, A-Z, 0-9, period, underscore, and dash.
  • The domain can contain a-z, A-Z, 0-9, period, and dash.
  • The domain must contain at least one dot.
  • The domain cannot start or end with a period or dash.

A regex that meets these exact requirements and no more (case-insensitive):

^[\w.-]+@(?=[a-z\d][^.]*\.)[a-z\d.-]*(?<![.-])$

      

Try it at regex101.com
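For reference, this is roughly how the pattern above could be wired up in C#. The class name is hypothetical; RegexOptions.IgnoreCase supplies the case-insensitivity mentioned above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictEmail
{
    // The pattern from this answer, compiled once and matched case-insensitively.
    private static readonly Regex Pattern = new Regex(
        @"^[\w.-]+@(?=[a-z\d][^.]*\.)[a-z\d.-]*(?<![.-])$",
        RegexOptions.IgnoreCase);

    public static bool IsMatch(string email)
    {
        return Pattern.IsMatch(email);
    }
}
```

StrictEmail.IsMatch("john.doe@example.com") returns true, while "john@localhost" (no dot), "john@-example.com" (leading dash), and "john@example.com." (trailing period) all return false.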

Please upvote this post only as much as you have upvoted other posts that attempt to answer the question as asked.

Then see my other answer on this page and upvote that one.

Notes:

In C#, \w includes a wide range of Unicode characters. This may or may not be what you want. If not, you can leave the regex as it is and use ECMAScript-compliant mode (RegexOptions.ECMAScript). Or you can just change \w to a-z0-9_ (inside the square brackets). But \w is shorter.

\d also matches some additional digit characters:

"\d matches any decimal digit. It is equivalent to the \p{Nd} regular expression pattern, which includes the standard decimal digits 0-9 as well as the decimal digits of a number of other character sets."

You can use ECMAScript-compliant mode again, or just change \d to 0-9. But \d is shorter.
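A small sketch makes the difference visible. The class and member names are hypothetical; the test character is U+0663 (ARABIC-INDIC DIGIT THREE), which belongs to the Nd category.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitDemo
{
    // U+0663 is ARABIC-INDIC DIGIT THREE, a Unicode decimal digit (category Nd).
    public const string ArabicThree = "\u0663";

    // Default .NET behavior: \d matches any Unicode decimal digit.
    public static bool DefaultMatches(string s)
    {
        return Regex.IsMatch(s, @"^\d$");
    }

    // ECMAScript-compliant mode: \d is restricted to 0-9.
    public static bool EcmaMatches(string s)
    {
        return Regex.IsMatch(s, @"^\d$", RegexOptions.ECMAScript);
    }

    // Explicit character class: same restriction, no option needed.
    public static bool ExplicitMatches(string s)
    {
        return Regex.IsMatch(s, @"^[0-9]$");
    }
}
```

All three methods accept "7"; only DefaultMatches accepts DigitDemo.ArabicThree.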


EDIT

I think this will work:

Regex regex = new Regex(@"^([\w\.\-]+)@((?!\.|\-)[\w\-]+)((\.(\w){2,3})+)$");

      

NOTE. Domain names cannot have periods and spaces.

However, instead of using a regular expression, you can try the MailAddress class. This way, you don't have to puzzle over yet another regex.

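// Requires: using System; using System.Net.Mail;
public bool IsEmailValid(string address)
{
    try
    {
        // The MailAddress constructor parses the address and throws if the format is invalid.
        MailAddress m = new MailAddress(address);
        return true;
    }
    catch (FormatException)
    {
        // Note: null or empty input throws ArgumentNullException / ArgumentException instead.
        return false;
    }
}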
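One thing to keep in mind is that MailAddress also accepts input that carries a display name, such as "John Doe <john@example.com>". If only the bare address should pass, a stricter variant could compare the parsed Address property with the input. This is a sketch; the class and method names are made up for illustration.

```csharp
using System;
using System.Net.Mail;

public static class MailAddressCheck
{
    // Hypothetical stricter variant: accept only input that is exactly the bare address.
    public static bool IsPlainEmail(string input)
    {
        // The constructor throws ArgumentNullException / ArgumentException for null or empty input.
        if (string.IsNullOrWhiteSpace(input))
            return false;

        try
        {
            var parsed = new MailAddress(input);
            // Address holds just user@host; it differs from the input when a display name was supplied.
            return parsed.Address == input;
        }
        catch (FormatException)
        {
            return false;
        }
    }
}
```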

      



You should learn regex from a source like http://www.regular-expressions.info. A lot of missing knowledge shows in the attempt and in how the problem has been posed (even ignoring that a custom regex is almost certainly the wrong approach, although it might be a useful prefilter).

Why does this regex not work: @"(?:[^.-])([\w.-])@([\w.-])(?:[.-]$)"

I'll explain by translating the regex into English (which is a great technique for regex in general):

First, none of the parentheses here serve a functional purpose, so I'll ignore them (see the tutorial for what they mean).

local

  • [^.-] - 1 character that is not a dot or dash
  • [\w.-] - 1 character that is alphanumeric, underscore, dot, or dash

Thus, the local definition is any string that ends with a 2-character string meeting the above restrictions.

@ - the literal '@' character

domain

  • [\w.-] - 1 character that is alphanumeric, underscore, dot, or dash
  • [.-] - 1 character that is a dot or dash
  • $ - end of line

So the domain definition is a 2-character string with the restrictions above.

This is clearly a far cry from the stated requirements.
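The reading above is easy to confirm by running the pattern from the question unchanged. The wrapper class is hypothetical and exists only to make the check repeatable.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OriginalAttempt
{
    // The pattern from the question, unchanged.
    public const string Pattern = @"(?:[^.-])([\w.-])@([\w.-])(?:[.-]$)";

    public static bool IsMatch(string email)
    {
        return Regex.IsMatch(email, Pattern);
    }
}
```

OriginalAttempt.IsMatch("john@example.com") returns false, because the domain must be exactly two characters and end in a dot or dash. OriginalAttempt.IsMatch("ab@c.") and OriginalAttempt.IsMatch("whatever-xy@z-") both return true.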

What is a regular expression that satisfies given constraints?

Regexes are essentially evaluated left to right, in sequence. Express your constraints as a sequential set of descriptions, then translate them into regex constructs. I'll do this for the well-defined constraints (which I don't think are complete). Mentally insert "followed by" between each line.

start of line - ^ - the regex way to express the start of a line

local - [\w._-]* - any number of characters that are alphanumeric, underscore, dot, or dash

@ - @ - the literal '@' character

domain

The key requirement is at least 1 dot. This dot will be explicitly present in the regular expression, so think of the domain as {preDot}{dot}{postDot}. For simplicity, define {dot} as the first occurrence of '.'.

  • \w - a single alphanumeric character - this is the "does not start with a dot or dash" requirement
  • [\w-]* - any number of characters that are alphanumeric or dash
  • \. - a single (first) dot character - this is the special dot that must exist
  • (\w*[\.-])* - any number of (any number of alphanumeric characters followed by a dot or dash)
  • [\w-]+ - 1 or more alphanumeric characters or dashes - this is the "must not end with a period" requirement

end of line - $ - the regex way to express the end of a line

And here is the relevant code:

// The local part, matching the description above (\w already covers the underscore).
var local = @"[\w._-]*";

// The domain, built from its three pieces.
var preDot = @"\w[\w-]*";
var dot = @"\.";
var postDot = @"(\w*[\.-])*[\w-]+";
var domain = $"{preDot}{dot}{postDot}";

var email = $"^{local}@{domain}$";

FYI - the regex ends up as ^[\w._-]*@\w[\w-]*\.(\w*[\.-])*[\w-]+$, but that pretty much doesn't matter. It would be terrible to try to understand / maintain / modify it as a single line, whereas the broken-out form makes that feasible.
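To try the composed pattern out, the pieces can be wrapped in a small class. This is a sketch that uses the local-part class from the description above; the class name is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ComposedEmailRegex
{
    // Each piece mirrors one line of the description above.
    private const string Local = @"[\w._-]*";
    private const string PreDot = @"\w[\w-]*";
    private const string Dot = @"\.";
    private const string PostDot = @"(\w*[\.-])*[\w-]+";

    // The full pattern, assembled left to right.
    public static readonly string Pattern = $"^{Local}@{PreDot}{Dot}{PostDot}$";

    public static bool IsMatch(string email)
    {
        return Regex.IsMatch(email, Pattern);
    }
}
```

ComposedEmailRegex.IsMatch("john.doe@example.com") returns true, while "john@localhost", "john@.example.com", and "john@example.com." return false.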
