Parsing Email Addresses with Regular Expressions

A lenient and strict method along with examples

By steve on January 09, 2007.
Updated on January 22, 2012.

Summary


Email validation is a common task in an ASP.NET page where users need to enter their email addresses. Most of the time anything shaped like a@b.c is accepted as an email address, but you might like to do better than that.

The RegularExpressionValidator in .NET 1.1 provides a lenient Regex pattern for parsing an email address. If you don't need the strict pattern, use the lenient one; it will stand the test of time better.

Here are the regular expression patterns:

Email Regex from the .NET 1.1 Regular Expression Validator

string patternLenient = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

string patternStrict = @"^(([^<>()[\]\\.,;:\s@\""]+" 
   + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@" 
   + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" 
   + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+" 
   + @"[a-zA-Z]{2,}))$";

Use the following method to test the regular expressions. Copy the method into the code-behind of an ASPX page that has a Label control named lblOutput. Don't forget to add the "using" directives to your file: "using System.Text.RegularExpressions;" for Regex and "using System.Collections;" for ArrayList.

Test Email Regular Expressions

public void TestEmailRegex()
{
   // Lenient pattern from the .NET 1.1 RegularExpressionValidator.
   // It has no ^ or $ anchors, so IsMatch succeeds if any substring matches.
   string patternLenient = @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";
   Regex reLenient = new Regex(patternLenient);

   // Strict pattern. It is anchored, so the whole string must match.
   string patternStrict = @"^(([^<>()[\]\\.,;:\s@\""]+" 
      + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@" 
      + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" 
      + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+" 
      + @"[a-zA-Z]{2,}))$";
   Regex reStrict = new Regex(patternStrict);

   ArrayList samples = new ArrayList();
   samples.Add("joe");
   samples.Add("joe@home");
   samples.Add("a@b.c");
   samples.Add("joe@home.com");
   samples.Add("joe.bob@home.com");
   samples.Add("joe-bob[at]home.com");
   samples.Add("joe@his.home.com");
   samples.Add("joe@his.home.place");
   samples.Add("joe@home.org");
   samples.Add("joe@joebob.name");
   samples.Add("joe.@bob.com");
   samples.Add(".joe@bob.com");
   samples.Add("joe<>bob@bob.come");
   samples.Add("joe&bob@bob.com");
   samples.Add("~joe@bob.com");
   samples.Add("joe$@bob.com");
   samples.Add("joe+bob@bob.com");
   samples.Add("o'reilly@there.com");

   string output = "<table border=1>";
   output += "<tr><td><b>Email</b></td><td><b>Pattern</b>"
      + "</td><td><b>Valid Email?</b></td></tr>";
   bool toggle = true;
   foreach (string sample in samples)
   {
      // Alternate the background so both rows for a sample share a color.
      string bgcol = toggle ? "gainsboro" : "white";
      toggle = !toggle;

      // Some samples contain <, > or &, so HTML-encode them before
      // writing them into the table markup.
      string rowStart = "<tr bgcolor=" + bgcol + "><td>"
         + Server.HtmlEncode(sample) + "</td><td>";

      string lenientResult = reLenient.IsMatch(sample)
         ? "Is Valid" : "Is NOT Valid";
      output += rowStart + "Lenient</td><td>" + lenientResult + "</td></tr>";

      string strictResult = reStrict.IsMatch(sample)
         ? "Is Valid" : "Is NOT Valid";
      output += rowStart + "Strict</td><td>" + strictResult + "</td></tr>";
   }
   output += "</table>";

   lblOutput.Text = output;
}

Below is the output of the test method. Most of the time the lenient and strict patterns agree, but in some cases they differ. For example, "a@b.c" passes the lenient test and fails the strict test, because the strict pattern requires a top-level domain of at least two letters. Determining what characters can be used in an email address is almost more art than science. Most ASCII characters are allowed, but not space, <, >, [, ], " and a few others, and in practice many mail servers and email applications add restrictions of their own.
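The strict pattern also has a few branches that the sample list does not exercise: a quoted local part and a bracketed IP address as the domain. Here is a minimal console sketch that probes those branches. The class name StrictPatternDemo, the helper IsStrictMatch and the extra sample addresses are my own for illustration; only the pattern itself comes from the article.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictPatternDemo
{
    // The article's strict pattern, unchanged.
    private static readonly Regex Strict = new Regex(
        @"^(([^<>()[\]\\.,;:\s@\""]+"
        + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@"
        + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
        + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+"
        + @"[a-zA-Z]{2,}))$");

    // Hypothetical helper: true if the whole string matches the strict pattern.
    public static bool IsStrictMatch(string candidate)
    {
        return Strict.IsMatch(candidate);
    }

    public static void Main()
    {
        string[] samples =
        {
            "a@b.c",                 // one-letter top-level domain
            "a@b.co",                // two-letter top-level domain
            "joe bob@home.com",      // bare space in the local part
            "\"joe bob\"@home.com",  // same local part, but quoted
            "joe@[192.168.0.1]"      // bracketed IP address as the domain
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0,-22} {1}", sample, IsStrictMatch(sample));
        }
    }
}
```

Only "a@b.c" and the unquoted "joe bob@home.com" should come back False here. Note that the IP branch checks the shape only (one to three digits per part), not that each number is in the 0 to 255 range.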

We know that the lenient pattern will often accept emails that are NOT valid. However, I think it may also reject some that ARE valid, for example joe$@bob.com.
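One reason the lenient pattern accepts odd input is that it has no ^ or $ anchors, so IsMatch returns true as soon as any substring looks like an address. The sketch below prints the substring that actually matched and compares it with an anchored variant. The anchored variant, the class name LenientPatternDemo and the helpers MatchedPart and IsAnchoredMatch are assumptions of mine, not part of the validator's pattern.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LenientPatternDemo
{
    // The article's lenient pattern, unchanged (no anchors).
    private const string Lenient =
        @"\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*";

    private static readonly Regex Unanchored = new Regex(Lenient);

    // Assumption for this sketch: the same pattern wrapped in ^ and $.
    private static readonly Regex Anchored = new Regex("^" + Lenient + "$");

    // Returns the substring the unanchored pattern matched, or "" if none.
    public static string MatchedPart(string candidate)
    {
        Match m = Unanchored.Match(candidate);
        return m.Success ? m.Value : "";
    }

    // True only if the entire string matches the lenient pattern.
    public static bool IsAnchoredMatch(string candidate)
    {
        return Anchored.IsMatch(candidate);
    }

    public static void Main()
    {
        string[] samples =
        {
            ".joe@bob.com",
            "joe<>bob@bob.come",
            "o'reilly@there.com",
            "joe$@bob.com",
            "joe@home.com"
        };

        foreach (string sample in samples)
        {
            Console.WriteLine("{0,-20} matched: {1,-18} anchored: {2}",
                sample, MatchedPart(sample), IsAnchoredMatch(sample));
        }
    }
}
```

With the anchors in place, only "joe@home.com" from this list should still pass. That cuts both ways: anchoring stops the pattern from accepting "joe<>bob@bob.come" on the strength of "bob@bob.come", but it also rejects "o'reilly@there.com", which the strict pattern allows.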

In fact, an @ symbol is not even required for a serviceable email address if you're sticking to your local intranet.

So, really, when you're using a regular expression to validate an email address, you are trying to ensure that you're not going to get flaky, bizarre addresses which, while technically allowed, may be from malicious sources. After all, if you're a legitimate user, you're going to make sure your email address is standard and compatible with most systems.

I recently had trouble in a system with a customer who had a single quote in their email address, something like o'reilly@there.com. It's technically correct, but many systems won't allow it.

Output: Email Regex Samples

Email                 Pattern   Valid Email?
--------------------  --------  ------------
joe                   Lenient   Is NOT Valid
joe                   Strict    Is NOT Valid
joe@home              Lenient   Is NOT Valid
joe@home              Strict    Is NOT Valid
a@b.c                 Lenient   Is Valid
a@b.c                 Strict    Is NOT Valid
joe@home.com          Lenient   Is Valid
joe@home.com          Strict    Is Valid
joe.bob@home.com      Lenient   Is Valid
joe.bob@home.com      Strict    Is Valid
joe-bob[at]home.com   Lenient   Is NOT Valid
joe-bob[at]home.com   Strict    Is NOT Valid
joe@his.home.com      Lenient   Is Valid
joe@his.home.com      Strict    Is Valid
joe@his.home.place    Lenient   Is Valid
joe@his.home.place    Strict    Is Valid
joe@home.org          Lenient   Is Valid
joe@home.org          Strict    Is Valid
joe@joebob.name       Lenient   Is Valid
joe@joebob.name       Strict    Is Valid
joe.@bob.com          Lenient   Is NOT Valid
joe.@bob.com          Strict    Is NOT Valid
.joe@bob.com          Lenient   Is Valid
.joe@bob.com          Strict    Is NOT Valid
joe<>bob@bob.come     Lenient   Is Valid
joe<>bob@bob.come     Strict    Is NOT Valid
joe&bob@bob.com       Lenient   Is Valid
joe&bob@bob.com       Strict    Is Valid
~joe@bob.com          Lenient   Is Valid
~joe@bob.com          Strict    Is Valid
joe$@bob.com          Lenient   Is NOT Valid
joe$@bob.com          Strict    Is Valid
joe+bob@bob.com       Lenient   Is Valid
joe+bob@bob.com       Strict    Is Valid
o'reilly@there.com    Lenient   Is Valid
o'reilly@there.com    Strict    Is Valid

User Comments (9)

Posted 2007 Jun 19 09:10 AM. reply
Hello Steve,

In my organization we often need to extend email address validation and check that the mailbox actually exists.
The main concept behind the technique we used is available here:

http://www.codeproject.com/aspnet/Valid_Email_Addresses.asp

Also, a few commercial products out there attempt to solve this problem. Here are the ones we have found to be the best on the market:

aspnetmx - http://www.aspetmx.com
EmailVerify.NET - http://www.emailverify.net

Mark
Posted 2009 Jan 08 15:39 PM. reply
Thank you for your help in this. I will be using it for a client's website. Neat layout with the testing results. Thanks!

JOIII
Posted 2009 Jun 04 08:44 AM. reply
I modified your lenient regex with capture groups in case anybody is interested. NUnit test case follows...

[Test]
public void TestStandaloneRegex()
{
   string email = "joe.bob@home.com";
   string lenientPattern = @"(?<userinfo>\w+([-+.]\w+)*)@(?<host>\w+([-.]\w+)*\.\w+([-.]\w+)*)";
   Regex regex = new Regex(lenientPattern);
   Match m = regex.Match(email);
   Assert.IsTrue(m.Success, string.Format(
      "Expected a valid email address: '{0}'",
      email));
   if (!m.Success)
      throw new ApplicationException(string.Format(
         "String is not a valid email address: '{0}'",
         email));
   string userinfo = m.Groups["userinfo"].ToString();
   string host = m.Groups["host"].ToString();
   Assert.AreEqual("joe.bob", userinfo);
   Assert.AreEqual("home.com", host);
}

Joe Herr
Posted 2010 Jun 09 19:31 PM. reply
How do I edit the formula to catch a stray character at the end? For example:

joe@badtyping.co,

Bryan
Posted 2011 Jul 16 20:56 PM. reply
thank you.

ns
Posted 2011 Jul 31 20:18 PM. reply
I had a task to do a strict regex for email address and wrote my own. I then found yours and you helped me solve a few problems.

Mine ended up being a much shorter regular expression, and I was wondering if you had any unit tests you used, so I can see whether there are any issues with my shorter strict version.

http://www.rhyous.com/2010/06/15/regular-expressions-in-cincluding-a-new-comprehensive-email-pattern/

Jared
Posted 2012 Jan 07 15:06 PM. reply
Great BUT: The following valid emails do not validate.
much."more\ unusual"@example.com
very.unusual."@".unusual.com@example.com
very."(),:;<>[]".VERY."very@\\\ \"very".unusual@strange.example.com
The following invalid emails do validate
josé@example.com

Mike
Posted 2012 Feb 27 04:10 AM. reply
Mike, you have far too much time

Simon