
Introduction
Regular expressions are very useful for input validation. They are not a new technology; they originated in the UNIX environment and are commonly used with the Perl language. They are, however, also supported by a number of .NET classes in the namespace System.Text.RegularExpressions.
Their rules are the same as those of finite automata. Information about the main special characters and escape sequences that you can use is available on MSDN.
Regular Expressions for Email Checking
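The basic things to understand in RegEx are:
- '*' matches zero or more occurrences of the preceding element.
- '+' matches one or more occurrences of the preceding element.
- '?' matches zero or one occurrence of the preceding element, i.e., it makes that element optional.
- '^' anchors the match at the start of the input ('$' anchors it at the end); as the first character inside '[]', '^' negates the character class.
- '[]' defines a character class, which matches any single character from the listed set or range (for example, [a-z]).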
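As a rough illustration of these constructs, here is a minimal console sketch. The RegexBasics class, the FullMatch helper, and the sample strings are made up for this example and are not part of the article's code; the helper simply wraps a pattern in ^ and $ so that the whole input has to match.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexBasics
{
    // Hypothetical helper: true only when the whole input matches the pattern.
    public static bool FullMatch(string input, string pattern)
    {
        return Regex.IsMatch(input, "^(?:" + pattern + ")$");
    }

    static void Main()
    {
        Console.WriteLine(FullMatch("", "a*"));            // True: '*' allows zero occurrences
        Console.WriteLine(FullMatch("aaa", "a*"));         // True: ... or many
        Console.WriteLine(FullMatch("", "a+"));            // False: '+' needs at least one
        Console.WriteLine(FullMatch("color", "colou?r"));  // True: '?' makes 'u' optional
        Console.WriteLine(FullMatch("colour", "colou?r")); // True
        Console.WriteLine(FullMatch("x7", "[a-z][0-9]"));  // True: one letter, then one digit
        Console.WriteLine(FullMatch("7x", "[a-z][0-9]"));  // False: wrong order
        Console.WriteLine(FullMatch("7", "[^a-z]"));       // True: '^' inside '[]' negates the class
    }
}
```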
The rules for validating email IDs are listed here, followed by some valid and invalid examples:
- An email address must start with a letter. Any number of letters, digits, or underscores (_) can follow, and only a single dot (.) is allowed; other symbols and white space are not allowed.
- The name field of the address must end with either a letter or a digit.
- If an underscore or a dot is used, it must be preceded by a letter or a digit for the name to be valid.
- A dot can be used only once, but an underscore can be used multiple times.
Some examples:
miltoncse00@yahoo.com valid
2milton00@yahoo.com invalid (starts with a digit)
milton cse00@yahoo.com invalid (white space)
milton_cse@yahoo.com valid
milton_cse_00@yahoo.com valid
milton_cse_00_@yahoo.com invalid (_ before @)
milton.case_00_00@yahoo.com valid
milton.cas.e_00_00@yahoo.com invalid (double dot)
milton.cas.e_00_00@yahoo.co.in invalid (double dot in the name; the domain yahoo.co.in itself is acceptable)
milton.cas.e_00_00.@yahoo.co.in invalid (dot before @)
In miltoncse00@yahoo.com, miltoncse00 is the name portion.
According to these rules and valid examples, we can draw a state diagram for valid name checking of email addresses:

Fig: state diagram for the naming portion
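Since the figure itself is not reproduced here, the following is a minimal hand-written sketch of one automaton that is consistent with the rules above. The NameAutomaton class, its state names, and the choice to accept both lowercase and uppercase ASCII letters are assumptions made for illustration; the author's original diagram may be laid out differently.

```csharp
using System;

static class NameAutomaton
{
    // Hypothetical states for the name portion (the part before '@').
    enum State { Start, Word, AfterUnderscore, AfterDot, WordAfterDot, UnderscoreAfterDot, Reject }

    static bool IsLetter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
    static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    public static bool IsValidName(string name)
    {
        State state = State.Start;
        foreach (char c in name)
        {
            bool alnum = IsLetter(c) || IsDigit(c);
            switch (state)
            {
                case State.Start:               // first character must be a letter
                    state = IsLetter(c) ? State.Word : State.Reject;
                    break;
                case State.Word:                // letters/digits before any dot
                    if (alnum) state = State.Word;
                    else if (c == '_') state = State.AfterUnderscore;
                    else if (c == '.') state = State.AfterDot;
                    else state = State.Reject;
                    break;
                case State.AfterUnderscore:     // '_' must be followed by a letter or digit
                    state = alnum ? State.Word : State.Reject;
                    break;
                case State.AfterDot:            // '.' must be followed by a letter or digit
                    state = alnum ? State.WordAfterDot : State.Reject;
                    break;
                case State.WordAfterDot:        // after the single dot, no second dot is allowed
                    if (alnum) state = State.WordAfterDot;
                    else if (c == '_') state = State.UnderscoreAfterDot;
                    else state = State.Reject;
                    break;
                case State.UnderscoreAfterDot:
                    state = alnum ? State.WordAfterDot : State.Reject;
                    break;
                default:                        // Reject is a dead state
                    return false;
            }
        }
        // Only states that end on a letter or digit are accepting.
        return state == State.Word || state == State.WordAfterDot;
    }

    static void Main()
    {
        Console.WriteLine(IsValidName("milton_cse_00"));       // True
        Console.WriteLine(IsValidName("milton.case_00_00"));   // True
        Console.WriteLine(IsValidName("milton_cse_00_"));      // False: ends with '_'
        Console.WriteLine(IsValidName("milton.cas.e_00_00"));  // False: two dots
    }
}
```

Each accepting path through these states corresponds to one way of matching a regular expression, which is why the diagram can be translated into a pattern.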
From the state diagram, the regular expression for the naming part is:
[a-z][a-z0-9]*([_][a-z0-9]+)*([.][a-z0-9]+([_][a-z0-9]+)*)?
The domain name portion (after @ and before the first dot) must start with a letter. Any number of letters or digits can follow, and other symbols are not allowed.
So the regular expression for that part is:
[a-z][a-z0-9]*
The portion after the dot (.), like .com or .net, must start with a letter, and any number of letters or digits can follow. If another dot portion is allowed, it follows the same rule.
So the regular expression for that part is:
([a-z][a-z0-9]*(\.[a-z][a-z0-9]*)?)
Combining all these regular expressions, the regular expression for email checking that satisfies the Yahoo! email rules will be:
^[a-z][a-z0-9]*([_][a-z0-9]+)*([.][a-z0-9]+([_][a-z0-9]+)*)?
@[a-z][a-z0-9]*
\.([a-z][a-z0-9]*(\.[a-z][a-z0-9]*)?)$
The C# code that performs this matching is very simple, as illustrated below:
// Requires: using System.Text.RegularExpressions;
string pattern = @"^[a-z][a-z0-9]*([_][a-z0-9]+)*([.][a-z0-9]+([_][a-z0-9]+)*)?" +
    @"@[a-z][a-z0-9]*" +
    @"\.([a-z][a-z0-9]*(\.[a-z][a-z0-9]*)?)$";
// IgnoreCase lets the lowercase ranges match uppercase letters as well.
Match match = Regex.Match(txtEmail.Text.Trim(), pattern, RegexOptions.IgnoreCase);
if (match.Success)
    MessageBox.Show("Success");
else
    MessageBox.Show("Fail");
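To try the pattern without a form, a console sketch along the following lines can run it against the example addresses listed earlier. The EmailCheck class and IsValid method are names invented for this sketch, and the character classes are written as [a-z0-9] on the assumption that only letters and digits are meant to be accepted.

```csharp
using System;
using System.Text.RegularExpressions;

static class EmailCheck
{
    // The article's three parts: name, domain name, and dot portion(s).
    const string Pattern =
        @"^[a-z][a-z0-9]*([_][a-z0-9]+)*([.][a-z0-9]+([_][a-z0-9]+)*)?" +
        @"@[a-z][a-z0-9]*" +
        @"\.([a-z][a-z0-9]*(\.[a-z][a-z0-9]*)?)$";

    public static bool IsValid(string email)
    {
        // Trim only removes leading and trailing white space, as in the article's code.
        return Regex.IsMatch(email.Trim(), Pattern, RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        string[] samples =
        {
            "miltoncse00@yahoo.com",
            "2milton00@yahoo.com",
            "milton cse00@yahoo.com",
            "milton_cse@yahoo.com",
            "milton_cse_00@yahoo.com",
            "milton_cse_00_@yahoo.com",
            "milton.case_00_00@yahoo.com",
            "milton.cas.e_00_00@yahoo.com",
            "milton.cas.e_00_00.@yahoo.co.in"
        };

        foreach (string s in samples)
        {
            // Left-align the address in a 34-character column, then print the verdict.
            Console.WriteLine("{0,-34} {1}", s, IsValid(s) ? "valid" : "invalid");
        }
    }
}
```

Keeping the check in a small static method like this also makes it easy to call from a form's event handler instead of repeating the pattern there.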
So, we conclude that validation problems that involve repetition, optional parts, and restrictions are easier to solve with regular expressions than in other ways (such as if-else if-else chains or while loops). Such a problem can be represented as a state diagram, which is much easier and more efficient to express and use.
My next article will be on auto ID generation for any table using stored procedures.
 |
Hello, your article is too weak for validating an email ID. Here I am showing an example of how to validate an email ID. You can verify an email ID here:
Verify email id
This example shows how you can validate an email ID in the real world. This is just an example, not a tutorial. I will write an article about it later.
Thank you all.
 |
Yes friend, I thought so too. I could not even validate my own valid ID. I do not even use the regular expression that I have given here; I use the RFC 2822 standard. In this article I just wanted to show how we can make a regular expression from a finite automaton. I made it simple and specific to the Yahoo ID rules. These rules are not for Hotmail or Gmail, so how could I use this as a general standard?
Thanks for your comment.
 |
The above regular expression does not restrict the length of the parts after '@'. For example, it accepts the following email ID, though it is invalid: dsgdfs.v@sd.comdgdfgdf.dxzfgdsgdfghh
 |
I do not even use the regular expression that I have given here; I use the RFC 2822 standard. In this article I just wanted to show how we can make a regular expression from a finite automaton. I made it simple and specific to the Yahoo ID rules. These rules are not for Hotmail or Gmail, so how could I use this as a general standard?
 |
Hi,
Nice article. But is there any way to check whether the email ID exists or not (not syntactically)? E.g., Speak_to_devil@yahoo.com may be an invalid address (i.e., not present on the Yahoo server) but syntactically perfectly valid.
"Every morning I go through Forbes list of 40 richest people in the world. If my name is not in there, I go to work..!!!"
 |
Hi, thanks for your comment. Here I have provided only validation, not verification. I only emphasize valid Yahoo email ID input. Checking whether a valid ID exists, and what to do if it does not, is a matter of verification.
"I will know better today than I did yesterday"
 |
Hello Milton, it is funny that you checked only regular expression validation. Another validation is required if you really want to validate an email ID: domain name validation.
According to you, if I use house@b.com, then the address is valid.
But what if no b.com domain exists in reality?
That is why you must check the domain name. Check iana.com for a list of all domains.
Or you can get the IP address of the domain and ping it.
Not a good article...
 |
You can try with this string pattern:
string pattern = @"^(([^<>()[\]\\.,;:\s@\""]+" + @"(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@" + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}" + @"\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+" + @"[a-zA-Z]{2,}))$";
I'm sure that this code works correctly!!
 |
Although your ideas work for the general cases, your regular expression does not cover the RFC spec fully. Linked below is one site of many that publishes a regular expression for email address validation. As you can see, it is not simple by any means.
http://ex-parrot.com/~pdw/Mail-RFC822-Address.html
Nice idea, but I just want to steer people to this just in case you want 100% conformance.
 |
I disagree; the RFC allows emails which would be invalid if submitted to a mailing list. For instance, if I were sending an email to another machine on my network, I would need only specify the machine's name: "someone@somebox". While this will work when I send something, it will not work if I want to sign up for a CodeProject account.
Saying that this article is "too simple" is too simple in itself. Perhaps a further explanation of the RFC and its corresponding pros and cons should be included. The regular expression derived in this article will cover a good 99.9% of email addresses in use today, so perhaps it is a more logical implementation than the monolithic one you have pasted.
-- modified at 21:28 Thursday 27th April, 2006
 |
I think maybe somewhere in the middle might be more appropriate. Nothing irritates me more than when sites won't allow the "+" symbol I can use with my Gmail account, even though it's technically valid.
 |
I can't stand it when they only accommodate 3 characters after the dot.
Not everything is a .com or .org or .net. Some people use other extensions: .info, .tv, .cc, whatever.