Validating an Email Address with Regular Expressions

Back in Chapter 7, one of the tasks was validating an email address, and the script that did the job was relatively long. At its heart, Script 8.1 does exactly the same thing as Script 7.15, but by using regular expressions it takes far fewer lines and gives a more rigorous result. You'll find the simple HTML in Script 8.2, and the CSS is unchanged from Script 7.6.

Script 8.1. These few lines of JavaScript go a long way to validate email addresses.

window.onload = initForms;

function initForms() {
     for (var i=0; i< document.forms.length; i++) {
         document.forms[i].onsubmit = function() {return validForm();}
     }
}

function validForm() {
     var allGood = true;
     var allTags = document.getElementsByTagName ("*");

     for (var i=0; i<allTags.length; i++) {
        if (!validTag(allTags[i])) {
           allGood = false;
        }
     }
     return allGood;

     function validTag(thisTag) {
        var outClass = "";
        var allClasses = thisTag.className.split (" ");

        for (var j=0; j<allClasses.length; j++) {
           outClass += validBasedOnClass(allClasses[j]) + " ";
        }

        thisTag.className = outClass;

        if (outClass.indexOf("invalid") > -1) {
           invalidLabel(thisTag.parentNode);
           thisTag.focus();
           if (thisTag.nodeName == "INPUT") {
              thisTag.select();
           }
           return false;
        }
           return true;

           function validBasedOnClass(thisClass) {
              var classBack = "";

              switch(thisClass) {
                 case "":
                 case "invalid":
                    break;
                 case "email":
                    if (allGood && !validEmail (thisTag.value)) classBack = "invalid ";
                    // No break here: fall through so the class name itself is kept
                 default:
                    classBack += thisClass;
              }
              return classBack;
           }

           function validEmail(email) {
              var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

              return re.test(email);
           }

           function invalidLabel(parentTag) {
              if (parentTag.nodeName == "LABEL") {
                 parentTag.className += " invalid";
              }
          }
      }
}

Script 8.2. The HTML for the email validation example.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
     <title>Email Validation</title>
     <link rel="stylesheet" href="script01.css" />
     <script language="Javascript" type="text/javascript" src="script01.js">
     </script>
</head>
<body>
     <h2 align="center">Email Validation</h2>
     <form action="someAction.cgi">
         <p><label>Email Address:
         <input class="email" type="text" size="50" /></label></p>
         <p><input type="reset" />&nbsp;<input type="submit" value="Submit" /></p>
     </form>
</body>
</html>

Are You Freaking Out Yet?

If this is the first time that you've been exposed to regular expressions, chances are you're feeling a bit intimidated right about now. We've included this chapter here because it makes the most sense to use regular expressions to validate form entries. But the rest of the material in this book doesn't build on this chapter, so if you want to skip on to the next chapter until you've got a bit more scripting experience under your belt, we won't mind a bit.

Tips

This code doesn't match every possible legal variation of email addresses, just the ones that you're likely to want to allow a person to enter.
Note that in Script 8.1, after we assigned the value of re, we used re as an object in step 2. Like any other JavaScript variable, the result of a regular expression can be an object.
Compare the validEmail() functions in Scripts 7.15 and 8.1. The former has 27 lines of code; the latter, only four. They do the same thing, so you can see that the power of regular expressions can save you a lot of coding.
In the script above, someAction.cgi is just an example name for a CGI. It's literally "some action": any action that you want it to be. If you want to learn to write CGIs, we recommend Elizabeth Castro's book Perl and CGI for the World Wide Web, Second Edition: Visual QuickStart Guide.
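
To see what the first tip means in practice, here is a minimal sketch that runs the pattern from validEmail() against a handful of addresses. The sample addresses and the samples and results variables are made up for illustration; only the regular expression comes from Script 8.1.

```javascript
// Same pattern as validEmail() in Script 8.1.
var re = /^\w+([\.-]?\w+)*@\w+([\.-]?\w+)*(\.\w{2,3})+$/;

// Hypothetical sample addresses, chosen only for illustration.
var samples = [
    "jane.doe@example.com",     // ordinary address
    "a_b-c@sub.example.co.uk",  // several dot-separated parts
    "user@example",             // no top-level domain
    "user+tag@example.com",     // "+" is legal in email but is not matched by \w
    "user@example.info"         // four-letter ending fails \w{2,3}
];

// Record and print whether each sample passes the test.
var results = {};
for (var i = 0; i < samples.length; i++) {
    results[samples[i]] = re.test(samples[i]);
    console.log(samples[i] + " -> " + results[samples[i]]);
}
```

The last two samples are legal addresses that the pattern turns away, which is the trade-off the tip describes: the expression is tuned for the addresses you're likely to see, not for every address that could exist.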

You'll see in Table 8.1 that the special characters (sometimes called meta characters) in regular expressions are case-sensitive. Keep this in mind when debugging scripts that use regular expressions.

Table 8.1. Regular Expression Special Characters

Character   Matches
---------   -------
\           Toggles between literal and special characters; for example,
            "\w" means the special value of "\w" (see below) instead of
            the literal "w", but "\$" means to ignore the special value
            of "$" (see below) and use the "$" character instead
^           Beginning of a string
$           End of a string
*           Zero or more times
+           One or more times
?           Zero or one time
.           Any character except newline
\b          Word boundary
\B          Non-word boundary
\d          Any digit 0 through 9 (same as [0-9])
\D          Any non-digit
\f          Form feed
\n          New line
\r          Carriage return
\s          Any single white space character (same as [ \f\n\r\t\v])
\S          Any single non-white space character
\t          Tab
\v          Vertical tab
\w          Any letter, number, or the underscore (same as [a-zA-Z0-9_])
\W          Any character other than a letter, number, or underscore
\xnn        The ASCII character defined by the hexadecimal number nn
\onn        The ASCII character defined by the octal number nn
\cX         The control character X
[abcde]     A character set that matches any one of the enclosed characters
[^abcde]    A complemented or negated character set; one that does not
            match any of the enclosed characters
[a-e]       A character set that matches any one in the range of
            enclosed characters
[\b]        The literal backspace character (different from \b)
{n}         Exactly n occurrences of the previous character
{n,}        At least n occurrences of the previous character
{n,m}       Between n and m occurrences of the previous character
()          A grouping, which is also stored for later use
x|y         Either x or y
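
A few of the entries in Table 8.1 are easier to take in when you see them at work. The sketch below is not part of the chapter's scripts; the patterns, sample strings, and variable names are all invented for illustration.

```javascript
// Case matters: \d matches a digit, while \D matches anything that isn't one.
var hasDigit = /\d/.test("Room 101");         // at least one digit somewhere
var noDigits = /^\D+$/.test("Room 101");      // every character a non-digit?

// {n} and ? together: five digits, optionally followed by a dash and four more.
var zip = /^\d{5}(-\d{4})?$/;
var shortZip = zip.test("94110");
var longZip = zip.test("94110-1234");
var badZip = zip.test("9411");

// \b only matches at the edge of a word.
var wholeWord = /\bcat\b/.test("the cat sat");
var insideWord = /\bcat\b/.test("concatenate");

// Parentheses group and also store what they matched for later use.
var parts = /^(\w+)@(\w+)\.(\w{2,3})$/.exec("jane@example.com");

// x|y picks either alternative; \$ turns the special "$" back into a literal.
var yesOrNo = /^(yes|no)$/.test("no");
var price = /^\$\d+$/.test("$25");

console.log(hasDigit, noDigits, shortZip, longZip, badZip);
console.log(wholeWord, insideWord, parts[1], parts[2], parts[3]);
console.log(yesOrNo, price);
```

Here exec() returns an array whose first element is the whole match and whose later elements are the stored groups, so parts[1] holds whatever the first pair of parentheses matched.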

There are characters in regular expressions that modify other operators. We've listed them in Table 8.2.

Table 8.2. Regular Expression Modifiers

Modifier    Meaning
--------    -------
g           Search for all possible matches (globally), not just the first
i           Search without case-sensitivity
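
As a rough illustration of the two modifiers, the sketch below runs the same search with and without them. The sample text and variable names are made up for this example.

```javascript
// Hypothetical sample text with three spellings of the same word.
var text = "cat Cat cat";

// No modifier: only the first match is replaced, and case must match.
var firstOnly = text.replace(/cat/, "dog");

// g: every lowercase "cat" is replaced, but "Cat" is left alone.
var allLower = text.replace(/cat/g, "dog");

// g and i together: every match is replaced, whatever its case.
var allAnyCase = text.replace(/cat/gi, "dog");

// i on its own: a single test that ignores case.
var ignoresCase = /cat/i.test("CAT");

console.log(firstOnly);
console.log(allLower);
console.log(allAnyCase);
console.log(ignoresCase);
```

Modifiers go after the closing slash of the expression, and they can be combined, as in /cat/gi.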