[KLUG Members] Standard Regular Expressions (REs)

Adam Williams members@kalamazoolinux.org
Mon, 24 Mar 2003 09:51:30 -0500 (EST)


>social security number
>\d{3}-\d{2}-\d{4}
>phone number
>\(\d{3}\) (\d{3}-\d{4})
>e-mail
>[\w\.\-]+@[\w\.\-]+

Your e-mail regex is too restrictive (at least as I'm interpreting it).  
E-mail addresses may contain a plus sign in the part to the left of the "@".

adam+presentations.ldavp3@morrison-ind.com as an example.
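
A quick way to check this, as a rough sketch in Python.  The first pattern 
is the quoted one; the widened pattern and the names QUOTED, WIDENED and 
accepts() are only my illustration, not a complete address validator.

```python
import re

# Pattern as quoted above: the character class has no "+".
QUOTED = re.compile(r"[\w\.\-]+@[\w\.\-]+")
# Hypothetical widened pattern: "+" allowed to the left of the "@" only.
WIDENED = re.compile(r"[\w.+\-]+@[\w.\-]+")

ADDRESS = "adam+presentations.ldavp3@morrison-ind.com"


def accepts(pattern, address):
    """Return True only if the pattern matches the whole address."""
    return pattern.fullmatch(address) is not None


if __name__ == "__main__":
    print("quoted: ", accepts(QUOTED, ADDRESS))
    print("widened:", accepts(WIDENED, ADDRESS))
```

Note that an unanchored search() with the quoted pattern does not fail 
outright; it quietly matches only the text after the plus sign, which is 
arguably worse than a clean rejection.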

>But notice, one of these messy characters, [~!@#$%^&*()_+=],

Are underscores permitted to the left of the "@"?  Recent testing has 
revealed that most MTAs on the internet seem to barf on them.  

To the right of the "@" can be essentially ANY BINARY VALUE.  The days of 
ASCII-only DNS are gone (see RFC 2181).  By mid-2004 at the latest there 
will be more eastern internet users than there will be in the west, so it 
is better to be prepared for this.
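
As a small illustration of what "prepared" might mean for the domain part, 
here is a sketch in Python 3, where \w matches Unicode word characters 
unless the ASCII flag is set.  The host name and the helper domain_ok() are 
made up for the example, and this only tests the character class, not DNS 
validity.

```python
import re

# Domain-side character class from the quoted e-mail pattern.
DOMAIN_CHARS = r"[\w.\-]+"
# Hypothetical non-ASCII host name, for illustration only.
HOST = "bücher.example"


def domain_ok(host, ascii_only=False):
    """Check the host against the class, optionally restricted to ASCII."""
    flags = re.ASCII if ascii_only else 0
    return re.fullmatch(DOMAIN_CHARS, host, flags) is not None


if __name__ == "__main__":
    print("unicode-aware:", domain_ok(HOST))
    print("ascii-only:   ", domain_ok(HOST, ascii_only=True))
```

Whether \w covers non-ASCII characters depends on the regex engine and its 
flags, so it is worth testing in whatever tool you actually use.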

>may also be "allowed" in an e-mail address, so the REs just listed could 
>be better.
>P.S.  What are the "less frightening" differences between REs in Java 
>and Perl?

Regex syntax should be the same everywhere.