Modified/added: removed some functions, added support for CFWS token, corrected FWSP token, added some boolean flags, added getInternetAddress and extractHeaderAddresses and other methods, some optimization.
Where Mr. Hazlewood's version was aimed mainly at enforcing certain address forms passed in during registrations and the like, this version handles more kinds of verification, as well as a few ways of extracting the data in predictable, cleaned-up chunks.
Note: CFWS is the "comments and folding whitespace" token from RFC 2822; in other words, whitespace and comment text that is enclosed in parentheses.
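To make the CFWS idea concrete, here is a minimal sketch that drops flat (non-nested) parenthesized comments from an address string while leaving quoted strings untouched. It is not the code this class uses; the class name CfwsSketch and the method stripComments are hypothetical, chosen only for illustration, and the sketch does no validation at all.

```java
// Hypothetical illustration only: not part of the class documented here.
final class CfwsSketch {

    private CfwsSketch() {
    }

    /**
     * Removes flat (non-nested) parenthesized comments that appear outside
     * quoted strings and trims the result. Whitespace between tokens is kept.
     * A backslash escapes the character after it, so an escaped quote or
     * parenthesis is never treated as a delimiter.
     */
    static String stripComments(String input) {
        StringBuilder out = new StringBuilder();
        boolean inQuotes = false;
        boolean inComment = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\' && i + 1 < input.length()) {
                // Escaped pair: copy both characters unless inside a comment.
                if (!inComment) {
                    out.append(c).append(input.charAt(i + 1));
                }
                i++;
                continue;
            }
            if (inComment) {
                // Drop comment text; the first ')' closes it (no nesting).
                if (c == ')') {
                    inComment = false;
                }
                continue;
            }
            if (inQuotes) {
                // Inside a quoted string, parentheses are ordinary text.
                if (c == '"') {
                    inQuotes = false;
                }
                out.append(c);
                continue;
            }
            if (c == '"') {
                inQuotes = true;
                out.append(c);
            } else if (c == '(') {
                inComment = true;
            } else {
                out.append(c);
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(stripComments("bob (comment) @example.com (personal name)"));
        System.out.println(stripComments("\"bob(the man)smith\" (hi) @ (there) example.com"));
    }
}
```

Because the first closing parenthesis ends a comment, a nested comment such as (a (b) c) would leave stray text behind, which is the kind of input a flat-comment approach cannot handle.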
Limitations: doesn't support nested CFWS (comments within other comments); doesn't support mailbox groups except when flat-extracting addresses from headers or when doing verification; doesn't support any of the obs-* tokens. Also: the getInternetAddress and extractHeaderAddresses methods return InternetAddress objects. If the personal name contains any quotes or backslashes, the InternetAddress object will always escape the whole name and put it in quotes, so multiple-token personal names with those characters somewhere in them will always be munged into one big escaped string. This is not really a big deal, but I mention it anyway. (And you could get around it with a simple modification to those methods so that they don't use InternetAddress objects.) See the docs of those methods for more info.
Note: This does not do any header length checking. The email address grammar in RFC 2822 has no length limits of its own, though email headers in general do have length restrictions. So if the return path is 40000 unfolded characters long, but otherwise valid under RFC 2822, this class will pass it.
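If length matters to the caller, it can be checked separately before validation. The sketch below assumes the limit of interest is the RFC 2822 rule that no line may exceed 998 characters, excluding the CRLF; the names HeaderLengthSketch and withinLineLimit are hypothetical and are not part of this class.

```java
// Hypothetical caller-side check: not part of the class documented here.
final class HeaderLengthSketch {

    /** RFC 2822 section 2.1.1: a line must not exceed 998 characters, excluding CRLF. */
    static final int MAX_LINE_LENGTH = 998;

    private HeaderLengthSketch() {
    }

    /**
     * Returns true if every CRLF-separated line of the (possibly folded)
     * header value is at most MAX_LINE_LENGTH characters long.
     */
    static boolean withinLineLimit(String header) {
        // Limit -1 keeps trailing empty strings; they are harmless here.
        for (String line : header.split("\r\n", -1)) {
            if (line.length() > MAX_LINE_LENGTH) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(withinLineLimit("Return-Path: <bob@example.com>"));
    }
}
```

A caller could run this check on the raw header first and pass the value to the validator only if it succeeds.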
Examples of passing (2822-valid) addresses, believe it or not:
bob @example.com
"bob" @ example.com
bob (comment) (other comment) @example.com (personal name)
"<bob \" (here) " < (hi there) "bob(the man)smith" (hi) @ (there) example.com (hello) > (again)
(none of which are permitted by JavaMail, incidentally)
By using getInternetAddress(), you can retrieve an InternetAddress object that, when toString()'ed, would reveal that the parser had converted the above into:
<bob@example.com>
<bob@example.com>
"personal name" <bob@example.com>
"<bob \" (here)" <"bob(the man)smith"@example.com>
(respectively)
If parsing headers, however, you'll probably be calling extractHeaderAddresses().
A future improvement may be to use this class to extract info from corrupted addresses, but for now, it does not permit them.
Some of the configuration booleans allow a bit of tweaking already. The source code can be compiled with these booleans in various states. They are configured to what is probably the most commonly useful state.
@author Les Hazlewood, Casey Connor, Igor Spasic