[KLUG Members] Parsing bad e-mail

Adam Bultman members@kalamazoolinux.org
Wed, 7 Aug 2002 11:45:40 -0400 (EDT)


Okay, to answer my own post in a very poor fashion, and for your info:

http://www.ietf.org/rfc/rfc1893.txt


Looks like 4.x.x and 5.x.x mean failure (transient and permanent,
respectively), while 2.x.x means 'okay'.

My regular expression now looks for /^[45]\d\d\s/, which matches 400 and
500 level error codes at the start of a line. Maybe this will work, but
who knows.
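
A minimal Perl sketch of that check, for illustration only. The helper name
classify_line is hypothetical, and the sketch assumes the bounce text puts a
three-digit reply code (such as 550) at the very start of the line. The
dotted codes from RFC 1893 (such as 5.1.1) would not match this pattern and
would need one of their own, for example /\b[45]\.\d{1,3}\.\d{1,3}\b/.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a line by its leading three-digit reply code.
# Returns 'failure' for 4xx/5xx, 'ok' for 2xx, and undef for anything else.
sub classify_line {
    my ($line) = @_;
    return 'failure' if $line =~ /^[45]\d\d\s/;   # [45], not [4|5]: no literal pipe
    return 'ok'      if $line =~ /^2\d\d\s/;
    return;                                       # no leading reply code
}

for my $line ('550 User unknown', '250 OK', 'To: someone@example.org') {
    my $kind = classify_line($line);
    printf "%-28s => %s\n", $line, defined $kind ? $kind : 'no code';
}
```

Writing the class as [45] keeps the pattern from also accepting a literal
'|' as the first character.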

Anyway, if anyone has preexisting code, that would be really cool.

Oh, well.  I've matched about 90 percent of what I need, and I'm pretty
sure that any bad email address will have the error code in front of it as
well (matching To: was a little too all-inclusive).
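
One way this could look in Perl, as a rough sketch rather than a tested
bounce parser. The helper name bad_addresses and the sample lines are made
up for illustration, and the address pattern is a deliberately loose
assumption, not a full address grammar. It only looks at lines that start
with a 4xx or 5xx code, skips postmaster, and records each address once.

```perl
use strict;
use warnings;

# Hypothetical helper: collect unique addresses from lines that begin with
# a 4xx or 5xx reply code. The address regex is a loose approximation.
sub bad_addresses {
    my (@lines) = @_;
    my (%seen, @found);
    for my $line (@lines) {
        next unless $line =~ /^[45]\d\d\s/;          # only failure lines
        while ($line =~ /([\w.+-]+\@[\w-]+(?:\.[\w-]+)+)/g) {
            my $addr = lc $1;
            next if $addr =~ /^postmaster\@/;        # not interested in postmaster
            push @found, $addr unless $seen{$addr}++;
        }
    }
    return @found;
}

# Made-up sample lines, not taken from a real bounce message.
my @sample = (
    '550 <nobody@example.com>... User unknown',
    'To: someone@example.org',
    '554 postmaster@example.com: rejected',
    '450 nobody@example.com mailbox busy',
);
print "$_\n" for bad_addresses(@sample);   # prints nobody@example.com
```

To run it over a saved bounce, the lines could be read from a file handle
and passed in the same way.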



On Wed, 7 Aug 2002, Adam Bultman wrote:

> Greetings, everyone.  I'm writing a perl script that needs to parse
> returned email for the address it was originally sent to.
>
> So, I'm trying to write a perl parser that will run through the file, line
> by line, and find and write down any email addresses it finds (it will be
> coming from one host, obviously).
>
> So far, I've found I can get addresses this way:
>
> /550\s{0,}(\w.*)\.\w.*/  -> Searches for the 550 error
>
> /^To:\s{0,}(\w.*)/  -> Searches for the To: stuff, after which I strip out
> the sender, and then, like, 'postmaster', since I really don't care about
> that.
>
>
> Anyway, what exactly are the conventions for this?  I'm not aware of all
> the error codes for returned mail, although now that I think of it, I
> should search harder than I did.  Is there a catchall for returned mail?
> I just need to catch, and write down, all the bad addresses that land in
> my box here.
>
> (I searched CPAN and perlmonks.org for a mail parser, but I didn't find
> anything that suited me just right).
>
>
>
>

-- 
Adam Bultman
adam@glaven.org
[ http://www.glaven.org ]