[KLUG Members] Parsing a large file

Tony Gettig members@kalamazoolinux.org
Fri, 6 Feb 2004 15:34:25 -0500


Quoting Andrew Eidson <aeidson@meglink.com>:

> Well.. to lessen the burden it is a simple parse Column1, 2, 3 to file X
> then parse column1, 20,30 to file Y.. the program I am going to import these
> files into will not handle the file as a whole..
> 
> I guess I will have to see about a baby sitter so I can come this Tuesday
> (who knows, maybe I will get an extension on when it is supposed to be done).
> Though Perl may be over my head, since Python is the first language I am
> learning, so I am a complete newbie to programming.

I will echo Dirk's plug for Perl. A couple of years ago I had hoped to
accomplish a similar task with commands bundled into a shell script, but Perl
ended up being the best solution, and it runs on just about anything. I later
rewrote it in C for speed and other reasons, but Perl did the trick for quite a
while. Just be sure you use lots of comments so that when you need to make
changes in the future, you will have given yourself some clues as to what all
that code means! :)
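
Since you already know a little Python, here is a rough sketch of the kind of
split you describe (columns 1, 2, 3 to one file and columns 1, 20, 30 to
another). It is only a starting point, not a tested solution for your data: the
function name split_columns and the file names in the usage note are made up
for illustration, and it assumes plain tab-delimited text with no quoted fields
or tabs inside a field.

```python
def split_columns(src_path, outputs, delimiter="\t"):
    """Copy selected columns of a delimited text file into smaller files.

    outputs maps an output path to a list of 1-based column numbers.
    Assumption: fields never contain the delimiter or embedded newlines.
    """
    # Open every output file up front so the source is read only once.
    handles = {path: open(path, "w") for path in outputs}
    try:
        with open(src_path) as src:
            for line in src:
                # Drop the line ending, then break the row into fields.
                fields = line.rstrip("\r\n").split(delimiter)
                for path, cols in outputs.items():
                    # A row that is too short gets an empty string
                    # for the missing column instead of an error.
                    picked = [fields[c - 1] if c <= len(fields) else ""
                              for c in cols]
                    handles[path].write(delimiter.join(picked) + "\n")
    finally:
        # Close the outputs even if something goes wrong part way through.
        for fh in handles.values():
            fh.close()
```

You would call it with something like
split_columns("big.txt", {"file_x.txt": [1, 2, 3], "file_y.txt": [1, 20, 30]}),
swapping in your real file names and column numbers.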

Also, you may want to sign up for the KLUG Programming list. It's a great place
to get programming help for things such as you describe. 

http://kalamazoolinux.org/listserv/listserv.php3


Tony Gettig



> 
> -----Original Message-----
> From: members-admin@kalamazoolinux.org
> [mailto:members-admin@kalamazoolinux.org]On Behalf Of Dirk H Bartley
> Sent: Friday, February 06, 2004 3:04 PM
> To: members@kalamazoolinux.org
> Subject: RE: [KLUG Members] Parsing a large file
> 
> 
> There could not be a better opener for a shameless plug.  I am doing a
> presentation on exactly this on Tuesday.
> 
> Perl, Perl, Perl
> 
> I understand that it could be done with Python, but ... I have used
> Perl to parse very large files: part, lot, customer, and vendor files,
> converting one business system's "field1","field2","field3" data into
> tab-separated text that could be fed into another business system.
> 
> Dirk
> 
> On Fri, 2004-02-06 at 14:32, Andrew Eidson wrote:
> > Hmm.. well guess I am up for some heavy reading then.. I have never used
> > awk.. and since this has to be done by monday I will be reading and
> > working through the weekend..
> 
> 
> Did you say Monday, as in 3 days from now?  I spent a huge amount of time
> writing the scripts I wrote, at least 2 months.  There were a HUGE
> number of contingencies and requirements.  If all you want to do is convert
> from one delimiter to another, you could do that in an hour or two.  If
> your task is anything more than trivial, talk to the individual who is
> requiring this be done by Monday and try not to laugh in his
> proverbial face.  Most things require extensive testing if they involve
> business-critical applications.
> 
> 
> > Could this be done in Python as easily or not.. I at least know a
> > little Python, so it may be quicker for me to get to the end result.
> >
> > -----Original Message-----
> > From: members-admin@kalamazoolinux.org
> > [mailto:members-admin@kalamazoolinux.org]On Behalf Of Adam Williams
> > Sent: Friday, February 06, 2004 2:08 PM
> > To: members@kalamazoolinux.org
> > Subject: Re: [KLUG Members] Parsing a large file
> >
> >
> > >I am trying to parse a rather cumbersome file (355 columns, over 1000
> > >rows, tab delimited). I have tried importing it into MSSQL but keep
> > >getting an error.. so does anyone know of any scripts that may parse
> > >this file into 2 or even 3 separate files??
> >
> > You should be able to take it apart with something like awk very easily.
> >
> 
> 
> _______________________________________________
> Members mailing list
> Members@kalamazoolinux.org
> 
> 


-- 
Tony Gettig
Voiceovers, PGP key, and more at
http://gettig.net