[KLUG Members] Best way to use temp files in PHP

Buist Justin members@kalamazoolinux.org
Mon, 9 Sep 2002 10:19:16 -0400


> -----Original Message-----
> From: adam@morrison-ind.com [mailto:adam@morrison-ind.com]
> Sent: Monday, September 09, 2002 9:56 AM
> To: members@kalamazoolinux.org
> Subject: [KLUG Members] Best way to use temp files in PHP
> 
> 
> I have a PHP object, columnar report.  This lets me define 
> columns (width, data
> types, justification, etc...) and then go about pumping it 
> full of rows.  Once
> it is full I can do things like sort by a column, and then 
> finally dump it to
> some kind of output.  This works great.
> 
> But...
> 
> Internally it holds all this information in an array.  So I've 
> got a report with
> 15 columns of various types, and a heck of a lot of rows.  At 
> about ~40,000 rows
> I hear "Ahhhhhhhhhhhhhhhhhhhhhhhhhhhh!" as a temporal rift 
> occurs to some hell
> dimension and my PHP process topples into it.  So I'm 
> thinking I just can't have
> arrays that big.

40,000 rows * 15 columns * 256 bytes per column (an arbitrary figure, but it seems fair) / 1,048,576 =~ 146 MB.

That's really not -too- much data... and 256 bytes per column is pretty high.  I figure some will be 4k text fields, but others will be 4-byte numeric fields.
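The back-of-the-envelope figure above can be checked in PHP itself.  Worth noting: PHP arrays carry per-element overhead on top of the raw payload, so the real footprint comes out larger than 146 MB.  A minimal sketch, using the assumed row/column/byte figures from above:

```php
<?php
// Assumed figures from the estimate above -- not measured from the real report.
$rows  = 40000;
$cols  = 15;
$bytes = 256;                                    // arbitrary average payload per cell
$payload = $rows * $cols * $bytes / 1048576;     // raw data: ~146 MB

// memory_get_usage() shows what one row actually costs as a PHP array.
// Each cell gets a distinct string so copy-on-write sharing doesn't
// hide the per-element cost.
$before = memory_get_usage();
$row = array();
for ($i = 0; $i < $cols; $i++) {
    $row[$i] = str_repeat(chr(65 + $i), $bytes);
}
$after = memory_get_usage();

printf("Raw payload: ~%.0f MB; one %d-column row of raw size %d bytes costs %d bytes as a PHP array\n",
       $payload, $cols, $cols * $bytes, $after - $before);
```

Multiply that per-row cost by 40,000 and you can see why the process falls over well before the raw 146 MB figure suggests it should.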

> So maybe the answer is I should write all this to a temporary 
> file.  But I'd
> like to still maintain the `intelligence' of an array with 
> its self described
> structure and sort-ability.  If anyone has any tips on the 
> best way to go about
> this I'm all ears (actually, they are still ringing from 
> PHP's cries of agony).

Well, since you have to manipulate the data, you're going to have to pull it back into RAM at some point anyway.  So now you've got 100MB+ of data on your disk in a home-cooked format, 100MB of it in RAM, and if your box swaps... well, you've got 100MB of it sitting in swap space instead of RAM.

Rather than come up with a home-cooked scheme for storing temp data on the disk I'd leave it to the OS to properly swap stuff out to disk when necessary.  If swappage gets too high, add more RAM.
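On the "just let it use RAM" point, the usual thing killing a PHP script at array sizes like this is PHP's own memory_limit setting, not the OS.  A sketch of raising it at runtime (assuming the build has memory-limit support compiled in, and that 256M is a big enough cap for the report):

```php
<?php
// PHP aborts a script that allocates past memory_limit (the php.ini
// default is often small, e.g. 8M).  Raising the cap hands the paging
// decision back to the OS; setting it to -1 removes the cap entirely.
ini_set('memory_limit', '256M');
echo ini_get('memory_limit'), "\n";   // confirm the new value took effect
```

The same setting can go in php.ini or a per-directory configuration instead, which avoids scattering ini_set() calls through the code.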

The other thing is you might want to think about a different architecture.  Rather than querying all the data out, storing it in temporary arrays, and re-sorting it over and over again, you could just re-query the DB and let it return the data in the proper order this time... this all depends on the complexity of the query, though.  It's generally the way I go, as I figure the sort routines in the DB are going to be faster and better tested than anything I write.
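That re-query approach might look like the sketch below.  It uses PDO with an in-memory SQLite table purely for illustration (the table, its columns, and the 'sort' parameter are all hypothetical); the two points that matter are the ORDER BY pushed down to the database, and the whitelist that keeps user input out of the SQL string:

```php
<?php
// Illustrative stand-in for the real report table.
$db = new PDO('sqlite::memory:');
$db->exec('CREATE TABLE report (name TEXT, amount INTEGER)');
$db->exec("INSERT INTO report VALUES ('widgets', 30), ('gadgets', 10)");

// Whitelist the sortable columns so a user-supplied sort key can never
// reach the SQL directly.
$sortable = array('name', 'amount');
$col = (isset($_GET['sort']) && in_array($_GET['sort'], $sortable))
     ? $_GET['sort'] : 'name';

// The database does the sorting; PHP just streams the rows out.
$stmt = $db->query("SELECT name, amount FROM report ORDER BY $col");
foreach ($stmt as $row) {
    printf("%-10s %d\n", $row['name'], $row['amount']);  // rows arrive already sorted
}
```

Re-sorting then becomes another cheap round trip with a different ORDER BY, instead of holding the whole result set in a PHP array.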

Justin Buist