[KLUG Members] ext3 size limits

members@kalamazoolinux.org members@kalamazoolinux.org
Fri, 4 Oct 2002 10:28:39 -0400


>>>Would I have a problem using a 1TB external SCSI RAID array under Linux 
>>>with an EXT3 filesystem?
>>With 2.4.x you have a maximum *FILE* size of ~16 terabytes.  
>>Maximum *FILESYSTEM* size with ext2/3 without tweaking anything is ~4
>>terabytes.  This assumes a 4k block size, and on Intel 32-bit I think
>>you are limited to page size no larger than PAGE_CACHE_SIZE, which is 4k
>>on most processors.  Still, your 1TB is well within the 4TB limit.
>>*BUT* what are you going to be storing in this filesystem?  If it is
>Well, we have a 590GB drive now. There are two major data streams we 
>archive and process. One is weather products from NOAA (aviation weather, 
>radar images, satellite images), which range in size from 100K to 10s of 
>megs, on the order of 5G per day. 
>The second stream is ASDI data from the air traffic control NAS computers 
>on the order of a 400M file each day.
>We also do extensive post processing of this data to produce files on the 
>order of 120M. 
>In theory I could make a couple partitions, one for the larger ASDI files 
>and one for the smaller weather data. I was hoping to not have to draw the 
>line on how much to allocate to each.

Use LVM; then you can draw a wiggly, easily movable line.
 
>Does this help paint a picture? Can someone please help make some tough 
>decisions to plan this. I've never worked with partitions this large 
>before and I would hate to fill the filesystem before the drive is full.
>How many inodes should I figure for this size? 1 for every 50M??

For the filesystem with large files I think the default should be fine.  For 
the one with the radar images (small files) I'd do a little math and figure out 
what you think the max would be: ((10 files a day * 365 days a year * n years) 
+ 15%), or something like that.
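
As a rough sketch of that arithmetic in Python (purely illustrative; the
function name and the input numbers are made up for the example and are not
measurements from this system):

```python
def estimate_inodes(files_per_day, years, headroom_percent=15, days_per_year=365):
    """Files expected over the retention period plus headroom, rounded up."""
    base = files_per_day * days_per_year * years
    scaled = base * (100 + headroom_percent)
    return -(-scaled // 100)  # integer ceiling division, avoids float rounding


# Hypothetical inputs: 10 files a day, kept for 3 years, plus 15%.
print(estimate_inodes(10, 3))
```

The result is only a lower bound to plan around; mke2fs accepts an inode
count with -N if the default turns out to be too small for the small-file
filesystem.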

>I have other concerns about backups and fragmentation. The current thread 
>on defrag is a bit worrisome at the moment. If this filesystem is going to 
>get slower with time and would be extremely difficult to do a full backup 
>and format then there are greater issues. I was hoping to choose a highly
>redundant and fault tolerant media since the data is so volatile it is 
>difficult to backup.

Sure.  Again, LVM snapshotting may be the only realistic way to back this stuff 
up.  I've been in this situation many times.  If you use XFS you 
can "xfs_freeze -f {filesystem}" to suspend activity to the filesystem.  Then do 
a "lvcreate -L 200 -s --name snapshot {filesystem's lv}" to create the 
snapshot, then do a "xfs_freeze -u {filesystem}" to unfreeze the filesystem.  
Back up the snapshot volume, then drop the snapshot.  You can do roughly 
the same thing with ext2/3, only it doesn't have the nice freeze utility, so 
creating the snapshot may take a while.  Both xfs and ext2/3 can be grown.
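
The XFS sequence above could be sketched as a dry-run planner in Python.  It
only builds and prints the commands; it does not run them.  The mount point,
volume names and "backup-tool" are hypothetical, and using "lvremove -f" to
drop the snapshot is my assumption, not something stated above:

```python
import shlex


def snapshot_backup_plan(mount_point, origin_lv, backup_cmd,
                         snap_name="snapshot", snap_size="200"):
    """Return the ordered command list for a freeze/snapshot/backup cycle."""
    # The snapshot device appears next to the origin LV in the same volume group.
    vg_dir = origin_lv.rsplit("/", 1)[0]
    snap_dev = f"{vg_dir}/{snap_name}"
    return [
        ["xfs_freeze", "-f", mount_point],            # suspend filesystem activity
        ["lvcreate", "-L", snap_size, "-s", "--name", snap_name, origin_lv],
        ["xfs_freeze", "-u", mount_point],            # resume activity
        list(backup_cmd) + [snap_dev],                # caller-supplied backup step
        ["lvremove", "-f", snap_dev],                 # assumed way to drop the snapshot
    ]


# Hypothetical names throughout; print the plan instead of executing it.
for cmd in snapshot_backup_plan("/data", "/dev/vg0/data", ["backup-tool"]):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Anything that actually executes this plan should make sure the unfreeze step
runs even if lvcreate fails, otherwise the filesystem stays frozen.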

If you use ext2/3, make your directory structure deep, as it gets pretty slow 
with thousands of files in the same directory.
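
One way to keep directories small is to split the archive by product and
date.  A minimal sketch in Python; the layout, the "/archive" root and the
product and file names are assumptions for illustration, not anything from
the existing system:

```python
from datetime import date
from pathlib import PurePosixPath


def dated_path(root, product, day, filename):
    """Build root/product/YYYY/MM/DD/filename so each directory holds one day."""
    return PurePosixPath(root, product,
                         f"{day:%Y}", f"{day:%m}", f"{day:%d}", filename)


# Hypothetical example file from a radar feed.
print(dated_path("/archive", "radar", date(2002, 10, 4), "img0001.gif"))
```

With a layout like this no single directory holds more than one day's files
for one product, and old data can be pruned a whole directory at a time.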

>>Do you have a separate physical volume (or partition on the RAID device)
>>for the journal?
>Good point. I just created a journal on the existing 590G RAID and mounted 
>it ext3, it was ext2. This seemed like a good option since the new RH7.3 
>system I moved it to had ext3 partitions. It takes 45 minutes to fsck it!

Yep, fsck is ugly.

>I haven't had a chance to read up on journaling yet. I assume from the 
>name that any writes to the filesystem are journaled and committed 5 

You can control this.  By default I think only metadata is journaled, but this 
can be adjusted when the filesystem is mounted (the "data=" mount option).  
With ext3 you can also use "chattr" to adjust journaling on a per-file basis!

>seconds later. This, I believe, would allow the writes to occur when the 
>journal is recovered upon reboot if the system crashed. Which brings me to 
>the new system: for some reason it has kernel panicked a few times lately. 

If you can, shut down the machine, boot it with a memtest diskette, and let 
that run for a few hours.  Assuming you don't have error-correcting memory, 
bad RAM is usually the cause of kernel panics.

>When RH 7.3 reboots it gives the option to do a filesystem check. 
>If I don't say yes, it recovers the journal and goes on. Is this safe? 

Yes.

>Should I still filesystem check after every unclean boot?

Not really necessary.  I like to unmount the filesystem and do an fsck 
occasionally anyway, if at some point the machine is idle for any reason.