[KLUG Members] Re: Compiler optimization for CPU arch?

Bruce Smith members@kalamazoolinux.org
18 Dec 2002 08:47:59 -0500


> > Why does Redhat do it on the kernel, glibc & openssl?
> 
> First off, there are other optimizations to the kernel than just GCC.
> Secondly, the kernel and GLibC are executing the most.

Understood.

> > It must make some significant difference.
> 
> Correct.  But your CPU is probably tasked with 75% of those above programs. 
> Although it is monolithic, the kernel is always running.  And GLibC is fully
> threaded and re-entrant, so multiple routines might be getting accessed at any
> one time.

For a workstation, I believe it would make a big difference to also
optimize packages like XF86, Gnome/KDE, Mozilla, or any other GUI app
which gets run a lot.

> OKAY, LET'S LOOK AT THE OPTIONS HERE ...
> 
> 1)  I can GCC "compile-time optimize" the order-of-execution of the CPU
> 
> These usually create code that can run on _any_ CPU, but run best on the
> targeted one.  This _might_ be what some of the i686 packages are.

You are talking about the "-O" option to gcc?
If so, the redhat build process defaults to "-O2".

I don't know if any benefit would come of "-O3",
or if there are potential dangers in using that.

> 2)  I can GCC "target" the full ISA of the CPU
> 
> Now we're using the instruction set architecture (ISA) of the CPU itself, so
> previous generations that don't support the full ISA won't run.  This is
> probably what the "generic" kernels for each "generation" are, as well as other
> apps.

To verify, you're talking about "-march=<cpu>"?

> 3)  I can write code, use special options and other flags to enable various,
> optional CPU ISA code
> 
> In this case, we're talking possibly targeting the 2 x complex + 1 x ADD/MULT
> FPU of the Athlon, instead of just the 1 x complex + 1 x ADD-only FPU of the
> Pentium Pro+.  Or we could be talking MMX, 3DNow!, SSE, etc...  Or various,
> assembler modules for the CPU.  This is what the kernel does for _specific_ CPUs
> (i.e. you can select P2, P3, P4, 6x86MX/M2, C3, Athlon, Crusoe, etc... --
> specifically).

Yes, I've seen some options to turn off/on MMX and stuff like that.
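
On an x86 Linux box, one way to see which of these optional extensions a
machine actually has is the "flags" line of /proc/cpuinfo.  The sketch
below parses that line.  The function names and the sample text are made
up for illustration; only the /proc/cpuinfo layout is assumed.

```python
from typing import Set


def cpu_flags(cpuinfo_text: str) -> Set[str]:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def read_cpu_flags(path: str = "/proc/cpuinfo") -> Set[str]:
    """Read and parse the flags of the running machine (x86 Linux assumed)."""
    with open(path) as handle:
        return cpu_flags(handle.read())


# Illustrative sample only, not taken from a real machine.
sample = "model name\t: Pentium III (Coppermine)\nflags\t\t: fpu tsc mmx fxsr sse\n"
flags = cpu_flags(sample)
for ext in ("mmx", "sse", "3dnow"):
    print(ext, "yes" if ext in flags else "no")
```

This only reports what the CPU offers.  Whether a package makes use of it
still depends on the options its build passes to the compiler.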

> And in _each_ of the cases above, the "SPEC" file that builds the RPM _must_
> make sure it accommodates them in the ./Configure script, Makefiles, etc...  If
> the packager doesn't, then it doesn't matter.

Understood.

> > So it's possible (likely?) that Netscape compiled version 7.01 
> > with the Intel compiler?
> 
> Quite possible.  In fact, that's a damn good question.

Well, I checked, and the answer is NO.  They used GCC.
Poking around the binary with "od", it appears they used
GCC version 2.91.66.
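
For anyone who wants to repeat the check without reading "od" output by
eye, here is a minimal sketch.  It assumes the compiler left an ident
string of the form "GCC: (...) <version>" in the binary, which is what
GCC-built binaries normally carry unless they have been stripped of it.
The function names are hypothetical.

```python
import re
from typing import List

# Assumed ident format: "GCC: (<vendor>) <version text>", ended by NUL or newline.
GCC_IDENT = re.compile(rb"GCC: \([^)\x00]*\) [^\x00\n]+")


def find_gcc_idents(data: bytes) -> List[str]:
    """Return the distinct GCC ident strings found in raw binary data."""
    seen: List[str] = []
    for match in GCC_IDENT.finditer(data):
        text = match.group(0).decode("ascii", errors="replace").strip()
        if text not in seen:
            seen.append(text)
    return seen


def find_gcc_idents_in_file(path: str) -> List[str]:
    """Scan a file on disk, e.g. a browser binary, for GCC ident strings."""
    with open(path, "rb") as handle:
        return find_gcc_idents(handle.read())
```

An empty result does not prove another compiler was used; it only means no
such string survived in the file.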

> Or they might have just "cleaned up" their user-interface over the Mozilla versions.

Possibly.  Or maybe the performance boost comes from "-march=i686".

BTW, I'm comparing Netscape 7.01 to Redhat 8.0's Mozilla on a
Pentium-III.  (no Athlon in this case)

> > It boils down to this:  I'm looking to recompile some packages with
> > higher CPU optimizations, and possibly replace some of them on BSware.
> 
> I considered that at one time, _but_ if the SPEC file in the RPM doesn't pass
> the right GCC options, it doesn't matter.

Actually, in many cases the SPEC file doesn't have to be touched.
I downloaded a bunch of Mandrake SRPM's and looked at their SPEC files.
There was nothing in there about "i586".  The "--target i586" changes
the build process, passing the correct parameters to "./configure".
(I tested this by rebuilding a few Redhat 8.0 SRPM's)

The "configure" script in the tar file needs to support it, but it
appears that most of them do support the parameter.
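
As a rough sketch of the rebuild step, the snippet below only assembles
the command lines, one per target, and does not run them.  It assumes the
"rpmbuild --rebuild --target <arch>" form; the SRPM file name is a
placeholder, not a real package.

```python
from typing import List


def rebuild_commands(srpm: str, targets: List[str]) -> List[List[str]]:
    """Build one 'rpmbuild --rebuild --target <arch>' argv per target arch."""
    return [["rpmbuild", "--rebuild", "--target", arch, srpm] for arch in targets]


# "example-1.0-1.src.rpm" is a hypothetical file name.
for argv in rebuild_commands("example-1.0-1.src.rpm", ["i686", "athlon"]):
    print(" ".join(argv))
```

Each list could be handed to subprocess.run() on a machine that has
rpmbuild and the package's build dependencies installed.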

> > Given the fact that I do NOT care about i386, i486, and i586 CPU's.
> > I'm only interested in P-II/III/IV and Athlon CPU's.  Do I:
> > A)  Go with i586 only so it runs everywhere I do care about.
> 
> That's what Mandrake does.  It seems to work on all Pentium, 6x86MX/M2, K6 and
> later CPUs.

Right.  But how much additional boost would i686 or athlon make?

Personally, I don't think ANY P5 class machine has enough power to run
Redhat 8.0, so I don't care about them.  This leaves P-II/III/IV and
Athlons.  That's what is making the decision difficult.

> > C)  Make both i686 & athlon RPM's, and 2 different BSware distros.
> 
> I'd just make sure any key components are optimized.  But I wouldn't do everything.

Right.  XF86, browsers, window managers, office packages, most GUIs ...

Things like "vi" wouldn't benefit.   :-)

--------------------------------------------
Bruce Smith                bruce@armintl.com
System Administrator / Network Administrator
Armstrong International, Inc.
Three Rivers, Michigan  49093  USA
http://www.armstrong-intl.com/
--------------------------------------------