[KLUG Members] Re: Compiler optimization for CPU arch?
Bruce Smith
members@kalamazoolinux.org
18 Dec 2002 08:47:59 -0500
> > Why does Redhat do it on the kernel, glibc & openssl?
>
> First off, there are other optimizations to the kernel than just GCC.
> Secondly, the kernel and GLibC are executing the most.
Understood.
> > It must make some significant difference.
>
> Correct. But your CPU probably spends 75% of its time in the programs above.
> Although it is monolithic, the kernel is always running. And GLibC is fully
> threaded and re-entrant, so multiple routines might be getting accessed at any
> one time.
For a workstation, I believe it would make a big difference to also
optimize packages like XF86, Gnome/KDE, Mozilla, or any other GUI app
which gets run a lot.
> OKAY, LET'S LOOK AT THE OPTIONS HERE ...
>
> 1) I can GCC "compile-time optimize" the order-of-execution of the CPU
>
> These usually create code that can run on _any_ CPU, but run best on the
> targeted one. This _might_ be what some of the i686 packages are.
You are talking about the "-O" option to gcc?
If so, the Redhat build process defaults to "-O2".
I don't know if any benefit would come of "-O3",
or if there are potential dangers in using that.
> 2) I can GCC "target" the full ISA of the CPU
>
> Now we're using the instruction set architecture (ISA) of the CPU itself, so
> previous generations that don't support the full ISA won't run. This is
> probably what the "generic" kernels for each "generation" are, as well as other
> apps.
To verify, you're talking about "-march=<cpu>"?
> 3) I can write code, use special options and other flags to enable various,
> optional CPU ISA code
>
> In this case, we're talking possibly targeting the 2 x complex + 1 x ADD/MULT
> FPU of the Athlon, instead of just the 1 x complex + 1 x ADD-only FPU of the
> Pentium Pro+. Or we could be talking MMX, 3DNow!, SSE, etc... Or various,
> assembler modules for the CPU. This is what the kernel does for _specific_ CPUs
> (i.e. you can select P2, P3, P4, 6x86MX/M2, C3, Athlon, Crusoe, etc... --
> specifically).
Yes, I've seen some options to turn off/on MMX and stuff like that.
> And in _each_ of the cases above, the "SPEC" file that builds the RPM _must_
> make sure it accommodates them in the ./Configure script, Makefiles, etc... If
> the packager doesn't, then it doesn't matter.
Understood.
> > So it's possible (likely?) that Netscape compiled version 7.01
> > with the Intel compiler?
>
> Quite possible. In fact, that's a damn good question.
Well, I checked, and the answer is NO. They used GCC.
Poking around the binary with "od", it appears they used
GCC version 2.91.66.
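
For anyone who wants to repeat the check without reading "od" output
by hand, here is a rough sketch of the same idea in Python. It assumes
GCC leaves an ident string of the form "GCC: (GNU) <version>" in the
binary; the pattern and the function names are my own, so treat it as
a starting point rather than a reliable detector.

```python
import re

# Assumed format of the ident string GCC records in the objects it
# compiles, e.g. "GCC: (GNU) 2.91.66". The pattern is a guess based on
# what shows up in binaries, not something taken from the GCC manual.
GCC_IDENT = re.compile(rb"GCC: \([^)\x00]*\) [\x20-\x7e]+")


def find_gcc_idents(data):
    """Return the distinct GCC ident strings found in raw binary data."""
    found = []
    for match in GCC_IDENT.finditer(data):
        text = match.group(0).decode("ascii")
        if text not in found:
            found.append(text)
    return found


def find_gcc_idents_in_file(path):
    """Read a binary from disk and scan it for ident strings."""
    with open(path, "rb") as handle:
        return find_gcc_idents(handle.read())
```

Calling find_gcc_idents_in_file() on the path of the binary you care
about returns a list of the version strings it found. An empty list
only means no such string was present (a stripped binary, or another
compiler), not proof of anything.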
> Or they might have just "cleaned up" their user-interface over the Mozilla versions.
Possibly. Or maybe the performance boost comes from "-march=i686".
BTW, I'm comparing Netscape 7.01 to Redhat 8.0's Mozilla on a
Pentium-III. (no Athlon in this case)
> > It boils down to this: I'm looking to recompile some packages with
> > higher CPU optimizations, and possibly replace some of them on BSware.
>
> I considered that at one time, _but_ if the SPEC file in the RPM doesn't pass
> the right GCC options, it doesn't matter.
Actually, in many cases the SPEC file doesn't have to be touched.
I downloaded a bunch of Mandrake SRPMs and looked at their SPEC files.
There was nothing in there about "i586". The "--target i586" changes
the build process, passing the correct parameters to "./configure".
(I tested this by rebuilding a few Redhat 8.0 SRPMs.)
The "configure" script in the tar file needs to support it, but it
appears that most of them do support the parameter.
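
If I end up rebuilding a pile of SRPMs this way, I'd probably script
it. A minimal sketch, assuming the rebuild is done with "rpmbuild
--rebuild --target <arch>" as on Redhat 8.0; the function names and
the list of targets are mine, and it only builds the command lines
without running them.

```python
# Targets discussed in this thread. Anything else is rejected so that a
# typo does not quietly produce packages for the wrong architecture.
KNOWN_TARGETS = ("i386", "i586", "i686", "athlon")


def rebuild_command(srpm_path, target):
    """Build the argv for rebuilding one source RPM for one target."""
    if target not in KNOWN_TARGETS:
        raise ValueError("unknown target: %s" % target)
    return ["rpmbuild", "--rebuild", "--target", target, srpm_path]


def rebuild_commands(srpm_paths, targets):
    """One command per (package, target) pair, in package order."""
    return [rebuild_command(path, target)
            for path in srpm_paths
            for target in targets]
```

Each returned list can be handed to subprocess.call() as-is. Keeping
the command as a list avoids shell quoting trouble with odd file names.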
> > Given that I do NOT care about i386, i486, and i586 CPUs, and am
> > only interested in P-II/III/IV and Athlon CPUs, do I:
> > A) Go with i586 only so it runs everywhere I do care about.
>
> That's what Mandrake does. It seems to work on all Pentium, 6x86MX/M2, K6 and
> later CPUs.
Right. But how much additional boost would i686 or athlon make?
Personally, I don't think ANY P5 class machine has enough power to run
Redhat 8.0, so I don't care about them. This leaves P-II/III/IV and
Athlons. That's what is making the decision difficult.
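
To make the choice concrete, here is roughly how I'd decide which set
of packages a given box should get, based on /proc/cpuinfo. The rule
itself is my own rule of thumb (AMD family 6 gets "athlon", anything
family 6 or later with "cmov" gets "i686", everything else falls back
to "i586"), and the function names are made up for illustration.

```python
def parse_cpuinfo(text):
    """Turn the first processor block of /proc/cpuinfo into a dict."""
    info = {}
    for line in text.splitlines():
        if not line.strip():
            if info:
                break  # a blank line ends the first processor's block
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def pick_target(text):
    """Guess an RPM --target for this CPU (a rule of thumb, not a spec)."""
    info = parse_cpuinfo(text)
    flags = info.get("flags", "").split()
    family = int(info.get("cpu family", "0"))
    if info.get("vendor_id") == "AuthenticAMD" and family == 6:
        return "athlon"
    if family >= 6 and "cmov" in flags:
        return "i686"
    # Fallback for everything older; this sketch does not try to tell
    # a Pentium from a 486.
    return "i586"
```

On a real machine you would call pick_target(open("/proc/cpuinfo").read()).
The "cmov" test is there because some family 6 chips do not implement
it, and i686 code may use that instruction.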
> > C) Make both i686 & athlon RPMs, and 2 different BSware distros.
>
> I'd just make sure any key components are optimized. But I wouldn't do everything.
Right. XF86, browsers, window managers, office packages, most GUIs ...
Things like "vi" wouldn't benefit. :-)
--------------------------------------------
Bruce Smith bruce@armintl.com
System Administrator / Network Administrator
Armstrong International, Inc.
Three Rivers, Michigan 49093 USA
http://www.armstrong-intl.com/
--------------------------------------------