[KLUG Members] Re: Compiler optimization for CPU arch?

Bryan J. Smith members@kalamazoolinux.org
Tue, 17 Dec 2002 19:31:17 -0500 (EST)


Quoting Bruce Smith <bruce@armintl.com>:
> Then why does Mandrake do it?  Just marketing hype?

Er, some.  But most 7th-generation clone CPUs nowadays run Pentium-optimized
code well -- the Athlon and C3.

> Why does Redhat do it on the kernel, glibc & openssl?

First off, the kernel has other optimizations than just what GCC does.
Second, the kernel and GLibC are the code that executes the most.

And third, I don't know for sure, but I'd guess OpenSSL performance is really
boosted by some instructions (MMX?).

> It must make some significant difference.

Correct.  But your CPU probably spends 75% of its time in the programs above.
Although it is monolithic, the kernel is always running.  And GLibC is fully
threaded and re-entrant, so multiple routines might be getting accessed at any
one time.

> To which question?  :-)

Both.  i686/PPro code _might_ run fine on Athlon without issue, but it might not
give a major boost.  I could be wrong.  I've seen some of the latest Intel
compiler optimizations give boosts on Athlons.

OKAY, LET'S LOOK AT THE OPTIONS HERE ...

1)  I can have GCC "compile-time optimize" for the order-of-execution of the CPU

This usually creates code that can run on _any_ CPU, but runs best on the
targeted one.  This _might_ be what some of the i686 packages are.

2)  I can have GCC "target" the full ISA of the CPU

Now we're using the instruction set architecture (ISA) of the CPU itself, so
the code won't run on previous generations that don't support the full ISA.
This is probably what the "generic" kernels for each "generation" are, as well
as other apps.

3)  I can write code, or use special options and other flags, to enable various
optional CPU ISA code

In this case, we're possibly talking about targeting the 2 x complex + 1 x
ADD/MULT FPU of the Athlon, instead of just the 1 x complex + 1 x ADD-only FPU
of the Pentium Pro+.  Or we could be talking MMX, 3DNow!, SSE, etc...  Or
various assembler modules for the CPU.  This is what the kernel does for
_specific_ CPUs (e.g. you can select P2, P3, P4, 6x86MX/M2, C3, Athlon, Crusoe,
etc... -- specifically).

And in _each_ of the cases above, the "SPEC" file that builds the RPM _must_
make sure it accommodates them in the ./Configure script, Makefiles, etc...  If
the packager doesn't, then none of the optimizations ever reach the compiler.
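
As a rough sketch, the three cases above map onto GCC flags something like this.
The flag spellings assume GCC 3.x on x86, where -mcpu= only tunes scheduling
(later GCC releases spell it -mtune=).  gcc_flags_for_case is a made-up helper
name, and the flag mix shown for case 3 is just one possibility:

```shell
# Hypothetical helper: print example GCC flags for cases 1-3 above.
# Assumes GCC 3.x flag names on x86; not taken from any real SPEC file.
gcc_flags_for_case() {
    case "$1" in
        # Case 1: schedule for the P6, but keep the i386 instruction set
        1) echo "-O2 -march=i386 -mcpu=i686" ;;
        # Case 2: require the full P6 ISA (won't run on older CPUs)
        2) echo "-O2 -march=i686" ;;
        # Case 3: Athlon ISA plus optional extensions (hand-written
        # assembler or special code paths would come on top of this)
        3) echo "-O2 -march=athlon -mmmx -m3dnow" ;;
        *) echo "unknown case: $1" >&2; return 1 ;;
    esac
}

# Usage (foo.c is a placeholder source file):
#   gcc $(gcc_flags_for_case 2) -c foo.c
```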

> So having both *.i686.rpm and *.athlon.rpm for some packages 
> would be a good thing?

The kernel, GLibC, crypto, multimedia ... yes.  Again, RedHat seems to be
planning an AMD distro.  Not just for x86-64 Athlon-64/Opteron, but optimized
for Athlon, both 32-bit and 64-bit, in general (along with the 64-bit core
kernel and system libraries for full 64-bit support on key components).

> So it's possible (likely?) that Netscape compiled version 7.01 
> with the Intel compiler?

Quite possible.  In fact, that's a damn good question.

Or they might have just "cleaned up" their user-interface over the Mozilla versions.

> Huh?  On my Athlon at home:  
> # rpm -qa --qf="%{ARCH} %{NAME}\n" \
> 	| grep -v -e ^i386 -e ^noarch
> athlon mplayer
> i586 xfce
> i686 RealPlayer
> i686 glibc
> i686 openssl
> athlon kernel
> That leads me to believe that i686 RPMs run fine on Athlons since 
> Redhat's installer installed some of those i686 packages for me.

Again, the Athlon might run i686/PPro code just fine *IF* #1 above is the
targeting.  If it is #2, I'm not so sure, but probably.

And remember, RPM "architecture" != GCC "architecture."  It's up to the packager
to accommodate what the original developer did (if at all).
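
One way to see what a package build really did is to look at the compiler
command lines in its build output.  A minimal sketch, assuming the log contains
the gcc invocations; arch_flags_in_log is a hypothetical helper and build.log a
placeholder file name:

```shell
# Hypothetical helper: list the distinct -march=/-mcpu=/-mtune= values
# that appear in a build log read from stdin.
arch_flags_in_log() {
    # grep exits non-zero when nothing matches; that is not an error here
    { grep -o -E -e '-m(arch|cpu|tune)=[A-Za-z0-9_.-]+' || true; } | sort -u
}

# Usage:
#   arch_flags_in_log < build.log
```

If this prints nothing, the package's Makefiles most likely never saw the
optimization flags, whatever architecture the RPM itself is labelled with.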

> Okay.

I distinctly remember reading that both the kernel's Athlon optimizations and
the Athlon target of GCC 3.1x have various AMD-only codings.

In the case of the kernel, one compiled for the P6/Pentium-III or 4 won't work
on the Athlon either.

> So, will an i686.rpm package run on a P5?

Er, not likely because the Pentium doesn't support the Pentium Pro ISA.
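
The usual concrete example is CMOV: GCC's i686 target may emit it, and the
original Pentium doesn't have it.  A quick check, assuming a Linux-style
/proc/cpuinfo with a "flags" line; has_cpu_flag is a hypothetical helper:

```shell
# Hypothetical helper: succeed if cpuinfo-style text on stdin lists the
# given CPU flag on a "flags" line, fail otherwise.
has_cpu_flag() {
    awk -v want="$1" '
        /^flags[ \t]*:/ {
            for (i = 1; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage:
#   has_cpu_flag cmov < /proc/cpuinfo && echo "CPU has CMOV"
```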

> I haven't found anything useful in the docs.  I'm dreading going
> through the mailing list archives.  Was hoping for a nice article
> about it.  :-)

Again, you have to differentiate _Compiler_ optimizations from _Package_ options.

> It boils down to this:  I'm looking to recompile some packages with
> higher CPU optimizations, and possibly replace some of them on BSware.

I considered that at one time, _but_ if the SPEC file in the RPM doesn't pass
the right GCC options, it doesn't matter.

> Given the fact that I do NOT care about i386, i486, and i586 CPU's.
> I'm only interested in P-II/III/IV and Athlon CPU's.  Do I:
> A)  Go with i586 only so it runs everywhere I do care about.

That's what Mandrake does.  It seems to work on all Pentium, 6x86MX/M2, K6 and
later CPUs.

> B)  Go with i686 only. (which should run on P6 & Athlon both?)

Again, not sure.  E.g., when you pass "--target i686" to RPM, does it really
target the P6/PPro ISA?  Or does it just enable some P6/PPro ordering
optimizations, while the result still runs on an i386/486?
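
One place to look for the answer is RPM's own configuration, where the
per-architecture compiler flags are set on "optflags:" lines in rpmrc.  A
sketch, assuming that file format; the sample lines are illustrative only (not
copied from any particular release), and optflags_for_arch is a hypothetical
helper:

```shell
# Hypothetical helper: print the compiler flags that rpmrc-style text on
# stdin assigns to the given architecture.
optflags_for_arch() {
    awk -v arch="$1" '
        $1 == "optflags:" && $2 == arch {
            $1 = ""; $2 = ""        # drop the keyword and the arch name
            sub(/^ +/, "")          # trim the separators left behind
            print
        }
    '
}

# Illustrative input only; check the rpmrc on your own system, e.g.
#   optflags_for_arch i686 < /usr/lib/rpm/rpmrc
optflags_for_arch i686 <<'EOF'
optflags: i386 -O2 -march=i386 -mcpu=i686
optflags: i686 -O2 -march=i686
EOF
```

If the i686 line carries -march=i686, the build targets the ISA (case #2
above); if it only carries -mcpu=i686, it is just scheduling (case #1).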

> C)  Make both i686 & athlon RPM's, and 2 different BSware distros.

I'd just make sure any key components are optimized.  But I wouldn't do everything.

With x86-64, general Athlon optimizations are going to become _very_ commonplace.

-- 
Bryan J. Smith, E.I. (BSECE)       Contact Info:  http://thebs.org
[ http://thebs.org/files/resume/BryanJonSmith_certifications.pdf ]
------------------------------------------------------------------
*** Whether it is voting on butterfly ballots or driving under ***
*** overpasses, Floridians just can't seem to do things right. ***