64-bit CPU vs 2 x 32-bit CPUs

Post by Yin-Chih L » Fri, 17 Jul 1992 12:56:18



Greetings:

I am in a discussion about whether a single 64-bit CPU (e.g. MIPS R4000,
DEC Alpha, or a future SPARC V9 implementation) or two 32-bit CPUs
(e.g. MIPS R3000, IBM RS6000, HP PA-RISC, SPARC V8, Intel i860, etc.)
will have better CPU performance (according to the definition in David
Patterson & John Hennessy's book) with respect to the following aspects:

        o yield rate of fab.
        o cost
        o cache coherence
        o bus arch.
        o bus bandwidth
        o memory bandwidth
        o mutual exclusion
        o OS design
        o compiler design
        o user program design
        o etc.

However, we don't have concrete experience in all these fields, so we would
appreciate it if anyone could offer some recommendations.

Thanks!

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Yin-Chih Lin                            Industrial Technology Research

phone: 886-35-917331
fax:   886-35-917503                    Computer & Comm. Research Labs.(CCL)
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Bob Supn » Fri, 17 Jul 1992 09:32:14



>Greetings:

>I am in a discussion about whether a single 64-bit CPU (e.g. MIPS R4000,
>DEC Alpha, or a future SPARC V9 implementation) or two 32-bit CPUs
>(e.g. MIPS R3000, IBM RS6000, HP PA-RISC, SPARC V8, Intel i860, etc.)
>will have better CPU performance (according to the definition in David
>Patterson & John Hennessy's book) with respect to the following aspects:

>    o yield rate of fab.
>    o cost
>    o cache coherence
>    o bus arch.
>    o bus bandwidth
>    o memory bandwidth
>    o mutual exclusion
>    o OS design
>    o compiler design
>    o user program design
>    o etc.

>However, we don't have concrete experience in all these fields, so we would
>appreciate it if anyone could offer some recommendations.

If you measure the area increment in an Alpha 21064 or a MIPS R4000 for a
64-bit integer data path vs a 32-bit integer data path (floating point is
64-bit in any case), you will see that the area increment is less than 10%.
So it's not possible to build two 32-bit (equivalent performance/
functionality) chips for the silicon area (=cost) of a 64-bit chip.

The size of the integer data path (and virtual address) has little to no
relationship to the bus architecture, bus bandwidth, memory bandwidth,
or anything else on the list.  For example, the NVAX chip (a 32-bit chip)
has a 128-bit dedicated bus to its secondary cache, and a 64-bit bus to
its memory, exactly the same as the R4000 (a 64-bit chip).

Both the 64-bit chips mentioned support implementation of 32-bit operating
systems, or 32-bit environments within 64-bit operating systems, thereby
allowing identical pointer and data sizes as a 32-bit system.  For example,
UNIX on the R4000, and VMS on Alpha, and NT on both, are today 32-bit
operating systems with 32-bit user environments (although this may not
necessarily be true in the future).

My conclusion: for >>high performance<< microprocessors, the difference
(today) between 64-bit and 32-bit implementation is a less than 10% silicon
area cost, end of story; and it will get less over time.  (Cost-focussed and
embedded microprocessors would be a different story.)


                >All opinions expressed are those of a hardline microcoder
                >and do not reflect those of Digital Equipment Corporation

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Yin-Chih L » Sat, 18 Jul 1992 02:36:11



>If you measure the area increment in an Alpha 21064 or a MIPS R4000 for a
>64-bit integer data path vs a 32-bit integer data path (floating point is
>64-bit in any case), you will see that the area increment is less than 10%.
>So it's not possible to build two 32-bit (equivalent performance/
>functionality) chips for the silicon area (=cost) of a 64-bit chip.

Maybe true. But for a user, should he buy one $2000 21064 chip or
two $1000 CY7C601 chips!?

If he chooses the first, he can claim that the 64-bit CPU covers
all the 32-bit operations. But if this single 64-bit CPU fails, he
is out of luck.

Certainly, with a 150MHz 21064 you might reach 200 MIPS of
performance. But could I beat such a beast with 2 lower-performance CPUs?

>The size of the integer data path (and virtual address) has little to no
>relationship to the bus architecture, bus bandwidth, memory bandwidth,
>or anything else on the list.

I think it is obvious that the *data path* has little to do with the bus
architecture. But my concern *is not* the comparison of a
*single 64-bit* CPU with a *single 32-bit* CPU. What I want to know is
whether two or more CPUs can feasibly challenge one high-performance
CPU. Once you put a pool of CPUs in your machine, can you still say
that the bus architecture has nothing to do with the CPUs?
(I would welcome correction here, since I am confused about why so many
*buses* exist.)

>For example,
>UNIX on the R4000, and VMS on Alpha, and NT on both, are today 32-bit
>operating systems with 32-bit user environments (although this may not
>necessarily be true in the future).

I believe that moving a 32-bit OS to a 64-bit environment would not be
very difficult. But what do you do when you try to port a
single-CPU OS to a multiprocessor? Sun Microsystems has shipped some MP
platforms; ask them whether the OS had to be recoded and redesigned
(multithreading in SunOS 5.0?!).

You can arrange for only the kernel to know the underlying h/w
architecture, but parallelism could also be exploited at the compiler level.
So I believe that the best compiler should reflect the actual computer
architecture, even if it shields such details from users.

Some say that if you need 64-bit addresses or 64-bit arithmetic
quantities, then there's no question. But since 64-bit operations can
also be performed by 32-bit algorithms, I wonder how much advantage
a single 64-bit CPU would really have compared with two 32-bit
CPUs.
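
To illustrate what "64-bit operations performed by 32-bit algorithms" can look like, here is a minimal Python sketch of a 64-bit add built from two 32-bit adds and a carry. The function names are hypothetical, and Python integers are masked to 32 bits to stand in for machine words; a real 32-bit CPU would normally use its carry flag instead of the shift shown here.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32-bit word


def split64(x):
    """Split a 64-bit value into (high, low) 32-bit halves."""
    return (x >> 32) & MASK32, x & MASK32


def join64(hi, lo):
    """Rebuild a 64-bit value from its 32-bit halves."""
    return (hi << 32) | lo


def add64_with_32bit_ops(a_hi, a_lo, b_hi, b_lo):
    """Add two 64-bit values given as 32-bit halves, wrapping modulo 2**64."""
    lo_sum = a_lo + b_lo
    lo = lo_sum & MASK32
    carry = lo_sum >> 32                 # 1 if the low halves overflowed
    hi = (a_hi + b_hi + carry) & MASK32  # second add consumes the carry
    return hi, lo


a, b = 0x00000001FFFFFFFF, 0x0000000000000001
print(hex(join64(*add64_with_32bit_ops(*split64(a), *split64(b)))))
```

The add costs two dependent 32-bit operations instead of one 64-bit operation, which is the kind of overhead the question has to weigh.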

Yin-Chih Lin

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Chuck Parso » Sun, 19 Jul 1992 04:03:00




>>Greetings:

>>I am in a discussion about whether a single 64-bit CPU (e.g. MIPS R4000,
>>DEC Alpha, or a future SPARC V9 implementation) or two 32-bit CPUs
>>(e.g. MIPS R3000, IBM RS6000, HP PA-RISC, SPARC V8, Intel i860, etc.)
>>will have better CPU performance (according to the definition in David
>>Patterson & John Hennessy's book) with respect to the following aspects:

>>        o yield rate of fab.
>>        o cost
>>        o cache coherence

   < rest of list deleted >

>>However, we don't have concrete experience in all these fields, so we would
>>appreciate it if anyone could offer some recommendations.

>If you measure the area increment in an Alpha 21064 or a MIPS R4000 for a
>64-bit integer data path vs a 32-bit integer data path (floating point is
>64-bit in any case), you will see that the area increment is less than 10%.
>So it's not possible to build two 32-bit (equivalent performance/
>functionality) chips for the silicon area (=cost) of a 64-bit chip.

  However, by going to two chips the cache would be divided into two
parts. Since cache is often 50% of the die, each chip could probably
be done with only 60-70% (100% - your 10% - 20+% for the half of the cache
that's on the other chip) of the transistors of the full 64-bit chip.
This doesn't sound like a win to me, since carries are going off chip and
the cache is fragmented, but I don't think each chip would have 90%
of the area required for a full 64-bit chip.

>The size of the integer data path (and virtual address) has little to no
>relationship to the bus architecture, bus bandwidth, memory bandwidth,
>or anything else on the list.  For example, the NVAX chip (a 32-bit chip)
>has a 128-bit dedicated bus to its secondary cache, and a 64-bit bus to
>its memory, exactly the same as the R4000 (a 64-bit chip).

  Yes, but in this case two units would be operational at once. So to
get the same cache/memory bandwidth as the 64-bit chip, each of the 32-bit
chips would only need a 64-bit path to secondary cache and a 32-bit path
to memory. Or you could leave them the same size but double the
bandwidth, which should improve performance. I don't think
the bandwidth would make up for the off-chip delays necessary to
coordinate the two chips' operation, but that's a different problem.

>So it's not possible to build two 32-bit (equivalent performance/
>functionality) chips for the silicon area (=cost) of a 64-bit chip.

  I thought that for big chips, cost rose faster than linearly with
area due to yield problems. That is, it might be cheaper to make
two 1.5cm^2 chips than one 2cm^2 chip.
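
One way to make the "faster than linearly" intuition concrete is the simple Poisson yield model, in which the fraction of good dice is exp(-D * A) for defect density D and die area A. The sketch below is only an illustration under that assumed model; the defect densities are made-up values and the function names are hypothetical, not anything given in this thread.

```python
import math


def cost_per_good_die(area_cm2, defects_per_cm2):
    """Relative cost of one working die: silicon area divided by Poisson yield."""
    yield_fraction = math.exp(-defects_per_cm2 * area_cm2)
    return area_cm2 / yield_fraction


def compare(defects_per_cm2):
    """Return (cost of two 1.5 cm^2 dice, cost of one 2 cm^2 die)."""
    two_small = 2 * cost_per_good_die(1.5, defects_per_cm2)
    one_big = cost_per_good_die(2.0, defects_per_cm2)
    return two_small, one_big


for density in (0.5, 1.0):  # assumed defect densities, defects per cm^2
    two_small, one_big = compare(density)
    print(density, round(two_small, 2), round(one_big, 2))
```

Under this model the two smaller chips only come out cheaper once the defect density is fairly high (above roughly 0.8 defects per cm^2 for these areas), so the answer depends on the process.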


 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Andy Gl » Sat, 25 Jul 1992 11:18:43


    Maybe true. But for a user, should he buy one $2000 21064 chip or
    two $1000 CY7C601 chips!?

This is a rather bogus discussion.

The assumption seems to be implicit that, given the same technology,
etc., a 64 bit architecture implies twice the performance of a 32 bit
architecture.

That's blatantly untrue, since the overwhelming majority of applications
fit quite nicely into 32 bit address and data quantities, so from this
point of view 32 = 64.

There may be some applications that can benefit from >32 bit virtual
addresses, e.g. the 40 bit virtual addresses of the MIPS R4000. I'm
tempted to say that this implies 32 = 4/5 40, but of course there is
overhead in evaluating extended precision arithmetic. The overhead may
be more than 2x.  But then you have to downgrade that by the ratio of
time spent manipulating such addresses.

Finally, there are some applications that benefit from simply having
larger integers.  John Mashey has mentioned robotics as one such area
(although I have talked to robotics manufacturers who take a
completely different approach). But, once again, you have to prorate
the speedup according to the importance of the code being executed.
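
The prorating described above is essentially Amdahl's law. As a rough sketch, with a hypothetical function name and made-up numbers, the overall speedup from accelerating only part of a program can be computed like this:

```python
def overall_speedup(fraction_affected, local_speedup):
    """Amdahl's law: speedup of the whole program when only a fraction
    of its run time is accelerated by local_speedup."""
    unaffected = 1.0 - fraction_affected
    return 1.0 / (unaffected + fraction_affected / local_speedup)


# Assumed example: 10% of run time is 64-bit arithmetic that becomes 2x faster.
print(round(overall_speedup(0.10, 2.0), 3))
```

With those assumed figures the whole program gains only about 5%, which is why the fraction of time spent in the affected code matters more than the local speedup.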

Look:
    Whether you will see a speedup between a 32 bit and a 64 bit
    microprocessor depends on your applications and OS.

    I am 100% certain that none of the applications or OSes
    I am using *today* would be speeded up by 64 bits.

    Some of the applications I will want to use in a few years
    will undoubtedly run faster (or, more likely, be easier to code)
    because of 64 bits -- like the oft-awaited global flat namespace.
    But I won't hold my breath (at least not until ISDN).

Rephrasing the question above:

    But for a user, should he buy one 64 bit chip or 2 32 bit chips?

As:

    But for a user, should he buy one 64 bit chip or 2 32 bit chips
    (assuming both are implemented in comparable technology)?

The present day answer is probably:

    If you can save money, and if you don't need >32 bit virtual addresses
    because the applications that use them aren't developed yet,
    then you are probably better off buying the cheaper 32 bit chip.

--


Intel Corp., M/S JF1-19, 5200 NE Elam Young Pkwy,
Hillsboro, Oregon 97124-6497

This is a private posting; it does not indicate opinions or positions
of Intel Corp.

Intel Inside (tm)

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Herman Rub » Sat, 25 Jul 1992 22:27:59



>    Maybe true. But for a user, should he buy one $2000 21064 chip or
>    two $1000 CY7C601 chips!?
>This is a rather bogus discussion.
>The assumption seems to be implicit that, given the same technology,
>etc., a 64 bit architecture implies twice the performance of a 32 bit
>architecture.
>That's blatantly untrue, since the overwhelming majority of applications
>fit quite nicely into 32 bit address and data quantities, so from this
>point of view 32 = 64.

Of course it is blatantly untrue, unless you can do the 64-bit operations
and accesses as fast as 32 bit ones.  On at least one machine I am now
using, nobody should ever use the "single precision" (really half
precision) floating point arithmetic, because the hardware always
converts to double and then back, unless memory is at a drastic premium.

>There may be some applications that can benefit from >32 bit virtual
>addresses, e.g. the 40 bit virtual addresses of the MIPS R4000. I'm
>tempted to say that this implies 32 = 4/5 40, but of course there is
>overhead in evaluating extended precision arithmetic. The overhead may
>be more than 2x.  But then you have to downgrade that by the ratio of
>time spent manipulating such addresses.
>Finally, there are some applications that benefit from simply having
>larger integers.  John Mashey has mentioned robotics as one such area
>(although I have talked to robotics manufacturers who take a
>completely different approach). But, once again, you have to prorate
>the speedup according to the importance of the code being executed.

It is clear that you do not do any type of "honest" integer arithmetic,
or you could not possibly take this view.  

It would be rare indeed for a 32-bit machine to be able to do 64-bit
arithmetic at anywhere near the speed of a 64-bit machine.
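
To give a feel for the gap, here is a hedged Python sketch of an unsigned 64x64->128 multiply built only from 32x32->64 products, the way a 32-bit machine would have to do it. The function name is hypothetical, and masking stands in for fixed-width registers.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def mul64x64_to_128(a, b):
    """Unsigned 64x64->128 multiply from four 32x32->64 partial products.
    Returns (high 64 bits, low 64 bits)."""
    a_hi, a_lo = a >> 32, a & MASK32
    b_hi, b_lo = b >> 32, b & MASK32

    lo_lo = a_lo * b_lo   # contributes to bits 0..63
    lo_hi = a_lo * b_hi   # contributes to bits 32..95
    hi_lo = a_hi * b_lo   # contributes to bits 32..95
    hi_hi = a_hi * b_hi   # contributes to bits 64..127

    # Sum the column at bits 32..63; its overflow carries into the high word.
    mid = (lo_lo >> 32) + (lo_hi & MASK32) + (hi_lo & MASK32)
    low64 = ((mid & MASK32) << 32) | (lo_lo & MASK32)
    high64 = (hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (mid >> 32)) & MASK64
    return high64, low64


hi, lo = mul64x64_to_128(0x123456789ABCDEF0, 0xFEDCBA9876543210)
print(hex(hi), hex(lo))
```

That is four multiplies plus a handful of shifts, masks and adds, where a 64-bit machine with a suitable multiplier needs one or two instructions.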
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054

{purdue,pur-ee}!pop.stat!hrubin(UUCP)

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Gideon Yuv » Sun, 26 Jul 1992 01:33:22



>Finally, there are some applications that benefit from simply having
>larger integers.  John Mashey has mentioned robotics as one such area
>(although I have talked to robotics manufacturers who take a
>completely different approach). But, once again, you have to prorate

Mashey says the robotics guys' reason for using 64-bit integers is that
they don't trust 53-bit mantissas. Given this reason, I wouldn't
expect an Intel man (selling 80-bit X87 math) to have that issue
come up.
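
The 53-bit limit is easy to see in any language whose floats are IEEE 754 doubles. A small Python check (the helper name is hypothetical) shows where integers stop surviving a round trip through a double:

```python
def survives_double_round_trip(n):
    """True if the integer n is represented exactly by an IEEE 754 double."""
    return int(float(n)) == n


print(survives_double_round_trip(2**53))      # exactly representable
print(survives_double_round_trip(2**53 + 1))  # needs a 54th mantissa bit
```

Any integer quantity that needs more than 53 bits therefore has to live in a wider integer or a wider floating-point format.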
--

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by John Mash » Sun, 26 Jul 1992 08:04:22



>    Maybe true. But for a user, should he buy one $2000 21064 chip or
>    two $1000 CY7C601 chips!?

>This is a rather bogus discussion.

....

Yes:
1) For many applications, one will happily leave them as 32-bit applications,
because they'll probably run faster in 32-bit model, so why mess with
them?

2) A few applications exist where pushing 64-bit integers around
conveniently might have good gains, either due to 64x64->128
multiplies, or just pushing data 2X faster conveniently.

3) The main reason is to get more address space (conveniently).
There are not a *huge* number of these things; however, the ones that
are there are *extremely* important to the people who use them, as they
are things like:
        1) Scientific codes
                "Goody; we expand our FORTRAN arrays by factor of 10"
        2) Some ECAD programs
                "Goody, we can still simulate the R11000 after all"
        3) Some MCAD programs
                "Great, we can simulate the 199x automobile in one piece."
        4) Video & animation
                "Great, let's put Terminator 6 in memory for editing"
        5) Financial
                "Good, we can finally put the financial model of the US
                in memory and grep around in it at speed."
        6) DBMS
                "Good, we can map 4 whole 1GB SCSI disks into memory at once",
                i.e., disks that fit in a desktop box.
        7) CASE
                "Thank goodness, there's still space for the new EMACS" :-)


>    Some of the applications I will want to use in a few years
>    will undoubtedly run faster (or, more likely, be easier to code)
>    because of 64 bits -- like the oft-awaited global flat namespace.
>    But I won't hold my breath (at least not until ISDN).

Each to their own; fortunately for us, plenty of people want this
in 1993.  I'd guess the "just recompile this FORTRAN program bigger"
cases will get there quickest.

As I've noted several times before, having 64-bit integers costs
something like 5% of the die space, so it's not a big deal.
Certainly, a 64-bit chip doesn't automatically (or even usually)
go 2X faster than a 32-bit chip.
--
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>

DDD:    408-524-7015,  or 524-8253
USPS:   (soon) Silicon Graphics, 2011 N. Shoreline Blvd, Mountain View, CA 94043

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Bob Supn » Sun, 26 Jul 1992 10:56:56



>    Maybe true. But for a user, should he buy one $2000 21064 chip or
>    two $1000 CY7C601 chips!?

>This is a rather bogus discussion.

>The assumption seems to be implicit that, given the same technology,
>etc., a 64 bit architecture implies twice the performance of a 32 bit
>architecture.

This is indeed bogus, but the assumption being made is that a 64-bit
chip >>costs<< twice what a 32-bit chip of comparable capability would cost.

As I pointed out originally, from measuring both the R4000 and the Alpha chip
die, the 64-bit (vs 32-bit) integer/addressing data paths add 10% (or less)
to die area.  This will NOT drive a 2X difference in costs; more like 1.2X.

Clearly, by comparing chips of radically different capabilities (for example,
an R4000 versus an R3000, or an Alpha versus a CVAX), it is possible to
create totally arbitrary cost ratios.  As in the quoted example.  Within
a given process, and for like capabilities, die area drives yield drives cost.


                >All opinions expressed are those of a hardline microcoder
                >and do not reflect those of Digital Equipment Corporation

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by petteng.. » Sun, 26 Jul 1992 14:59:44


|>2) A few applications exist where pushing 64-bit integers around
|>conveniently might have good gains, either due to 64x64->128
|>multiplies, or just pushing data 2X faster conveniently.

Not all 64-bit integer operations involve numbers.  I don't know about
other architectures, but the Alpha architecture includes an instruction
or two that allows character compares and related operations to be done
8 octets at a time.  In particular, CMPBGE does 8 bytewise compares in
parallel.
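
For readers without an Alpha manual to hand, the following Python sketch emulates a bytewise unsigned "greater than or equal" compare of the kind described above: bit i of the result is set when byte i of the first operand is >= byte i of the second. The function names are hypothetical, and the loop only models the result; the hardware does all eight compares at once.

```python
def cmpbge(a, b):
    """Emulate an 8-way bytewise unsigned >= compare of two 64-bit values.
    Bit i of the result is set when byte i of a >= byte i of b."""
    result = 0
    for i in range(8):
        byte_a = (a >> (8 * i)) & 0xFF
        byte_b = (b >> (8 * i)) & 0xFF
        if byte_a >= byte_b:
            result |= 1 << i
    return result


def zero_byte_mask(word):
    """Flag every zero byte in word: 0 >= byte holds only when byte == 0."""
    return cmpbge(0, word)


# Eight characters of a string with a NUL terminator in byte 4.
print(bin(zero_byte_mask(0x4142430044454647)))
```

Comparing against zero like this locates a string terminator eight characters at a time, which is one example of a non-numeric use of 64-bit registers.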

I'm sure that there are a number of other places where 64-bit registers
can be put to good use even when the address space and the integers in
general only use 32 or maybe even 16 bits.  (I've been thinking about
the design of a PDP-11 emulator, and one option is to put all the general
registers in two 64 bit registers...)
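
As a rough sketch of that emulator idea: the PDP-11 has eight 16-bit general registers, so four fit in each 64-bit word. The layout and names below are assumptions made for illustration, not a description of any actual emulator.

```python
MASK16 = 0xFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def read_reg(bank, n):
    """Read 16-bit register n (0..7) from a bank of two 64-bit words."""
    return (bank[n // 4] >> (16 * (n % 4))) & MASK16


def write_reg(bank, n, value):
    """Write 16-bit register n (0..7), leaving its three neighbours intact."""
    shift = 16 * (n % 4)
    cleared = bank[n // 4] & ~(MASK16 << shift) & MASK64
    bank[n // 4] = cleared | ((value & MASK16) << shift)


bank = [0, 0]              # two 64-bit words hold R0..R3 and R4..R7
write_reg(bank, 0, 0x1234)
write_reg(bank, 5, 0xBEEF)
print(hex(read_reg(bank, 0)), hex(read_reg(bank, 5)), [hex(w) for w in bank])
```

Each register access becomes a shift and a mask, so whether this wins over keeping the registers separate depends on how cheap those operations are on the host.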

mulp
DEC

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Othman Ahm » Mon, 27 Jul 1992 00:15:37


:

: >
: >    Maybe true. But for a user, should he buy one $2000 21064 chip or
: >    two $1000 CY7C601 chips!?
: As I pointed out originally, from measuring both the R4000 and the Alpha chip
: die, the 64-bit (vs 32-bit) integer/addressing data paths add 10% (or less)
: to die area.  This will NOT drive a 2X difference in costs; more like 1.2X.
Which die area are you measuring? Have you taken into account the area taken
by the pads? The pads are huge and require large drivers and protection
circuits.
        The packaging also increases. You need 471 pins for Alpha. 64-bit
alpha requires similar number(can't remember).

--
Othman bin Ahmad, School of EEE,
Nanyang Technological University, Singapore 2263.


 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Donald Linds » Mon, 27 Jul 1992 12:26:36



>Which die area are you measuring? Have you taken into account the area taken
>by the pads? The pads are huge and require large drivers and protection
>circuits.

Since the DEC 21064 Alpha chip only supports 43-bit effective virtual
addresses, the cost is approximately 43-32=11 pins.

Since the package has 431 pins, I don't think that 11 pins was some
sort of horrendous burden that changed everything.
--
Don             D.C.Lindsay     Carnegie Mellon Computer Science

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Ed Gou » Mon, 27 Jul 1992 13:45:48


> Since the DEC 21064 Alpha chip only supports 43-bit effective virtual
> addresses, the cost is approximately 43-32=11 pins.

According to the data sheet I have, the 21064 implements 43-bit virtual
addresses (even though all 64 bits are checked), but only 34-bit
physical addresses.  Hence, it's only *two* pins.  I can't imagine that
that's particularly significant.
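
For scale, the address widths mentioned here translate into the following sizes, assuming byte addressing. This is just the arithmetic, with a hypothetical helper name:

```python
GIB = 1 << 30
TIB = 1 << 40


def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with address_bits bits."""
    return 1 << address_bits


print(addressable_bytes(32) // GIB, "GiB with 32-bit addresses")
print(addressable_bytes(34) // GIB, "GiB with 34-bit physical addresses")
print(addressable_bytes(43) // TIB, "TiB with 43-bit virtual addresses")
```

The two extra physical address bits quadruple the reachable memory while adding only two address pins.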

--

+1 415 688 1309   Network Systems Lab   505 Hamilton Ave, Palo Alto, CA  94301

"Unison is only one form of harmony." -- LW

 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Othman Ahm » Mon, 27 Jul 1992 13:39:11


:

: >Which die area are you measuring? Have you taken into account the area taken
: >by the pads? The pads are huge and require large drivers and protection
: >circuits.
:
: Since the DEC 21064 Alpha chip only supports 43-bit effective virtual
: addresses, the cost is approximately 43-32=11 pins.
How about the 64 pin program and 64 pin data?

:
: Since the package has 431 pins, I don't think that 11 pins was some
: sort of horrendous burden that changed everything.
If it needs only 11 pins, why do you need 431 pins? Why not just 132+11?
80386 uses 132 pins.

--
Othman bin Ahmad, School of EEE,
Nanyang Technological University, Singapore 2263.


 
 
 

64-bit CPU vs 2 x 32-bit CPUs

Post by Donald Linds » Tue, 28 Jul 1992 02:41:03



>: Since the DEC 21064 Alpha chip only supports 43-bit effective virtual
>: addresses, the cost is approximately 43-32=11 pins.
>How about the 64 pin program and 64 pin data?
>: Since the package has 431 pins, I don't think that 11 pins was some
>: sort of horrendous burden that changed everything.
>If it needs only 11 pins, why do you need 431 pins? Why not just 132+11?
>80386 uses 132 pins.

You are confusing implementation and architecture. The 386 and its
follow-ons have (so far) nearly identical ISA, but we can be sure
that the 586 and 686 will have an increased pincount. This is because
their mandate is high performance. The 21064's data bus is 128 bits
wide, and Alpha instructions are 32 bits:  clearly neither number was
driven by the choice of a 64 bit integer unit.

The original question was, what is the cost of changing the
architecture from 32 bits to 64?  The silicon answer is, today, ~10%
more. The pin answer is, well, implementations might add another pin
or two to their physical-address bus. [In the quote above, I
incorrectly gave the number of virtual address bits, whereas physical
addresses emerge from the 21064. My apology.]

The point of undertaking this small cost _now_, instead of "someday",
is that chips have to be designed well ahead of need. As time passes,
more and more customers will find reasons to change up, and these are
precisely the customers that you don't want to lose. DEC may already
be hearing complaints from Cray about "only" allowing 34 address
lines.
--
Don             D.C.Lindsay     Carnegie Mellon Computer Science

 
 
 

1. WANTED: algorithm to perform 64-bit / 32-bit signed & unsigned divide

I'm looking for an algorithm to perform an integer 64-bit / 32-bit
signed divide, and a 64-bit / 32-bit unsigned divide on the PowerPC.
I don't want to use the "DIV" instruction, which is not part of the
architecture.  I need the result, the remainder, and the overflow flag.

The machine provides only 32-bit / 32-bit signed and unsigned divide,
32-bit x 32-bit multiply, and 32-bit x 32-bit multiply and fetch high-order
32-bits.  It also has fast IEEE single and double precision floating point.
The compiler also supports "long double", 128-bit floating point.

I would like the algorithm to have good performance, and be written
in either PowerPC assembler or C.  Please reply via email.

                                regards,
                                joe

--
Full Name:    Joseph M. Orost

Organization: AT&T Bell Laboratories: FlashPort Services
SurfaceMail:  943 Holmdel Rd.; Cruz Plaza; Holmdel, NJ 07733
Phone:        +1 (908) 946-1115

2. request cd r software!

3. Filteralgorithm in C for fixed point DSP

4. bring 32-bit executable to 64-bit processor (intel processor)

5. who can help me to prepare the "xemacs presentation"

6. 64-bit chips, 32-bit compatibility?

7. Anyone install Xerces-Perl on HP-UX?

8. 32-bit versus 64-bit compares--(WAS: What is a 64bit system?)

9. 64 bit Alpha AXP -vs- 32 bit computers

10. Affordable and reliable IrDA infrared communications for 8/16/32/64 bit CPU's

11. 64 bit integer divide using 32 bit divider

12. Is the 64 bit ultra SPARC binary compatible with 32 bits SPARC v8?