## 64 bit multiply-accumulate

### 64 bit multiply-accumulate

I have been unsuccessfully looking for a simple 64 bit MAC device.
I need:
1) this DSP or micro to be on a PC104 card
2) Linux support (or enough docs to hack)

My algorithm is very simple:
int x[4000];        // an array of 32 bit integers
int xt;             // 32 bit integer
long long sum = 0;  // a 64 bit integer
int i;

for (i = 0; i < 4000; i++) {
    xt = x[i];       // in assembler this would be mem->register
    sum = sum + xt;  // 64bit accumulator plus 32bit register to 64bit accumulator
}

Now I know technically a MAC op might want the following:
sum = sum*1 + xt;

Is there anything easy and cheap out there that will do this?  I
thought MMX commands would do this, but I think the MMX ALU is
strictly 32 bit operands.

### 64 bit multiply-accumulate

> I have been unsuccessfully looking for a simple 64 bit MAC device.
> I need:
> 1) this DSP or micro to be on a PC104 card
> 2) Linux support (or enough docs to hack)

> My algorithm is very simple:
> int x[4000];        // an array of 32 bit integers
> int xt;             // 32 bit integer
> long long sum = 0;  // a 64 bit integer
> int i;

> for (i = 0; i < 4000; i++) {
>   xt = x[i];       // in assembler this would be mem->register
>   sum = sum + xt;  // 64bit accumulator plus 32bit register to 64bit accumulator
> }

> Now I know technically a MAC op might want the following:
>   sum = sum*1 + xt;

> Is there anything easy and cheap out there that will do this?  I
> thought MMX commands would do this, but I think the MMX ALU is
> strictly 32 bit operands.

If you're not actually doing a multiply, then just about any
processor can do this with a pair of instructions: an add
(with the low word of the sum) followed by an add-with-carry of zero to
the high word.  Processors like MIPS, which don't have carry
flags, can do essentially the same thing with comparisons.
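
As a rough illustration of the comparison trick, here is a minimal C sketch that keeps the 64-bit sum in two 32-bit words and derives the carry from an unsigned compare instead of a carry flag. The names `acc64_t`, `acc64_add` and `acc64_sum` are made up for this example. Note one assumption beyond the text above: because the samples are signed, the sketch adds the sign extension of `xt` (all zeros or all ones) to the high word; for unsigned samples that word is always zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit accumulator held as two 32-bit words,
   the way a 32-bit CPU would keep it in a register pair. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} acc64_t;

/* Add one signed 32-bit sample using only 32-bit operations. */
static void acc64_add(acc64_t *acc, int32_t xt)
{
    uint32_t x_lo = (uint32_t)xt;
    /* Sign extension of xt: 0 if non-negative, 0xFFFFFFFF if negative. */
    uint32_t x_hi = (xt < 0) ? 0xFFFFFFFFu : 0u;

    uint32_t new_lo = acc->lo + x_lo;
    /* No carry flag needed: unsigned wraparound shows up as new_lo < x_lo. */
    uint32_t carry = (new_lo < x_lo) ? 1u : 0u;

    acc->lo = new_lo;
    acc->hi = acc->hi + x_hi + carry;
}

/* Sum n samples and reassemble the two words into one 64-bit value.
   Assumes the usual two's complement conversion to int64_t. */
static int64_t acc64_sum(const int32_t *x, size_t n)
{
    acc64_t acc = {0u, 0u};
    for (size_t i = 0; i < n; i++)
        acc64_add(&acc, x[i]);
    return (int64_t)(((uint64_t)acc.hi << 32) | acc.lo);
}
```

On a processor with a carry flag the compare disappears and the loop body is just an add followed by an add-with-carry.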

In other words, I don't think that you've explained enough
to show why you need a dedicated 64-bit adder or MAC unit.  If you
actually do need such a thing, anything based on the ARM9E core
can do it for you.  I'm sure that there are others.
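
For the case where a real multiply is needed, a sketch of the loop in portable C might look like the following. ARM cores with the long multiply instructions provide SMLAL, which multiplies two 32-bit registers and adds the 64-bit product to a 64-bit register pair; whether a given compiler actually emits it for this loop is an assumption to check in the generated assembly. The function name `mac64` and the `coeff` array are hypothetical, not part of the original question.

```c
#include <stddef.h>
#include <stdint.h>

/* 32x32 -> 64-bit multiply-accumulate over two arrays.
   Widening one operand before the multiply keeps the full 64-bit product. */
int64_t mac64(const int32_t *x, const int32_t *coeff, size_t n)
{
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (int64_t)x[i] * coeff[i];
    return sum;
}
```

With every coefficient set to 1 this reduces to the plain 64-bit summation from the original post.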

--
Andrew

In last week's COMPCOM presentation of the RS6000, the
IBM speaker was asked if the new, fast fp multiply/
accumulate instruction was IEEE 754 compatible.  The
reply was 'no.'  If anyone (IBM?) has any information,
how about commenting on some questions?

1. Is this instruction incompatible simply because such
an instruction is not described by the IEEE 754 fp
standard? (and would it be useful if it was?  How

2. Is it incompatible because it is impossible to generate
a result which is equal to the result of a separate
multiply then add with the selected rounding mode and
destination size?  Or is it too difficult/complex/costly?

3. On the RS6000 - the double-precision Linpack numbers were
quite impressive if the multiply/accumulate unit was
allowed to operate on both fp ops in one pass.  What is
the number if the unit must be cycled twice (once for mul
and again for add) if that is required to generate IEEE
precise results?

Thanks for any answers.  If the replies are to me directly, I
will post a summary if it looks like the group would benefit.

*************************************************
*   Motorola Microprocessor Products Sector     *
*   Austin, Tx                                  *
*                                               *
*   Chris N. Hinds <><     Standard Disclaimers *

*************************************************