32-bit fractional multiplication with cross-multiplication method (no 64-bit intermediate result)

I am programming a fixed point speech enhancement algorithm on a 16 bit processor. At some point I need to perform a 32 bit fractional multiplication. I have read other posts about doing 32 bit byte by byte multiplication and I understand why this works for Q0.31 formats. But I am using different Q-formats with different number of fractional bits.

So I found out that for fractional bits less than 16, this works:

(low*low >> N) + low*high + high*low + (high*high << N)

      

where N is the number of fractional bits. I have read that the result low*low

should be unsigned as well as low bytes. In general, this gives exactly the result I want in any Q format with less than 16 fractional bits.

Now it gets trickier when the fractional bits are greater than 16. I tried several shift numbers, different shifts for low*low

and high*high

I tried to put it on paper but I can't figure it out.

I know this can be very simple, but the whole idea eludes me and I would appreciate some comments or recommendations!

+3


source to share


2 answers


This is the same formula. For N> 16, the shifts mean you are throwing away an entire 16-bit word, which would be excessive or unacceptable. low * low -> N means just shift N-16 bits in the high word of a 32 bit multiplication result and append to the low word of the result. high * high <N means just use the low word of the multiplied result, left-shifted N-16 and add to the top word of the result.



0


source


There are several ideas in the game.

First, multiply 2 short integers to get a longer product. Consider the unsigned multiplication of 2 32-bit integers by multiplying their 16-bit "halves", each of which creates a 32-bit product, and the overall product creates a 64-bit one:

  a * b = (a_hi * 2 16 + a_lo) * (b_hi * 2 16 + b_lo) =

  a_hi * b_hi * 2 32 + (a_hi * b_lo + a_lo * b_hi) * 2 16 + a_lo * b_lo.

Now, if you need a signed multiplication, you can build it from an unsigned multiplication (like from the one above).

Suppose <0 and b> = 0, a * signed b should be equal

  2 64 - ((-a) * unsigned b), where

  -a = 2 32 - a (because it's 2's complement)

IOW,

  a * signed b =

  2 64 - ((2 32 - a) * unsigned b) =

  2 64 + (a * unsignedb) - (b * 2 32 ) where 2 64 can be dropped since we are only using 64 bits.

Similarly, you can calculate * signed b for a> = 0 and b <0 and should get a symmetric result:



  (a * unsignedb) - (a * 2 32 )

In a similar way, it can be shown that for <0 and b <0, a signed multiplication can be constructed at the top of an unsigned multiplication as follows:

  (a * unsignedb) - ((a + b) * 2 32 )

So you first multiply a and b as unsigned, then if a <0, you subtract b from the top 32 bits of the product, and if b <0, you subtract from the top 32 bits of the product made.


Now that we can multiply 32-bit integers and get 64-bit subscription products, we can finally get down to fractional stuff.

Suppose now that of these 32 bits in and b, N bits are used for the fractional part. This means that if you look at a and b as simple integers, they will have 2 N times more than they actually represent, for example. 1.0 will look like 2 N (or 1 <N).

So, if you multiply two such integers, then the product will be 2 N * 2 N = 2 2 * N times more than it should represent, for example 1.0 * 1.0 would look like 2 2 * N (or 1 <(2 * N) ). IOW, multiplying a prime integer doubles the number of fractional bits. If you want a product to have the same number of fractional bits as multipliers, what do you do? You are dividing the product by 2 N (or shift it arithmetically N positions to the right). Plain.


A few words of caution, just in case ...

In C (and C ++), you cannot legally shift a variable left or right with the same or more bits contained in the variable. The code will compile, but doesn't work as you might expect. So, if you want to shift a 32-bit variable, you can shift it by 0 by 31 positions to the left or right (31 is the maximum, not 32).

If you are shifting integer integers you cannot end up finishing the result. All signed overflows result in undefined behavior. This way you can stick with unsigned.

The right shift of negative integers is implementation specific. They can either do an arithmetic shift or a logical shift. Which one depends on the compiler. So, if you want one of the two, you either need to make sure your compiler just supports it directly or implement it in other ways.

0


source







All Articles