IEEE 754: converting a denormalized decimal to half-precision binary

Question


I am trying to convert 0.0000211 to binary. Here's what I understand so far:

E = -bias + 1. With bias = 15, E = -14.

Sign bit and exponent = 0.

So, I have:

0 00000 ????????

This is using the half-precision format: 1 sign bit, 5 exponent bits and 10 fraction bits.

My question is, how can I find the fraction of this denormalized number? What do E and bias mean in this context? Any help would be appreciated.

Note: I need to do this by hand for my final.


assembly x86 denormalized

James schmiechen May 10 '17 at 12:06


2 answers

So instead of using your number, let's convert 0.2 decimal to binary by hand.

Starting with a program that gives me some binary fractions in base 10 is probably the best way to do this; the method in the link I posted works for integers but does not work here.

1/2 0.50000000
1/4 0.25000000
1/8 0.12500000
1/16 0.06250000
1/32 0.03125000
1/64 0.01562500
1/128 0.00781250
1/256 0.00390625

So:

0.2 - 0.5 no 
0.2 - 0.25 no
0.2 - 0.125 = 0.075
0.075 - 0.0625 = 0.0125
0.0125 - 0.03125 no
0.0125 - 0.015625 no
0.0125 - 0.00781250 = 0.0046875
0.0046875 - 0.00390625 = 0.00078125
0.00078125 - 0.001953125 no
0.00078125 - 0.0009765625 no
0.00078125 - 0.00048828125 yes

I know 0.2 cannot be represented exactly in binary, because it is a repeating pattern, so the above tells me that

0.0011001100110011...

is the binary representation of 0.2 in base 10.
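
The subtraction steps above can be sketched in Python. This is only an illustrative helper (the name fraction_to_binary is made up here, not a library function); it assumes exact arithmetic via fractions.Fraction so the "yes/no" decisions are not disturbed by floating-point error.

```python
from fractions import Fraction

def fraction_to_binary(value, bits):
    """Greedy conversion of 0 <= value < 1 into `bits` binary digits.

    At each step, subtract the next power of 1/2 if it fits ("yes" -> 1),
    otherwise keep the remainder unchanged ("no" -> 0).
    """
    remainder = Fraction(value)
    digits = []
    for k in range(1, bits + 1):
        weight = Fraction(1, 2 ** k)   # 1/2, 1/4, 1/8, ...
        if remainder >= weight:
            remainder -= weight
            digits.append("1")
        else:
            digits.append("0")
    return "0." + "".join(digits)

# Pass the decimal as a string so Fraction sees exactly 0.2,
# not the nearest double.
print(fraction_to_binary("0.2", 16))   # 0.0011001100110011
```

The same helper works for any decimal fraction below 1, including the question's 0.0000211 if enough bits are requested.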

Now, to normalize this, I need 1.xxxx, so I shift left 3 and get

1.1001100110011 * 2^(-3)

IEEE 754 single precision format (mantissa and fraction are the same)

seeeeeeeemmmmmmmmmmmmmmmmmmmmmmm

It is a positive number, so the sign bit s is zero.

The stored exponent e means a power of 2 equal to e - 127,

so we add the bias of 127 to -3 and get 124 (0x7C).

Note that since the leading 1 of 1.xxxx is implied, there is no reason to waste a bit storing it; the 1 is dropped and we just store the fraction.

0 01111100 10011001100110011001100
0011 1110 0100 1100 1100 1100 1100 1100
0x3E4CCCCC

Now I cheated and let the computer convert this for me and got:

0 01111100 10011001100110011001101
0x3E4CCCCD

And that makes sense: just past the point where we cut off, the bits continue 1100 1..., and the first chopped-off bit is a 1, which is greater than or equal to half of our base, so if we are rounding we round up, which turns the trailing 1100 into 1101. In base ten we round up when the dropped digit is at least half the base, i.e. 5, so 0.105 rounds to 0.11. Likewise, in binary 0.11001 rounds up to 0.1101.
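
For anyone who wants to reproduce the "let the computer convert it" step, here is a minimal sketch using only Python's standard struct module. It assumes the default round-to-nearest conversion that struct applies when packing a Python float as a 32-bit single; the variable names are arbitrary.

```python
import struct

# Pack 0.2 as a big-endian IEEE 754 single, then reread the same
# four bytes as an unsigned 32-bit integer to see the raw bits.
raw = struct.unpack(">I", struct.pack(">f", 0.2))[0]
bits = format(raw, "032b")

# Split into the s / e / m fields of the seeeeeeeemmm... layout.
sign, exponent, fraction = bits[0], bits[1:9], bits[9:]
print(sign, exponent, fraction)   # 0 01111100 10011001100110011001101
print(hex(raw))                   # 0x3e4ccccd
```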

So the half-precision format looks like

seeeeemmmmmmmmmm

and the bias is 15, i.e. the power of 2 is e - 15,

so we add 15 to -3 and get 12.

s is 0 (positive), e is 12, and m is the fraction without the implied 1 bit, so

0 01100 1001100110
0011 0010 0110 0110
0x3266

The first bit that gets chopped off was a 0, so it doesn't round up, assuming a round-to-nearest rounding mode.

So this is the normalized encoding of 0.2 in the 16-bit IEEE floating point format.
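
The normalize, add-the-bias, drop-the-implied-1 steps can be written out as a short Python sketch. The function name encode_normal_half is hypothetical, and the sketch assumes a positive input in the normal range with round-to-nearest; it is checked against Python's own half-precision packing (struct format "e", available since Python 3.6).

```python
import math
import struct

def encode_normal_half(x):
    """Hand-encode a positive x in the normal half-precision range (sketch)."""
    m, e = math.frexp(x)                 # x = m * 2**e with 0.5 <= m < 1
    significand, power = m * 2, e - 1    # rescale so x = 1.xxxx * 2**power
    biased = power + 15                  # add the bias of 15
    frac = round((significand - 1) * 2 ** 10)   # keep 10 bits, round to nearest
    if frac == 2 ** 10:                  # rounding carried into the exponent
        frac, biased = 0, biased + 1
    if not 1 <= biased <= 30:
        raise ValueError("not a normal half-precision value")
    return f"0 {biased:05b} {frac:010b}"

print(encode_normal_half(0.2))           # 0 01100 1001100110

# Cross-check against the standard library's half-precision packing.
print(hex(struct.unpack(">H", struct.pack(">e", 0.2))[0]))   # 0x3266
```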

Now, if you read Wikipedia, which is good enough to figure this out: when you normalize to 1.xxxxx you shift by some number of bits N (right if the number is greater than 1.xxxx, left if it is less than 1.xxxx, as in this case), so that your number is 1.xxxx times 2^(-N). As shown on the Wikipedia page:

E_min = 00001₂ − 01111₂ = −14

So N = 14 is the worst case you can have; if you would need to shift by more than 14 bits, you cannot normalize the number. There is a case for this shown on Wikipedia, called subnormal (or denormal). You shift 14 bits to the left, which implies 2^-14, so your binary number becomes 0.xxxxxxxxxx * 2^-14, and whatever the first ten bits xxxxxxxxxx are is your mantissa/fraction. The indicator in the encoding is the special exponent value 00000.

So 0 00000 xxxxxxxxxx is the IEEE 754 half-precision binary encoding of a denormal.


old_timer May 10 '17 at 1:40


Clayton mills · Accepted Answer · 2017-05-10T01:36:01+0000

The mantissa (the OP's fraction bits) of a half, float or double is normally normalized to remove leading zeros: the number is shifted until 1.0 <= number < 2.0. But in this case the number is in the subnormal range (the exponent is 00000, as you already established). This means the original number is less than the minimum normal, 6.10352 × 10^-5; i.e. when you try to shift so that 1.0 <= number < 2.0, you hit the minimum exponent limit. In that case the number is shifted 14 times, i.e. multiplied by 2^14, and as many bits after the point as possible are kept (for half floats, that's 10 bits). This is how very small numbers can be stored: in the subnormal range there is an implicit 0 in front of the mantissa when reconstructing the number, and leading zeros are allowed in the mantissa.

So 0.0000211 = b'0.000000000000000101100001111111111100111 ...

2 ^ 14 * 0.0000211 = 0.3457024 = b'0.0101100001111111111100111 ...

The first ten bits after the point are 0101100001, and the bits that follow (1111...) are more than half a unit in the last place, so rounding to nearest gives 0101100010. The implicit leading 0 is not stored (i.e. for 0.XXXXXXXXXX we only store the Xs).

So in this case the mantissa (the OP's fraction bits) is 0101100010

sign   exp      mantissa
0      00000    0101100010

This can be checked with python using numpy and struct

>>> import numpy as np
>>> import struct
>>> a = struct.pack("H", int("0000000101100010", 2))
>>> print(np.frombuffer(a, dtype=np.float16)[0])
2.11e-05
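
If numpy is not at hand, a similar check can be sketched with the standard library alone. The decode_half helper below is hypothetical and only covers finite values; it rebuilds the value from the three bit fields using the subnormal rule (implicit 0, fixed scale 2^-14) or the normal rule (implicit 1, bias 15).

```python
import struct

def decode_half(bits16):
    """Decode a 16-bit pattern (given as an int) by the half-precision rules."""
    sign = -1.0 if bits16 >> 15 else 1.0
    exponent = (bits16 >> 10) & 0x1F
    fraction = bits16 & 0x3FF
    if exponent == 0:
        # Subnormal: implicit leading 0 and a fixed scale of 2**-14.
        return sign * (fraction / 2 ** 10) * 2.0 ** -14
    if exponent == 0x1F:
        raise ValueError("infinity/NaN are not handled in this sketch")
    # Normal: implicit leading 1 and a bias of 15.
    return sign * (1 + fraction / 2 ** 10) * 2.0 ** (exponent - 15)

# Let Python pick the nearest half-precision pattern, then decode it by hand.
pattern = struct.unpack("<H", struct.pack("<e", 0.0000211))[0]
print(format(pattern, "016b"), decode_half(pattern))
```

Printing the pattern next to the decoded value makes it easy to compare a hand-derived bit string against what the machine produces.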

So for your final: at a minimum, you need to learn how to convert decimal fractions less than 1.0 to binary and memorize a few rules. You seem to have a handle on calculating the exponent.

Take a look at ...

https://math.stackexchange.com/questions/1128204/how-to-convert-from-floating-point-binary-to-decimal-in-half-precision16-bits

One of the answers to that question contains Python code for the whole conversion, which can be useful for practice.
