intVal3 TBYTE -1234: invalid TBYTE variable declaration goes unnoticed by the assembler

I am currently learning assembly programming by following Kip Irvine's book on x86 assembly programming.

In the book, the author claims that

MASM uses the TBYTE directive to declare packed BCD variables. Constant initializers must be in hexadecimal because the assembler does not automatically translate decimal initializers to BCD. The following two examples demonstrate both valid and invalid ways of representing decimal -1234:

intVal TBYTE 80000000000000001234h ; valid
intVal TBYTE -1234 ; invalid 

      

The reason the second example is invalid is that MASM encodes the constant as a binary integer rather than a packed BCD integer.

I understand that MASM cannot convert a decimal integer to BCD. But I came up with the following code, which assembled just fine (note that intVal3 TBYTE -1234 is supposed to be invalid, yet it assembled just like the other declarations):

.386
.MODEL FLAT, STDCALL
.STACK 4096
ExitProcess PROTO, dwExitCode: DWORD

.DATA
intVal1 TBYTE 800000000000001234h
intVal2 TBYTE -1234h
intVal3 TBYTE -1234      ; compiled despite being invalid

.CODE
main PROC
    invoke ExitProcess, 0
main ENDP
END main

      

Why did the invalid code go unnoticed by the assembler? Is this a bug that cannot be detected by the assembler and requires vigilance on the part of the programmer?

=============== EDIT 1 =================

I checked the listing file, as suggested by @PaulH. Here is a screenshot:

[screenshot of the listing file]

Judging by the output in the listing file and by what @PaulH said, I came to the following conclusion (which may not be entirely correct):

A TBYTE declaration simply stores the binary value of its initializer (be it 80000000000000001234h, -1234h, or -1234) in the variable. Since a TBYTE variable is intended to be used as a BCD integer, it is entirely up to the programmer to ensure that it is initialized and used correctly.

1 answer


The main point of the TBYTE type is that it has the same width as the internal registers of the x87 FPU, which means that it can be used to spill the contents of one of those registers to memory without losing precision.

Typically, when you store a floating-point value in memory, you represent it as a single-precision (32-bit, DWORD) or double-precision (64-bit, QWORD) value. That's fine, except that it loses precision. If you want to spill a temporary intermediate value during a computation, you often cannot afford to lose precision by truncating the value, because doing so will affect the final result.
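To make the precision loss concrete, here is a small Python sketch. Python has no 80-bit float type, so it shows the same effect one step down: a 64-bit value stored in a 32-bit slot and loaded back. The helper name `round_trip` is made up for this illustration.

```python
import struct

def round_trip(value: float, fmt: str) -> float:
    """Store `value` in the given IEEE-754 format and load it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

x = 1.0 / 3.0                     # a Python float is a 64-bit double
as_double = round_trip(x, "<d")   # stored in 8 bytes (QWORD-sized)
as_single = round_trip(x, "<f")   # stored in 4 bytes (DWORD-sized)

print(as_double == x)             # True: nothing was lost
print(as_single == x)             # False: low mantissa bits were dropped
print(abs(as_single - x))         # size of the error introduced by the spill
```

Spilling an 80-bit x87 register to a QWORD loses bits in the same way, which is why a 10-byte slot is needed to save it exactly.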

The name TBYTE refers to the fact that values of this type are 10 bytes (80 bits) wide, the same width that the x87 uses internally for floating-point values. (By default, at least, if you haven't lowered the FPU's precision.)

So TBYTE really has nothing to do with binary-coded decimal (BCD). I have no idea what Kip Irvine is talking about. You could store a BCD value in a TBYTE, but you could also store a smaller BCD value in a QWORD or a DWORD. As the name suggests, BCD is simply an encoding that allows decimal digits to be stored in binary form.

The reason that

intVal3 TBYTE -1234

assembles is that, as far as the assembler (MASM) is concerned, all you've done is declare a 10-byte value initialized with the value -1234. It implicitly sign-extends -1234 to fill 10 bytes, resulting in the value 0xFFFFFFFFFFFFFFFFFB2E, as you can see in the hex dump. The same goes for -1234h, except that the h suffix means the value is interpreted as hexadecimal rather than decimal.

Note that this is basically the same as if you had written

myValue QWORD -1234

in which case the assembler extends -1234 to 8 bytes.
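If you want to check those values without an assembler, the following Python sketch models the sign extension. It only reproduces the arithmetic (two's complement, little-endian storage) and is not MASM's actual implementation. The helper names are made up for this illustration.

```python
def initializer_bytes(value: int, size: int) -> bytes:
    """Little-endian two's-complement encoding of `value` in `size` bytes."""
    return value.to_bytes(size, byteorder="little", signed=True)

def as_hex(data: bytes) -> str:
    """Format the bytes as one hex number, most significant byte first."""
    return data[::-1].hex().upper()

print(as_hex(initializer_bytes(-1234, 10)))    # FFFFFFFFFFFFFFFFFB2E
print(as_hex(initializer_bytes(-0x1234, 10)))  # FFFFFFFFFFFFFFFFEDCC
print(as_hex(initializer_bytes(-1234, 8)))     # FFFFFFFFFFFFFB2E
```

None of these byte patterns is packed BCD: nibbles such as F, B, and E are not decimal digits.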

As Ped7g points out in a comment, the main thing to remember when programming in assembly is:

After all, it doesn't matter how you specify the contents of the memory in the source, ... the code that works with that memory determines its "value" (type).

The assembler just stores bytes. For a TBYTE, it stores 10 of them. For a QWORD, it stores 8 of them. For a DWORD, it stores 4 of them. You get the picture. How your code interprets those bytes is up to you, because you are the one who writes that code.
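As a rough illustration of "the code decides the type", this Python sketch takes the 4 bytes that a DWORD initialized with -1234 would hold and reads them back three different ways. The variable names are made up for this example.

```python
import math
import struct

raw = struct.pack("<i", -1234)               # bytes 2E FB FF FF in memory

as_signed = struct.unpack("<i", raw)[0]      # read as a signed 32-bit integer
as_unsigned = struct.unpack("<I", raw)[0]    # read as an unsigned 32-bit integer
as_float = struct.unpack("<f", raw)[0]       # read as a single-precision float

print(raw.hex())              # 2efbffff
print(as_signed)              # -1234
print(as_unsigned)            # 4294966062
print(math.isnan(as_float))   # True: this bit pattern is a NaN as a float
```

The bytes are identical in all three cases. Only the instruction that reads them differs.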


Peter Cordes points out (see comments) that the x87 FPU has instructions for loading and storing BCD values: FBLD and FBSTP. These can be used as a slow way to convert a binary integer to decimal digits.

Both of these instructions take an m80bcd value as their only operand. This is an 80-bit BCD value, which is the same length as a TBYTE. So it is possible that this use of TBYTE values is what Kip Irvine is talking about.
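For reference, the 80-bit packed BCD format holds 18 decimal digits in the low 9 bytes (two digits per byte, low digit in the low nibble) and the sign in the top bit of the last byte. The Python sketch below decodes such a value. It is only a model of the format, not of FBLD itself: the function name is made up, and raising an error on invalid digits is a choice made for this sketch (the hardware does not check for invalid digits).

```python
def decode_packed_bcd(data: bytes) -> int:
    """Decode a 10-byte little-endian packed BCD value into a Python int."""
    if len(data) != 10:
        raise ValueError("packed BCD operand must be exactly 10 bytes")
    magnitude = 0
    for byte in reversed(data[:9]):          # most significant digit pair first
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("nibble is not a decimal digit")
        magnitude = magnitude * 100 + high * 10 + low
    return -magnitude if data[9] & 0x80 else magnitude

# The book's valid constant, 80000000000000001234h, as it sits in memory.
valid = bytes.fromhex("80" + "00" * 7 + "1234")[::-1]
print(decode_packed_bcd(valid))              # -1234

# The bytes produced by "TBYTE -1234" are a binary integer, not BCD.
binary = (-1234).to_bytes(10, "little", signed=True)
try:
    decode_packed_bcd(binary)
except ValueError as err:
    print("not packed BCD:", err)
```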

However, I do not believe that MASM implicitly converts TBYTE initializers to BCD format, as that would be very inconvenient if you were using a TBYTE to store an extended-precision floating-point value, as discussed above. With MASM or any other assembler, you still have to encode the value assigned to a TBYTE appropriately yourself, as BCD or as floating point, whichever you want.

And really, now that you've heard of FBLD and FBSTP, you can pretty much forget about them again. I don't think they were ever used very much, and they certainly aren't used now. Even on older processors, such as the original Pentium (P5) and Pentium II (P6), these instructions took about 150 clock cycles. On newer processors, they have gotten even slower (on Skylake, FBSTP has a throughput of one per 266 cycles). So even if you really want to work with 80-bit BCD values, you're better off writing the conversion code yourself. (Post a new question about this if you need help.)
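To give an idea of what "writing it yourself" involves, here is a Python sketch of the integer-to-packed-BCD direction using repeated division. In assembly this would be a loop of divisions by 10 (or by 100). The Python only models the arithmetic, and the function name is made up for this example.

```python
def encode_packed_bcd(value: int) -> bytes:
    """Encode an integer as a 10-byte little-endian packed BCD value."""
    magnitude = abs(value)
    if magnitude >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    out = bytearray(10)
    for i in range(9):                       # 9 bytes, two digits per byte
        magnitude, pair = divmod(magnitude, 100)
        tens, ones = divmod(pair, 10)
        out[i] = (tens << 4) | ones
    if value < 0:
        out[9] = 0x80                        # sign lives in the top bit
    return bytes(out)

encoded = encode_packed_bcd(-1234)
print(encoded[::-1].hex())                   # 80000000000000001234
```

The printed value matches the hexadecimal initializer that the book gives as the valid way to declare -1234.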
