My asm code is writing garbage bytes when I use int 21h function

Question

I searched Stack Overflow and didn't find anything similar to my problem. I have code that opens a file and appends a message to the end of it. When I use int 21h to write to the file, the output is correct if the file is empty, but if the file already has content, the program appears to add a lot of garbage (characters like 畂 or other Japanese or Chinese characters) to the end.

I have verified that the program does not write more bytes than the message length. Please help me. Here's my source code:

.model tiny
.code

main:
    call delta          
delta:
    pop bp              
    sub bp, offset delta


    mov ax, @code       ;Get the address of code segment and store it in ax
    mov ds, ax          ;Put that value in Data segment pointer.
                        ;Now, we can reference any data stored in the code segment
                        ;without fail.
;Subroutines
open:
    mov ax, 3D02H   ;Opens a file
    lea dx, [bp+filename];Filename
    int 21h         ;Call DOS interrupt
    mov handle, ax  ;Save the handle in variable

move_pointer_to_end:
    mov bx, handle
    mov  ax,4202h                 ; Move file pointer
    xor  cx,cx                    ; to end of file
    cwd                           ; xor dx,dx
    int  21h

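write:
    mov ax, 4000H       ;AH=40h: write to file or device
    mov bx, handle      ;File handle
    lea dx, [bp+sign]   ;DS:DX -> buffer to write
    mov cx, 16          ;Byte count: 15 characters plus the terminating 0
    int 21H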

exit:
    mov ah,4Ch          ;Terminate process
    mov al,0            ;Return code
    int 21h

datazone:
    handle dw ?
    filename db 'C:\A.txt', 0
    sign db 'Bush was here!!', 0

end main

Please help me!

assembly file dos tasm character-encoding

jhonny6721 12 Aug 14 at 16:55

1 answer

JustKevin · Accepted Answer · 2014-08-12T17:30:34+0000

This is because the file you are appending to is Unicode encoded (Notepad's "Unicode" option saves UTF-16). If you create the file in Notepad or another text editor, you need to choose ANSI as the encoding when you save it. Then, if you run your program against an ANSI-encoded text file, it should append the message with the expected result.

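This encoding uses two bytes for each of these characters, so in a hex editor you would see s.o.m.e.t.h.i.n.g. .l.i.k.e. .t.h.i.s. rather than something like this, which is what you would expect for ANSI or UTF-8. Your program appends one byte per character, so the editor pairs those bytes up and displays each pair as a single, unrelated character.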
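To see where the garbage comes from, here is a minimal Python sketch (not part of your program, just an illustration). It assumes the existing file was saved by Notepad as "Unicode", meaning a UTF-16LE byte-order mark followed by two bytes per character; the content `hello` is a made-up stand-in for whatever was in C:\A.txt.

```python
# Hypothetical stand-in for a Notepad "Unicode" file: a UTF-16LE byte-order
# mark (FF FE) followed by two bytes per character.
existing = b"\xff\xfe" + "hello".encode("utf-16-le")

# The 16 bytes the DOS write call appends: 15 one-byte characters plus the 0.
appended = b"Bush was here!!\x00"

# An editor that trusts the byte-order mark pairs the appended bytes up
# as 16-bit code units instead of reading them one byte at a time.
shown = (existing + appended).decode("utf-16")

print(shown)               # "hello" followed by CJK-looking characters
print(hex(ord(shown[5])))  # 'B' (42h) and 'u' (75h) fuse into one code unit
```

The first appended pair, `42 75`, is read little-endian as 7542h, which is the 畂 from the question. The bytes on disk are exactly what the program wrote; only the interpretation differs.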
