OS X - x64: stack not 16 byte aligned
I know OS X requires 16-byte stack alignment, but I really don't understand why the error occurs here.
All I am doing here is passing the size of the object (which is 24) in %rdi and calling malloc. Does this error mean that I have to request 32 bytes?
And the error message:
libdyld.dylib`stack_not_16_byte_aligned_error:
-> 0x7fffc12da2fa <+0>: movdqa %xmm0, (%rsp)
   0x7fffc12da2ff <+5>: int3
libdyld.dylib`_dyld_func_lookup:
   0x7fffc12da300 <+0>: pushq %rbp
   0x7fffc12da301 <+1>: movq %rsp, %rbp
Here is the code:
Object_copy:
pushq %rbp
movq %rbp, %rsp
subq $8, %rsp
movq %rdi, 8(%rsp) # save self address
movq obj_size(%rdi), %rax # get object size
imul $8, %rax
movq %rax, %rdi
callq _malloc <------------------- error in this call
# rsi old object address
# rax new object address
# rdi object size, multiple of 8
# rcx temp reg
# copy object tag
movq 0(%rsi), %rcx
movq %rcx, 0(%rax)
# set rdx to counter, starting from 8
movq $8, %rdx
# add 8 to object size, since we are starting from 8
addq $8, %rdi
start_loop:
cmpq %rdx, %rdi
jle end_loop
movq (%rdx, %rsi, 1), %rcx
movq %rcx, (%rdx, %rax, 1)
addq $8, %rdx
jmp start_loop
end_loop:
leave
ret
Main_protoObj:
.quad 5 ; object tag
.quad 3 ; object size
.quad Main_dispatch_table ; dispatch table
_main:
leaq Main_protoObj(%rip), %rdi
callq Object_copy # copy main proto object
subq $8, %rsp # save the main object on the stack
movq %rax, 8(%rsp)
movq %rax, %rdi # set rdi point to SELF
callq Main_init
callq Main_main
addq $8, %rsp # restore stack
leaq _term_msg(%rip), %rax
callq _print_string
As you said, OS X requires 16-byte stack alignment. More precisely, the x86-64 calling convention expects the stack pointer (%rsp) to be a multiple of 16 at the moment each call instruction executes. The error is about the stack pointer, not about the size you pass to malloc, so you do not need to request 32 bytes.
When the stack is misaligned, the callee may access it with instructions that require 16-byte-aligned addresses, such as the movdqa %xmm0, (%rsp) in your error message, and the process usually crashes.
Before you call a subroutine in your code, you must make sure that your stack is correctly aligned; in this case, that means the stack pointer register (%rsp) is divisible by 16.
subq $8, %rsp # stack is misaligned by 8 bytes
movq %rdi, 8(%rsp) #
movq obj_size(%rdi), %rax #
imul $8, %rax #
movq %rax, %rdi #
callq _malloc # stack is still misaligned when this is called
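The arithmetic behind that comment can be sketched in C. This is only an illustration, not part of the original code: the helper name rsp_mod16_at_call and the starting address are made up, and it assumes the caller had %rsp 16-byte aligned at its own call instruction and that the function starts with the conventional pushq %rbp.

```c
#include <stdint.h>

/* Hypothetical helper: value of %rsp modulo 16 at the point where the
   function issues its own callq, given how many bytes it subtracts
   for locals. Assumes the caller's %rsp was aligned at its call. */
unsigned rsp_mod16_at_call(unsigned locals_bytes) {
    uint64_t rsp = 0x7fff5fbff000ULL; /* arbitrary 16-byte-aligned value */
    rsp -= 8;                         /* callq pushes the return address */
    rsp -= 8;                         /* pushq %rbp */
    rsp -= locals_bytes;              /* subq $locals_bytes, %rsp */
    return (unsigned)(rsp % 16);
}
```

Under those assumptions, rsp_mod16_at_call(8) is 8 (misaligned) and rsp_mod16_at_call(16) is 0 (aligned).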
To fix this, you can subtract something like 16 from %rsp instead of 8.
subq $16, %rsp # stack is still aligned
movq %rdi, 8(%rsp) # save self inside the 16 reserved bytes; after pushq %rbp, 16(%rsp) would be the saved %rbp
... #
callq _malloc # stack is still aligned when this is called, good
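More generally, if the function starts with pushq %rbp, any local area whose size is a multiple of 16 keeps the stack aligned. A minimal sketch of that rounding in C, where aligned_frame_size is a hypothetical name used only for illustration:

```c
#include <stddef.h>

/* Hypothetical helper: round the bytes needed for locals up to the next
   multiple of 16, so that subq $size, %rsp after pushq %rbp leaves
   %rsp 16-byte aligned for the following callq. */
size_t aligned_frame_size(size_t locals_bytes) {
    return (locals_bytes + 15) & ~(size_t)15;
}
```

For the 8 bytes needed here to save self, this gives 16, which matches the subq $16 above.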