How do I find the initial value of a for loop in assembly?

I'm having a hard time figuring out what the assembly code below is doing when I convert it to C. I know it's a loop, but I don't know where to start converting it.

I understand that the input must be 6 numbers, and inside the loop it will add 5 and compare.

What I mostly can't figure out is: how do we know the initial value?

   0x0000000000400f15 <+9>:     callq  0x4016e5 <read_six_numbers>
   0x0000000000400f1a <+14>:    lea    0x4(%rsp),%rbx
   0x0000000000400f1f <+19>:    lea    0x18(%rsp),%rbp
   0x0000000000400f24 <+24>:    mov    -0x4(%rbx),%eax
   0x0000000000400f27 <+27>:    add    $0x5,%eax
   0x0000000000400f2a <+30>:    cmp    %eax,(%rbx)
   0x0000000000400f2c <+32>:    je     0x400f33 <phase_2+39>
   0x0000000000400f2e <+34>:    callq  0x4016c3 <explode_bomb>
   0x0000000000400f33 <+39>:    add    $0x4,%rbx
   0x0000000000400f37 <+43>:    cmp    %rbp,%rbx
   0x0000000000400f3a <+46>:    jne    0x400f24 <phase_2+24>

      



2 answers


The function read_six_numbers gets the address of the array in which to store the numbers in register %rsi. %rsi is set to the current stack pointer (%rsp), where some space was allocated with sub $0x28,%rsp. The loop at 0x400f24 uses register %rbx as a pointer into the array, starting at the second element (lea 0x4(%rsp),%rbx). On each iteration it checks whether the previous value + 5 matches the current one. If not, it calls explode_bomb() with no arguments. The loop runs 5 times, until the pointer reaches the end of the array. In C terms, the loop index starts at 1.
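
As a rough C rendering of that loop, here is a minimal sketch. The name phase_2_check and the 0/1 return value are my own placeholders; the original calls explode_bomb where this returns 0, and I am assuming explode_bomb does not return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C rendering of the loop at <+24>..<+46>.
   Returns 1 if every element equals the previous one plus 5,
   and 0 at the point where the original would call explode_bomb. */
int phase_2_check(const int nums[6])
{
    assert(nums != NULL);

    /* i starts at 1 because %rbx is initialised to 0x4(%rsp), i.e. &nums[1] */
    for (size_t i = 1; i < 6; i++) {
        /* mov -0x4(%rbx),%eax ; add $0x5,%eax ; cmp %eax,(%rbx) */
        if (nums[i] != nums[i - 1] + 5)
            return 0;   /* explode_bomb() in the original */
    }
    return 1;
}
```

Calling phase_2_check on an array holding 1 6 11 16 21 26 returns 1; changing any element after the first breaks the chain and returns 0.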





There are a few things here that are not mentioned in your question (ABI, processor architecture, executable file format, etc.). Not all of them are needed to answer it, but understanding them will likely improve your overall grasp of how functions, methods, or procedures are called in a wide range of execution contexts.

ABI

In particular, different processor architectures, operating systems, and even executable binary formats may use different conventions for passing arguments to functions. Since you are evidently on an AMD64 processor, there is a Wikipedia page worth reading on this. Judging from some context in your snippet, you appear to be using the System V x86-64 ABI. (We'll do a full analysis of your snippet later.)

Stacks

The C programming language itself has no concept of a stack, so while the stack matters for your snippet, it is not a requirement for C programs, and a portable version of your program might not use one at all. Indeed, while introductory compiler courses still tend to use the stack to pass state between call frames, the stack is not commonly used for argument passing in the SysV ABI on AMD64.

(Passing arguments on the stack was much more common on 32-bit x86, since that architecture has far fewer registers. The overhead of passing state in registers on such an architecture is likely to be higher, since registers would probably have to be copied to the stack so they could be reused, and would most likely have to be preserved across further function calls.)

Your snippet

In particular, the SysV ABI passes integer and pointer arguments in %rdi, %rsi, %rdx, %rcx, %r8, and %r9, and floating-point arguments in %xmm0-7, in that order.

0x0000000000400f0c <+0>:     push   %rbp
0x0000000000400f0d <+1>:     push   %rbx

      

This saves two of the caller's registers by pushing them onto the stack. %rbp and %rbx are callee-saved registers, which means the called function must preserve their values, since the caller relies on them to maintain its state.

0x0000000000400f0e <+2>:     sub    $0x28,%rsp

      

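This allocates 40 bytes of stack space. Why 40 bytes? We have already pushed 16 bytes onto the stack to save %rbp and %rbx; those are separate from this allocation. We need 24 bytes of scratch space for read_six_numbers (six 4-byte integers), and the remaining 16 bytes are unused in this snippet. Together with the 8-byte return address, 8 + 16 + 40 == 64, which keeps %rsp 16-byte aligned at the upcoming call, as the SysV ABI requires.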
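
The stack arithmetic can be checked with a few lines of C. This is only a sketch: frame_bytes and rsp_is_aligned are names I made up, and it relies on the SysV rule that %rsp is 16-byte aligned immediately before a callq.

```c
#include <assert.h>

/* Bytes between the caller's 16-byte-aligned %rsp (just before its callq)
   and %rsp after the prologue: return address + pushed registers + locals. */
unsigned frame_bytes(unsigned pushed_regs, unsigned local_bytes)
{
    const unsigned return_address = 8;   /* pushed by callq */
    return return_address + 8 * pushed_regs + local_bytes;
}

/* Nonzero if %rsp is 16-byte aligned again after the prologue. */
int rsp_is_aligned(unsigned pushed_regs, unsigned local_bytes)
{
    return frame_bytes(pushed_regs, local_bytes) % 16 == 0;
}

/* phase_2 pushes %rbp and %rbx, then does sub $0x28,%rsp. */
void check_phase_2_frame(void)
{
    assert(frame_bytes(2, 0x28) == 64);
    assert(rsp_is_aligned(2, 0x28));
}
```

With two pushes, any allocation that is 8 more than a multiple of 16 (8, 24, 40, ...) keeps the stack aligned, so 40 is a valid choice even though the array itself only needs 24 bytes.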

0x0000000000400f12 <+6>:     mov    %rsp,%rsi

      

This copies the stack pointer into %rsi. Since I'm assuming the SysV ABI, this address is the second argument to the function we're about to call. The contents of this space are uninitialized at this point. This is the scratch space used by read_six_numbers.



0x0000000000400f15 <+9>:     callq  0x4016e5 <read_six_numbers>

      

This calls the function read_six_numbers. Since our scratch space is the second argument (according to the SysV ABI), the first argument is whatever our caller placed in %rdi, which is passed on to read_six_numbers unchanged. If I had to guess, I would say this value answers your question, so we'd need to see the caller of phase_2 to understand more.
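
To make the two-argument call concrete, here is one plausible shape for such a function. The body of read_six_numbers is not shown in the question, so this is purely a guess: I am assuming %rdi holds a pointer to a line of text and that the function parses six integers out of it. The name read_six_numbers_sketch is mine.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for read_six_numbers; the real body is not shown.
   Under the SysV ABI, `input` would arrive in %rdi and `out` in %rsi.
   Returns the number of integers successfully parsed. */
int read_six_numbers_sketch(const char *input, int *out)
{
    assert(input != NULL && out != NULL);
    return sscanf(input, "%d %d %d %d %d %d",
                  &out[0], &out[1], &out[2], &out[3], &out[4], &out[5]);
}
```

If the real function looks anything like this, the first number in the array is simply the first number in the input text, which is why the caller of phase_2 matters.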

0x0000000000400f1a <+14>:    lea    0x4(%rsp),%rbx

      

read_six_numbers reads six 32-bit numbers, for a total of 24 bytes. The first number is at 0x0(%rsp), and lea gives us the address of its operand rather than the value stored there. So this computes a pointer to the second value in the array and puts it in %rbx.

0x0000000000400f1f <+19>:    lea    0x18(%rsp),%rbp

      

The first value in the array is at 0x0(%rsp), and the sixth is at 0x14(%rsp); 0x18(%rsp) is the first address past the end of our array.
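
The two lea instructions correspond to ordinary pointer arithmetic in C. The sketch below mirrors the register setup and counts how often the loop body would run; count_iterations, cur, and end are illustrative names, and it assumes 4-byte int as on x86-64.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the register setup: cur plays %rbx, end plays %rbp.
   Returns how many times the loop body would run. */
size_t count_iterations(const int *base)
{
    assert(base != NULL);

    const int *cur = base + 1;   /* lea 0x4(%rsp),%rbx  -> &base[1] */
    const int *end = base + 6;   /* lea 0x18(%rsp),%rbp -> one past base[5] */
    size_t n = 0;

    while (cur != end) {         /* cmp %rbp,%rbx ; jne */
        n++;
        cur++;                   /* add $0x4,%rbx */
    }
    return n;
}
```

For a six-element array this yields 5, matching the five comparisons between neighbouring elements.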

0x0000000000400f24 <+24>:    mov    -0x4(%rbx),%eax
0x0000000000400f27 <+27>:    add    $0x5,%eax
0x0000000000400f2a <+30>:    cmp    %eax,(%rbx)
0x0000000000400f2c <+32>:    je     0x400f33 <phase_2+39>
0x0000000000400f2e <+34>:    callq  0x4016c3 <explode_bomb>
0x0000000000400f33 <+39>:    add    $0x4,%rbx
0x0000000000400f37 <+43>:    cmp    %rbp,%rbx
0x0000000000400f3a <+46>:    jne    0x400f24 <phase_2+24>

      

User chqrlie has explained this loop well enough. If previous (-0x4(%rbx)) + 5 and current ((%rbx)) are equal, we continue the loop. Otherwise, we call explode_bomb. I would add that while chqrlie says it takes no arguments, there is no guarantee of that. We haven't set up %rdi or %rsi for this call, so whatever they still hold is available for it to use. To claim that explode_bomb takes no arguments, we'd need to see its disassembly; this context does not prove it.

However, in this context the actual values being compared are unknown. We're just iterating over memory here.

0x0000000000400f3c <+48>:    add    $0x28,%rsp
0x0000000000400f40 <+52>:    pop    %rbx
0x0000000000400f41 <+53>:    pop    %rbp

      

This restores the caller's context (remember, we saved the callee-saved registers at the beginning) and ...

0x0000000000400f42 <+54>:    retq

      

returns to the caller, at the instruction following its call to phase_2.

Perhaps there is something here you did not know yet. Otherwise, this is just a long explanation of what chqrlie has already said: the loop pointer starts 4 bytes past the base of the array filled by read_six_numbers, and the first comparison reads the element at the base.
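
Put differently, the loop shown never fixes the first number; it only ties each later number to the one before it. A small sketch of that, assuming nothing outside the shown code constrains the first value (make_passing_sequence is a made-up helper):

```c
#include <assert.h>
#include <stddef.h>

/* Fills out[0..5] with a sequence that satisfies the loop's check,
   starting from a caller-chosen first value. */
void make_passing_sequence(int first, int out[6])
{
    assert(out != NULL);

    out[0] = first;                  /* not checked by the loop itself */
    for (int i = 1; i < 6; i++)
        out[i] = out[i - 1] + 5;     /* what cmp %eax,(%rbx) expects */
}
```

Starting from 1 this gives 1 6 11 16 21 26, and starting from -3 it gives -3 2 7 12 17 22; both would pass the comparisons in the disassembly above.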
