Assembler on 64-bit iOS (A64)

Question


I am trying to replace certain methods with asm implementations. The target is arm64 on iOS (iPhone 5S or newer). I want to use a dedicated assembler file, as inline assembler comes with additional overhead and is quite cumbersome to use with A64 memory offsets.

There is not much documentation on the internet, so I am not sure whether I am doing this correctly. Therefore, I will describe the process I followed to move the function to ASM.

The candidate function for this question is a 256-bit integer comparison function.

UInt256.h

@import Foundation;

typedef struct {
    uint64_t value[4];
} UInt256;

bool eq256(const UInt256 *lhs, const UInt256 *rhs);

Bridging-Header.h

#import "UInt256.h"

Reference implementation (Swift)

let result = x.value.0 == y.value.0
          && x.value.1 == y.value.1
          && x.value.2 == y.value.2
          && x.value.3 == y.value.3

UInt256.s

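.globl _eq256
.align 2                            // power of two: 2^2 = 4-byte alignment
_eq256:
    ldp        x9, x10, [x0]        // lhs limbs 0 and 1
    ldp       x11, x12, [x1]        // rhs limbs 0 and 1
    cmp        x9, x11              // compare limb 0
    ccmp      x10, x12, 0, eq       // if still equal, compare limb 1; else force "ne"
    ldp        x9, x10, [x0, 16]    // lhs limbs 2 and 3
    ldp       x11, x12, [x1, 16]    // rhs limbs 2 and 3
    ccmp       x9, x11, 0, eq       // if still equal, compare limb 2
    ccmp      x10, x12, 0, eq       // if still equal, compare limb 3
    cset       x0, eq               // result: 1 if all limbs matched, else 0
    ret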

Resources I Found

Section 5.1.1 of the Procedure Call Standard for the ARM 64-bit Architecture (AArch64) explains the purpose of each register during procedure calls.
iOS-specific deviations.
iOS Assembler Directives.

Questions

I tested the code with XCTest, generating two random numbers, running Swift and Asm implementations on them, and checking that both report the same result. The code seems to be correct.
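The kind of randomized comparison test described above can be sketched in portable C. This is only an illustration under assumptions: eq256_reference, eq256_candidate, random_limb, count_mismatches and check_differential are hypothetical names, and eq256_candidate uses memcmp as a stand-in for the assembly routine so the sketch builds on any machine. One detail worth noting is that two random 256-bit values practically never compare equal, so the sketch also forces equal and almost-equal pairs to exercise the "equal" path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    uint64_t value[4];
} UInt256;

/* Reference: limb-by-limb comparison, mirroring the Swift expression. */
static bool eq256_reference(const UInt256 *lhs, const UInt256 *rhs) {
    return lhs->value[0] == rhs->value[0]
        && lhs->value[1] == rhs->value[1]
        && lhs->value[2] == rhs->value[2]
        && lhs->value[3] == rhs->value[3];
}

/* Hypothetical stand-in for the routine under test.
   On a device this would be the assembly eq256 instead. */
static bool eq256_candidate(const UInt256 *lhs, const UInt256 *rhs) {
    return memcmp(lhs->value, rhs->value, sizeof lhs->value) == 0;
}

/* rand() guarantees only 15 random bits, so combine several calls. */
static uint64_t random_limb(void) {
    uint64_t r = 0;
    for (int i = 0; i < 5; i++) {
        r = (r << 15) ^ (uint64_t)rand();
    }
    return r;
}

/* Returns how often the two implementations disagree. */
static unsigned count_mismatches(unsigned rounds, unsigned seed) {
    unsigned mismatches = 0;
    srand(seed);
    for (unsigned i = 0; i < rounds; i++) {
        UInt256 x, y;
        for (int k = 0; k < 4; k++) {
            x.value[k] = random_limb();
            y.value[k] = random_limb();
        }
        if (i % 3 == 1) {
            y = x;                    /* forced equal pair */
        } else if (i % 3 == 2) {
            y = x;
            y.value[i % 4] ^= 1;      /* differs in exactly one limb */
        }
        if (eq256_reference(&x, &y) != eq256_candidate(&x, &y)) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Usage: fail loudly if the implementations ever disagree. */
static void check_differential(void) {
    assert(count_mismatches(3000, 12345u) == 0);
}
```

Forcing a single differing limb in each position checks that every one of the four comparisons actually contributes to the result.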

1. In the asm file, .align seems to be for optimization. Is it really necessary, and if so, what is the correct value for the alignment?
2. Is there a source that clearly explains the calling convention for my particular function signature?
   a. How can I know that the inputs are actually passed in x0 and x1?
   b. How can I know that the correct way to return the output is in x0?
   c. How can I know that it is safe to clobber x9 through x12 and the status registers?
   d. Is the function called the same way when I call it from C instead of Swift?
3. What does "indirect result location register" mean in the description of register r8 in the ARM document?
4. Do I need any other assembler directives besides .globl?
5. When I set breakpoints, the debugger seems to get confused about where it really is, showing the wrong lines, etc. Am I doing something wrong?

assembly ios calling-convention arm64 swift

Etan June 19 15 at 21:19


1 answer

Ross Ridge · Accepted Answer · 2015-06-20T00:59:26+0000

1. The .align 2 directive is required for the program to be correct. A64 instructions must be aligned on 32-bit boundaries.
2. The documentation you linked seems clear to me, and unfortunately this is not the place to ask for recommendations.
   a. You can determine that lhs and rhs are passed in X0 and X1 by following the rules in section 5.4.2 (Parameter Passing Rules) of the Procedure Call Standard for the ARM 64-bit Architecture (AArch64) that you linked. Since both parameters are pointers, the only rule that applies is C.7.
   b. You can determine which register is used to return values by following the instructions in section 5.5 (Result Return). It just has you follow the same rules as for parameters. Since the function returns an integer, only rule C.7 applies, so the value is returned in X0.
   c. It is safe to change the values stored in registers X9 through X12 because they are listed as temporary registers in the table in section 5.1.1 (General-purpose Registers).
   d. The question is whether a function is called the same way in Swift as in C. Both the Procedure Call Standard document and the Apple-specific exceptions document are defined in terms of C and C++. Presumably Swift follows the same conventions, but I don't know whether Apple has made this explicit anywhere.
3. The purpose of R8 is described in section 5.5 (Result Return). It is used when the return value is too large to fit into the registers used to return values. In that case, the caller creates a buffer for the return value and places its address in R8. The function then copies the return value into this buffer.
4. I don't believe you need anything else in your example assembly program.
5. You have asked too many questions. You should post a separate and more detailed question describing your problem.
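To make the R8 case more concrete, here is a small C sketch. It assumes the AAPCS64 rule that a composite result larger than 16 bytes is returned in memory, while one of 16 bytes or less comes back in registers. The names not256, low_half, UInt128 and check_returns are hypothetical and exist only for this illustration. The C source looks the same in both cases; only the generated code differs.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t value[4];   /* 32 bytes: too large for the result registers */
} UInt256;

typedef struct {
    uint64_t lo, hi;     /* 16 bytes: fits in X0 and X1 */
} UInt128;

/* Returned by value. On arm64 the caller is expected to allocate
   the 32-byte result buffer and pass its address in X8; the
   function writes the result there. */
UInt256 not256(const UInt256 *x) {
    UInt256 result;
    for (int i = 0; i < 4; i++) {
        result.value[i] = ~x->value[i];
    }
    return result;
}

/* Small enough to be returned directly in X0 (lo) and X1 (hi),
   so X8 is not involved. */
UInt128 low_half(const UInt256 *x) {
    UInt128 r = { x->value[0], x->value[1] };
    return r;
}

/* Usage: the calling code is identical for both functions. */
static void check_returns(void) {
    UInt256 x = {{0, 1, 2, 3}};
    UInt256 n = not256(&x);
    UInt128 h = low_half(&x);
    assert(n.value[0] == UINT64_MAX);
    assert(h.lo == 0 && h.hi == 1);
}
```

Since eq256 returns a bool, none of this applies to it; the result simply goes in X0.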

I have to say that one of the benefits of writing this with inline assembly is that you don't have to worry about any of these calling-convention details. Something like the following untested C code shouldn't be too cumbersome:

bool eq256(const UInt256 *lhs, const UInt256 *rhs) {
     const __int128 *lv = (__int128 const *) lhs->value;
     const __int128 *rv = (__int128 const *) rhs->value;

     uint64_t l1, l2, r1, r2, ret;

     /* Outputs are early-clobber ("=&r") because they are written
        before all of the memory inputs have been read. */
     asm("ldp       %1, %2, %5\n\t"
         "ldp       %3, %4, %6\n\t"
         "cmp       %1, %3\n\t"
         "ccmp      %2, %4, 0, eq\n\t"
         "ldp       %1, %2, %7\n\t"
         "ldp       %3, %4, %8\n\t"
         "ccmp      %1, %3, 0, eq\n\t"
         "ccmp      %2, %4, 0, eq\n\t"
         "cset      %0, eq\n\t"
         : "=&r" (ret), "=&r" (l1), "=&r" (l2), "=&r" (r1), "=&r" (r2)
         : "Ump" (lv[0]), "Ump" (rv[0]), "Ump" (lv[1]), "Ump" (rv[1])
         : "cc");

     return ret;
}

Ok, maybe this is a little cumbersome.
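If neither a separate assembly file nor inline assembly seems worth the trouble, a plain C version is another possibility. The following is only a sketch: eq256_portable and check_eq256_portable are hypothetical names, and no claim is made about the instructions a particular compiler generates for it, so the output would need to be inspected before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint64_t value[4];
} UInt256;

/* XOR each pair of limbs and OR the results together.
   The accumulator stays zero only if every limb matched. */
bool eq256_portable(const UInt256 *lhs, const UInt256 *rhs) {
    uint64_t diff = 0;
    for (int i = 0; i < 4; i++) {
        diff |= lhs->value[i] ^ rhs->value[i];
    }
    return diff == 0;
}

/* Usage: equal values compare equal, a single flipped bit does not. */
static void check_eq256_portable(void) {
    UInt256 a = {{1, 2, 3, 4}};
    UInt256 b = a;
    assert(eq256_portable(&a, &b));
    b.value[2] ^= 1;
    assert(!eq256_portable(&a, &b));
}
```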
