How fair is a direct comparison of ARM and x86 CPU clock speeds?

Question

Suppose I had an ARM processor (1 GHz) and an x86 processor (1 GHz) and no other information about them. Is there a general statement that one of them would probably perform better when running some arbitrary (complex) application?

I know that the ARM architecture (RISC) was specifically designed to support certain applications, whereas the x86 architecture is more of a compilation of legacy compatibility and workarounds (no offense). Naturally, it would be important to know a little more about the application, that is, which specific tasks (and therefore instructions) are executed most often. Let's also set aside that there are different versions of these architectures.

So my question is: given these two unspecified 1 GHz processors, is it possible to make an educated guess about which one will perform better (i.e. will execute a general but complex application faster)?

On a second note, if this is not possible (which is what I assume), which common parameters would at least be necessary for a comparison, other than comparing the application's assembly code against the instruction sets of the two architectures?

² To keep it simple, let's assume both are 32-bit architectures with no AMD- or Intel-specific features.

+3

performance comparison x86 arm

Jim McAdams 11 jul. 15 at 21:42


3 answers

Below are comparisons from my 32-bit benchmarks compiled with GCC. More details are on my website:

http://www.roylongbottom.org.uk/

They show operations per Hz of processor clock, as a percentage, for Android systems (with native code on the Intel Atom), a Raspberry Pi 2 (Cortex-A7), and Intel / AMD computers running Ubuntu Linux.

First is the tiny-loop Whetstone benchmark, where Intel and ARM can be very similar.

Next is the Linpack benchmark, which depends on L2 cache speed and where later technologies show improvements.

Finally, my max MFLOPS benchmarks, compiled with SSE instructions for Intel and NEON for ARM, where Intel jumps ahead. A Core i7 using AVX instructions has demonstrated up to 1147 MFLOPS per MHz.

Results may vary depending on the compiler version.

                       Whetstone                 Linpack      Max
                        Float Functions  Integer    Float    Float

    Cortex-A9              22      1.7      124       17       95
    Cortex-A15             18      1.7      102       47      241
    Qualcomm 800           27      1.5      146       33
    Atom Z3745             30      1.7      182       22

    Cortex-A7              27      0.9      126       13       86

    Atom N455              19      0.7       63       12      110
    Athlon 64              28      1.6      113       42
    Phenom II              28      1.6      136       49      500
    Core 2 Duo             31      1.6      238       41      600
    Core i7 4820K          31      1.8      224       65      630
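
As a rough illustration of how to read these numbers, the sketch below converts two of the "Max Float" entries into an absolute rate at a common clock. It assumes that the table values are 100 × (MFLOPS / MHz), which is an interpretation of "percentage operations per Hz" and not something the answer states explicitly; the function name and the 1000 MHz clock are chosen only for the example.

```python
# Assumption: a table value p means p/100 operations per clock cycle,
# i.e. p = 100 * MFLOPS / MHz. This reading is not confirmed by the answer.

def score_at_clock(percent_per_mhz, clock_mhz):
    """Scale a per-clock percentage to MFLOPS at the given clock (MHz)."""
    return percent_per_mhz * clock_mhz / 100

# "Max Float" column values copied from the table above.
max_float = {"Cortex-A15": 241, "Core i7 4820K": 630}

CLOCK_MHZ = 1000  # the 1 GHz clock from the question
at_1ghz = {cpu: score_at_clock(p, CLOCK_MHZ) for cpu, p in max_float.items()}

for cpu, mflops in at_1ghz.items():
    print(f"{cpu:15s} ~{mflops:.0f} MFLOPS at {CLOCK_MHZ} MHz")

ratio = at_1ghz["Core i7 4820K"] / at_1ghz["Cortex-A15"]
print(f"per-clock ratio i7 / A15: {ratio:.2f}")
```

Under that assumption, two cores at the same clock can still differ by a factor of more than two on this one benchmark, while the Whetstone columns for the same pair are much closer.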

+5

Roy longbottom Jul 12 15 at 11:02


No, it's not fair. It is like comparing any two random vehicles on some performance metric (top speed, lap times, mileage, comfort, safety, etc.) simply because they both have the same number of wheels. Making predictions based only on wheel count and brand name (not model) is not a very interesting comparison.

0

old_timer Jul 12 15 at 12:36 am


Peter cordes · Accepted Answer · 2015-07-12T00:03:33+0000

Not really. An x86 clocked that low could be a low-power design like Atom (esp. pre-Silvermont) or an even more limited x86 design. Modern x86 desktop / laptop processors can execute a LOT per cycle (about 4 instructions per cycle when there are no mispredictions, data dependencies, or execution port contention). Google around if you want IPC (instructions per cycle) numbers for desktop processors running real code.
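
To make the role of IPC concrete, here is a minimal sketch of the usual back-of-the-envelope relation: run time = instruction count / (IPC × clock frequency). The instruction count and both IPC values are hypothetical inputs picked for illustration, not measurements of any real CPU. Note that the same source program also compiles to a different number of instructions on ARM and x86, so the instruction count is not constant across architectures either.

```python
def runtime_seconds(instructions, ipc, clock_hz):
    """Run time = instruction count / (instructions per cycle * cycles per second)."""
    return instructions / (ipc * clock_hz)

CLOCK_HZ = 1_000_000_000      # both CPUs run at 1 GHz, as in the question
INSTRUCTIONS = 4_000_000_000  # hypothetical dynamic instruction count

# Hypothetical sustained IPC values: a narrow in-order core vs. a wide
# out-of-order core that reaches its ~4-per-cycle peak.
narrow = runtime_seconds(INSTRUCTIONS, ipc=1.0, clock_hz=CLOCK_HZ)
wide = runtime_seconds(INSTRUCTIONS, ipc=4.0, clock_hz=CLOCK_HZ)

print(f"narrow core: {narrow:.1f} s, wide core: {wide:.1f} s")
```

With identical clocks, the run time here differs by the ratio of the two IPC values, which is why the clock speed alone says so little.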

One processor might be faster for some things but slower for others, depending on which items in this list (which I just made up) a task stresses:

vector integer and / or FP throughput
unpredictable branches (e.g. data compression)
working-set size (cache)
main memory bandwidth
cache bandwidth
large code footprint (instruction cache / fetch / branch-predictor table size)
AES / CRC32C / SHA1 (supported by hardware instructions on some processors)
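
The per-clock table in Roy Longbottom's answer on this page gives a small example of this. The sketch below takes the Cortex-A7 and Atom N455 rows and computes the ratio for each column; the column names are my reading of the table header, and the helper name is made up for the example.

```python
# Column names are an interpretation of the table header in the other answer.
COLUMNS = [
    "Whetstone Float",
    "Whetstone Functions",
    "Whetstone Integer",
    "Linpack Float",
    "Max Float",
]

# Per-clock values copied from that table.
cortex_a7 = [27, 0.9, 126, 13, 86]
atom_n455 = [19, 0.7, 63, 12, 110]

def per_workload_ratio(a, b):
    """Element-wise ratio a/b; a value above 1 means `a` does more per clock."""
    return [x / y for x, y in zip(a, b)]

ratios = per_workload_ratio(cortex_a7, atom_n455)

for name, r in zip(COLUMNS, ratios):
    leader = "Cortex-A7" if r > 1 else "Atom N455"
    print(f"{name:20s} A7/N455 = {r:.2f}  ({leader} ahead per clock)")
```

On these figures the Cortex-A7 is ahead per clock in the Whetstone and Linpack columns, while the Atom N455 is ahead in the SIMD-heavy Max Float column, so neither "wins" in general.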

How to really compare

Look at the results for the gcc benchmark from SPECint2000 or SPECint2006. The other tests that make up the SPECint suites are generally considered less useful these days. (The source is a discussion on the Realworldtech forum. And yes, the "Linus" agreeing there is Linus Torvalds of Linux fame.)

You can't game the gcc benchmark: it has quite a large cache footprint compared to the others, it stresses the branch predictor, etc. Since SPEC does not include browser or GUI benchmarks, gcc is probably the closest thing to actually measuring smartphone performance.
