C++: How to determine whether a function can be inlined, and whether it actually was?

I have a question about inline functions in C++. I know that similar questions have been asked many times. I hope mine is a little different.

I know that when you declare a function inline, it is just a "suggestion" to the compiler. So in a case like this:

inline int func1()
{
    return 2;
}

      

code like the following can have the call replaced by its result:

cout << func1() << endl; // replaced by cout << 2 << endl;

      

So, there is no mystery, but what about cases like this:

inline int func1()
{
    return 2;
}

inline int func2()
{
    return func1() * 2;
}

inline int func3()
{
    return func2() * func1() * 2;
}

      

Etc...

Which of these functions can be inlined, is it beneficial, and how can I check what the compiler actually did?



2 answers


Which of these functions has a chance of being inlined

Any and all of them have a chance of being inlined, provided the inlining tool (1) has access to the function definition (= body) ...

is it beneficial

... and deems it beneficial. Nowadays it is the optimizer's job to determine where inlining makes sense, and for 99.9% of programs the best thing the programmer can do is not to second-guess the optimizer. The few remaining cases are programs like Facebook's, where a 0.3% performance loss is a huge regression. In such cases, manually tweaking the optimizations (along with profiling, profiling, and profiling) is the way to go.

how to check what the compiler actually did



By checking the generated assembly. Each compiler has a flag to output the assembly in "human-readable" format instead of (or in addition to) the object files in binary form.
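As a minimal sketch of that workflow, assuming GCC or Clang: the question's three functions are put in one file together with a non-inline caller, so that the compiler has to emit code for at least one function. The file name inline_demo.cpp and the wrapper function compute are made up for illustration. The -S flag, which both compilers support, writes assembly instead of an object file.

```cpp
// inline_demo.cpp (hypothetical file name)
//
// Emit assembly instead of an object file, e.g.:
//   g++ -O2 -S inline_demo.cpp -o inline_demo.s
// then look for "call" instructions inside compute() in inline_demo.s.

inline int func1()
{
    return 2;
}

inline int func2()
{
    return func1() * 2;
}

inline int func3()
{
    return func2() * func1() * 2;
}

// Not declared inline and externally visible, so the compiler must emit
// code for it. If every call in the chain was inlined, its body contains
// no call to func1, func2 or func3.
int compute()
{
    return func3();
}
```

If the optimizer inlined the whole chain, compute() will typically be reduced to loading a constant. Without optimization (-O0) you should instead expect to see the calls in the assembly.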


(1) Typically this tool is the compiler, and inlining happens as part of the compilation step (turning source code into assembly / object files). This is also the only reason you might need the inline keyword to actually allow the compiler to inline: the function definition must be visible in the translation unit (= source file), which often means putting the function definition in a header file. Without inline, that would lead to multiple-definition errors if the header file were included in more than one translation unit.
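To illustrate the header case, here is a small sketch. The header name twice.h and the functions twice and use_twice are hypothetical; the header's contents are shown in the same block as the code that would normally include it.

```cpp
// ---- contents of a hypothetical header "twice.h" ----
#ifndef TWICE_H
#define TWICE_H

// The definition lives in the header so that every translation unit that
// includes it sees the body. The inline keyword permits this identical
// definition to appear in several translation units; without it, linking
// two such units would fail with a multiple-definition error.
inline int twice(int x)
{
    return 2 * x;
}

#endif // TWICE_H
// ---- end of "twice.h" ----

// ---- a source file that would #include "twice.h" ----
int use_twice()
{
    return twice(21);
}
```

Whether the call inside use_twice is actually inlined is still up to the optimizer; the keyword only makes the header-based definition legal.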

Note that compilation is not the only step at which inlining can happen. When you enable whole program optimization (also known as link-time code generation), another optimization pass runs at link time, once all object files have been created. At that point the inline keyword is completely irrelevant, because the linker has access to all function definitions (the binary would not link successfully otherwise). It is thus a way to get the most out of inlining without having to think about it at all when writing code. The disadvantage is time: WPO takes time to run and, for large projects, can lengthen link time to unacceptable levels (I have personally experienced a somewhat pathological case where link time went from 7 minutes to 46 minutes with WPO).



Think of inline as a compiler hint, a little like register in older versions of the C++ and C standards. Caveat: register was deprecated in C++11 and removed in C++17.

Which of these functions has a chance of being inlined, is it beneficial

Trust your compiler to make the right inlining decisions. To inline a specific call occurrence, the compiler needs to know the body of the called function. In theory, you should not care whether the compiler inlines or not.

In practice, with the GCC compiler:



  • inlining does not always improve performance (e.g. because of the CPU cache, TLB, branch predictor, etc.).

  • inlining decisions depend heavily on the optimization options. Inlining is more likely to happen with -O3 than with -O1; there are many guru options (like -finline-limit= and others) to tune it.

  • note that it is individual calls that get inlined or not. It is possible that one call occurrence, such as foo(x) on line 123, is inlined, while another call occurrence (to the same function foo), such as foo(y) elsewhere, e.g. on line 456, is not.

  • when debugging, you may want to disable inlining (because that makes debugging more convenient). This is possible with the -fno-inline GCC optimization flag (which I often use together with -g, which asks for debug information).

  • the always_inline function attribute "forces" inlining, and noinline prevents it.

  • if you compile and link with link-time optimization (LTO), such as -flto -O2 (or -flto -O3), e.g. with CXX=g++ -flto -O2 in your Makefile, inlining can happen across several translation units (e.g. C++ source files). However, LTO at least doubles the compilation time (and often worse), consumes memory during compilation (so it is better to have a lot of RAM), and often improves performance by only a few percent (with unusual exceptions to this rule of thumb).

  • you can optimize a particular function differently, e.g. with #pragma GCC optimize ("-O3") or with the optimize function attribute.

  • look also into profile-guided optimization, with instrumentation options such as -fprofile-generate and later optimization with -fprofile-use, together with other optimization flags.
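The always_inline and noinline attributes from the list above can be sketched as follows, assuming GCC (Clang accepts the same syntax). The function names are invented for the example.

```cpp
// GCC/Clang-specific attribute syntax; other compilers spell these differently.

// Asks the compiler to inline every call to this function,
// even when optimization is disabled.
__attribute__((always_inline)) inline int add_inlined(int a, int b)
{
    return a + b;
}

// Asks the compiler never to inline calls to this function.
__attribute__((noinline)) int add_not_inlined(int a, int b)
{
    return a + b;
}

// A caller with one call site of each kind, to compare in the assembly.
int sum_both(int a, int b)
{
    return add_inlined(a, b) + add_not_inlined(a, b);
}
```

Compiling this with g++ -O2 -S and reading the assembly for sum_both should show a call to add_not_inlined but none to add_inlined.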

If you are curious about which calls are inlined (and sometimes some of them won't be), look into the generated assembler (for example, use g++ -O2 -S -fverbose-asm and look into the .s assembler file), or use some of the internal dump options.

The observable behavior of your code (other than performance) should not depend on the inlining decisions made by your compiler. In other words, don't expect inlining to happen (or not). If your code behaves differently with and without optimization, it is probably buggy. So read about undefined behavior.

See also the MILEPOST GCC project (which uses machine learning techniques for optimization purposes).
