Cog VM and indirect variable access

Does anyone know whether the Cog VM for Pharo and Squeak can optimize simple indirect variable access through accessors like these:

SomeClass>>someProperty
    ^ someProperty
SomeClass>>someSecondProperty
    ^ someSecondProperty

      

which just return an instance variable, so methods like this:

SomeClass>>someMethod
    ^ self someProperty doWith: self someSecondProperty

      

will be no slower than methods like this:

SomeClass>>someMethod
    ^ someProperty doWith: someSecondProperty

      

I've done some tests and they seem to be roughly equivalent in speed, but I'm curious whether anyone familiar with Cog knows for sure, because if there is a difference (however small), there might be rare situations where using accessors is inappropriate.

+3


2 answers


There is a little overhead right now, but it is so small that you shouldn't worry about it. If you want performance, you are better off changing other parts of your code, not instance variable access.

Quick bench:

bench
    ^ { [ iv yourself ] bench. [ self iv yourself ] bench }

=> #('52,400,000 per second.' '49,800,000 per second.')

The difference doesn't look that big.
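To repeat this kind of measurement in your own image, something along the following lines could be used. This is only a sketch: it assumes the SomeClass from the question with its someProperty getter, and benchDirect and benchAccessor are hypothetical helper methods added just for the measurement. bench is the usual BlockClosure benchmarking message in Pharo and Squeak.

```smalltalk
"Hypothetical helper methods on SomeClass, added only for this measurement."
SomeClass>>benchDirect
    "Read the instance variable directly; yourself gives the block something to do with the value."
    ^ [ someProperty yourself ] bench

SomeClass>>benchAccessor
    "Same work, but the value is fetched through the getter send."
    ^ [ self someProperty yourself ] bench

"Then, in a Playground or Workspace:"
| obj |
obj := SomeClass new.
{ obj benchDirect. obj benchAccessor }
```

Results vary from run to run and between VM versions, so it is worth evaluating the last expression several times before reading anything into a small gap.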

Once the code has been JIT-compiled, the difference at run time is that "self iv" does an inline cache check, a CPU call, and a CPU return in addition to fetching the value of the instance variable. The call and return instructions are likely to be anticipated by the processor and not really executed. So it comes down to the inline cache check, which is a very cheap operation.

What the inlining compiler currently in development will add is that the CPU call and return will really be removed by inlining, which covers the cases where the CPU did not anticipate them. In addition, the inline cache check may or may not be removed, depending on the circumstances.



There are also details such as the getter method needing to be compiled to native code, which takes up space in the native code zone and can increase the number of garbage collections of that zone, but this is even more anecdotal than the inline cache check overhead.

So, in short, there is very little overhead right now, and it will decrease in the future.

Clement

+2


This is a difficult question... and I don't know the exact answer. But I can give you a few clues so that you can find out for yourself.

You will need to load the VMMaker package into an image. Pharo has a procedure for building such an image that simply downloads everything from the net and GitHub. See https://github.com/pharo-project/pharo-vm

The main clue is that methods that simply return an instance variable are compiled as if they executed primitive 264 + inst var offset ... (you can see this by inspecting Interval>>#first or any other simple inst var getter).
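One way to check this in an image is to ask the compiled method itself. The snippet below is a sketch to evaluate line by line in a Playground or Workspace; it assumes that CompiledMethod understands primitive, isQuick, isReturnField and returnField, as it does in the Squeak and Pharo images I know, and that Interval>>#first is still a plain getter in your version.

```smalltalk
| getter |
getter := Interval >> #first.
getter primitive.       "primitive index recorded for the method"
getter isQuick.         "whether the method is a quick return"
getter isReturnField.   "whether it just returns an instance variable"
getter returnField.     "zero-based offset of the returned inst var (primitive - 264)"
```

If the method turns out not to be a quick return in your image, returnField signals an error, so check isReturnField first.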

In the classic VM interpreter, this is handled in Interpreter>>internalExecuteNewMethod.
It seems that you pay the cost of the method lookup (some caches make it cheaper) but not the cost of actually activating the method.
I suppose this explains why the debugger cannot step into such simple methods... This, however, is not real inlining.

In Cog, the same thing happens in StackInterpreter>>internalQuickPrimitiveResponse when using the interpreter.



As far as the JIT is concerned, this is handled in Cogit>>compilePrimitive; see also the implementors of genQuickReturnInstVar. This is not real inlining either, but you can see that very few instructions are generated. Again, I'm pretty sure you don't pay the price of a lookup at all, thanks to what's called the polymorphic inline cache (PIC).

As for real inlining, I found no clue after this quick look at the source code...
My understanding is that it will happen on the image side via a callback from the Sista VM, but this is work in progress and I only have a vague memory of it. Clement Bera writes a blog about it (Sista chronicles at http://clementbera.wordpress.com )

If you're afraid of digging into the VMMaker source code, I invite you to ask on vm-dev.lists.squeakfoundation.org. I'm sure Eliot Miranda or Clement will be happy to give you a more accurate answer.

EDIT

I forgot to give you the conclusion of the above: I think there will be a very small difference if you use the inst var directly rather than a getter, but it shouldn't really be noticeable, and in any case your programming style should NOT be guided by such negligible optimizations.

+2

