Func performance vs custom delegate
I am working on some very performance-critical code and found that calling an anonymous method through a custom delegate type performs worse than calling the same code through a Func delegate.
using System;
using BenchmarkDotNet.Attributes;

public class DelegateTests
{
    // Custom delegate type with the same signature as Func<string, int>.
    public delegate int GetValueDelegate(string test);

    private Func<string, int> getValueFunc;
    private GetValueDelegate getValueDelegate;

    public DelegateTests()
    {
        // Both fields are assigned an identical lambda body.
        getValueDelegate = (s) => 42;
        getValueFunc = (s) => 42;
    }

    [Benchmark]
    public int CallWithDelegate()
    {
        return getValueDelegate.Invoke("TEST");
    }

    [Benchmark]
    public int CallWithFunc()
    {
        return getValueFunc.Invoke("TEST");
    }
}
BenchmarkDotNet gives:
// * Summary *
BenchmarkDotNet=v0.10.4, OS=Windows 10.0.14393
Processor=Intel Core i7-4770HQ CPU 2.20GHz (Haswell), ProcessorCount=2
Frequency=10000000 Hz, Resolution=100.0000 ns, Timer=UNKNOWN
[Host] : Clr 4.0.30319.42000, 32bit LegacyJIT-v4.6.1637.0
RyuJitX64 : Clr 4.0.30319.42000, 64bit RyuJIT-v4.6.1637.0
Job=RyuJitX64 Jit=RyuJit Platform=X64
Method | Mean | Error | StdDev |
----------------- |----------:|----------:|----------:|
CallWithDelegate | 0.9926 ns | 0.0559 ns | 0.0783 ns |
CallWithFunc | 0.8763 ns | 0.0168 ns | 0.0131 ns |
// * Hints *
Outliers
DelegateTests.CallWithFunc: RyuJitX64 -> 3 outliers were removed
// * Legends *
Mean : Arithmetic mean of all measurements
Error : Half of 99.9% confidence interval
StdDev : Standard deviation of all measurements
// ***** BenchmarkRunner: End *****
As we can see, calling a function through a Func delegate is faster than calling it through GetValueDelegate. I'm trying to find out why it behaves this way. Looking at the optimized JIT machine code:
26: return getValueDelegate.Invoke("TEST");
00E105C0 8B 49 08 mov ecx,dword ptr [ecx+8]
00E105C3 8B 15 C4 22 71 03 mov edx,dword ptr ds:[37122C4h]
00E105C9 8B 41 0C mov eax,dword ptr [ecx+0Ch]
00E105CC 8B 49 04 mov ecx,dword ptr [ecx+4]
00E105CF FF D0 call eax
00E105D1 C3 ret
compared with
32: return getValueFunc.Invoke("TEST");
00E10608 8B 49 04 mov ecx,dword ptr [ecx+4]
00E1060B 8B 15 C4 22 71 03 mov edx,dword ptr ds:[37122C4h]
00E10611 8B 41 0C mov eax,dword ptr [ecx+0Ch]
00E10614 8B 49 04 mov ecx,dword ptr [ecx+4]
00E10617 FF D0 call eax
00E10619 C3 ret
They look very similar. I am starting to think that there might be a difference in the Invoke method for the two delegates. They both derive from MulticastDelegate, which is a requirement for all delegates in the CLR. Why is one faster than the other?
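To check that assumption about the two delegate types, a minimal reflection sketch along these lines might help. It only inspects the types' metadata and does not explain the timing difference. The class name DelegateInspection and its helper methods are made up for illustration; GetValueDelegate is declared at top level here rather than nested in DelegateTests so that the snippet is self-contained.

```csharp
using System;
using System.Reflection;

// Same signature as the custom delegate in the benchmark class.
public delegate int GetValueDelegate(string test);

// Hypothetical helper class, not part of the benchmark above.
public static class DelegateInspection
{
    // True when the method has no IL body and is supplied by the runtime,
    // which is how the CLR provides Invoke on delegate types.
    public static bool IsRuntimeImplemented(MethodInfo method)
    {
        MethodImplAttributes flags = method.GetMethodImplementationFlags();
        return (flags & MethodImplAttributes.CodeTypeMask) == MethodImplAttributes.Runtime;
    }

    // Summarizes the base type and the kind of Invoke implementation.
    public static string Describe(Type delegateType)
    {
        MethodInfo invoke = delegateType.GetMethod("Invoke");
        return $"{delegateType.Name}: base={delegateType.BaseType.Name}, " +
               $"runtimeImplemented={IsRuntimeImplemented(invoke)}";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(typeof(GetValueDelegate)));
        Console.WriteLine(Describe(typeof(Func<string, int>)));
    }
}
```

If both lines report the same base type and a runtime-implemented Invoke, the two delegate types do not differ at the metadata level, and the difference would have to come from somewhere else.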
UPDATE
Here are the numbers using LegacyJitX86. Please note that I am simply interested in WHY there is a difference. By the way, changing the order of the variables does not affect the result.
// * Summary *
BenchmarkDotNet=v0.10.4, OS=Windows 10.0.14393
Processor=Intel Core i7-4770HQ CPU 2.20GHz (Haswell), ProcessorCount=2
Frequency=10000000 Hz, Resolution=100.0000 ns, Timer=UNKNOWN
[Host] : Clr 4.0.30319.42000, 32bit LegacyJIT-v4.6.1637.0
LegacyJitX86 : Clr 4.0.30319.42000, 32bit LegacyJIT-v4.6.1637.0
Job=LegacyJitX86 Jit=LegacyJit Platform=X86
Runtime=Clr
Method | Mean | Error | StdDev |
----------------- |----------:|----------:|----------:|
CallWithDelegate | 2.3385 ns | 0.0361 ns | 0.0320 ns |
CallWithFunc | 2.0144 ns | 0.0410 ns | 0.0384 ns |
// * Hints *
Outliers
DelegateTests.CallWithDelegate: LegacyJitX86 -> 1 outlier was removed
// * Legends *
Mean : Arithmetic mean of all measurements
Error : Half of 99.9% confidence interval
StdDev : Standard deviation of all measurements
// ***** BenchmarkRunner: End *****