What is making this GetHashCode implementation 20x slower than the .NET implementation?
I got the idea for a substring struct from this post and this one. The second post has a .NET String.GetHashCode() implementation. (I'm not sure which .NET version it is from.)
Here is the implementation. (GetHashCode comes from the second source above.)
public struct Substring
{
    private string String;
    private int Offset;

    public int Length { get; private set; }

    public char this[int index] { get { return String[Offset + index]; } }

    public Substring(string str, int offset, int len) : this()
    {
        String = str;
        Offset = offset;
        Length = len;
    }

    /// <summary>
    /// See http://www.dotnetperls.com/gethashcode
    /// </summary>
    /// <returns></returns>
    public unsafe override int GetHashCode()
    {
        fixed (char* str = String)
        {
            char* chPtr = str + Offset;
            int num = 352654597;
            int num2 = num;
            int* numPtr = (int*)chPtr;
            for (int i = Length; i > 0; i -= 4)
            {
                num = (((num << 5) + num) + (num >> 27)) ^ numPtr[0];
                if (i <= 2)
                {
                    break;
                }
                num2 = (((num2 << 5) + num2) + (num2 >> 27)) ^ numPtr[1];
                numPtr += 2;
            }
            return (num + (num2 * 1566083941));
        }
    }
}
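To show how the struct is meant to be used, here is a tiny example (the snippet itself is mine, not from the linked posts):

var line = "The quick brown fox";

// Wrap a slice of an existing string without allocating a new string.
var word = new Substring(line, 4, 5);    // the characters of "quick"

Console.WriteLine(word.Length);          // 5
Console.WriteLine(word[0]);              // 'q'
Console.WriteLine(word.GetHashCode());   // the custom hash being benchmarked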
Here's a unit test:
[Test]
public void GetHashCode_IsAsFastAsString()
{
    var s = "The quick brown fox";
    var sub = new Substring(s, 1, 5);
    var t = "quick";
    var sum = 0;
    sum += sub.GetHashCode(); // make sure GetHashCode is jitted

    var count = 100000000;

    var sw = Stopwatch.StartNew();
    for (var i = 0; i < count; ++i)
        sum += t.GetHashCode();
    var t1 = sw.Elapsed;

    sw = Stopwatch.StartNew();
    for (var i = 0; i < count; ++i)
        sum += sub.GetHashCode();
    var t2 = sw.Elapsed;

    Debug.WriteLine(sum.ToString()); // make sure we use the return value

    var m1 = t1.Milliseconds;
    var m2 = t2.Milliseconds;
    Assert.IsTrue(m2 <= m1); // fat chance
}
The problem is that m1 is 10 milliseconds and m2 is 190 milliseconds. (Note: those timings are from a run with 1,000,000 iterations.) FYI, I ran this on .NET 4.5, 64-bit, in a Release build with optimizations enabled.
1 answer
- Prompted by a comment, I double-checked that the optimized code was actually running. It turns out a hidden debugger option was turning optimizations off, so I unchecked "Tools" - "Options" - "Debugging" - "General" - "Suppress JIT optimization on module load (Managed only)". With that unchecked, the optimized code loads correctly. (A quick runtime sanity check for this is sketched just after this list.)
- Even with optimizations enabled, there is still a 3x to 6x difference. That may be because the code above is the 32-bit version of the .NET hash and I am running on the 64-bit .NET runtime. Porting the 64-bit implementation of string.GetHashCode to a substring is not easy, because it relies on the string's null terminator (which is arguably a bug); a sketch of that 64-bit hash follows the list as well.
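Here is the sanity check mentioned in the first point. It is only a sketch of my own (the names are mine): the DebuggableAttribute check catches an assembly that was compiled without optimizations, and the Debugger.IsAttached check guards against the module-load suppression that the debugger option applies.

using System;
using System.Diagnostics;
using System.Reflection;

static class JitCheck
{
    // True if the assembly was compiled with the JIT optimizer disabled
    // (typical of a Debug build). This does not by itself detect the
    // "Suppress JIT optimization on module load" debugger option, which
    // deoptimizes at load time while a debugger is attached.
    public static bool IsBuildUnoptimized(Assembly assembly)
    {
        var attr = (DebuggableAttribute)Attribute.GetCustomAttribute(
            assembly, typeof(DebuggableAttribute));
        return attr != null && attr.IsJITOptimizerDisabled;
    }

    // Coarse guard for benchmarks: fail fast if the build is unoptimized
    // or a debugger is attached (where the suppress option can kick in).
    public static void AssertOptimized(Assembly assembly)
    {
        if (IsBuildUnoptimized(assembly) || Debugger.IsAttached)
            throw new InvalidOperationException(
                "Benchmark is not running optimized code.");
    }
}

Calling JitCheck.AssertOptimized(typeof(Substring).Assembly) at the top of the test would likely have caught my original mistake.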
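And for the second point, this is roughly what the 64-bit framework hash looks like, rewritten as a free function over a whole string. I reconstructed it from memory of the .NET Framework reference source, so treat it as a sketch rather than the exact code. The loop stops at the string's embedded '\0' terminator instead of using Length, which is why it cannot simply be pointed at a slice in the middle of a larger string.

using System;

static class FrameworkHashSketch
{
    // Rough shape of the 64-bit, non-randomized String.GetHashCode,
    // expressed as a static helper over a whole string (a sketch, not
    // the framework's exact code).
    public static unsafe int GetHashCode64(string value)
    {
        fixed (char* src = value)
        {
            int hash1 = 5381;
            int hash2 = hash1;
            char* s = src;
            int c;
            // Walks until the trailing '\0' (CLR strings are null-terminated
            // internally) rather than counting down from Length.
            while ((c = s[0]) != 0)
            {
                hash1 = ((hash1 << 5) + hash1) ^ c;
                c = s[1];
                if (c == 0)
                    break;
                hash2 = ((hash2 << 5) + hash2) ^ c;
                s += 2;
            }
            return hash1 + (hash2 * 1566083941);
        }
    }
}

Pointing this at a pointer offset into a larger string would keep hashing until the end of the parent string rather than the end of the slice, so the Length-based 32-bit port above was the easier starting point.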
I'm still disappointed that I don't get performance parity, but it was a good use of my time to learn about some of the pitfalls of C# optimization.