Reflection Renders HashCode Unstable

In the following code, accessing the custom attributes of SomeClass causes the hash function of SomeAttribute to become unstable. What's happening?

static void Main(string[] args)
{
    typeof(SomeClass).GetCustomAttributes(false);//without this line, GetHashCode behaves as expected

    SomeAttribute tt = new SomeAttribute();
    Console.WriteLine(tt.GetHashCode());//Prints 1234567
    Console.WriteLine(tt.GetHashCode());//Prints 0
    Console.WriteLine(tt.GetHashCode());//Prints 0
}


[SomeAttribute(field2 = 1)]
class SomeClass
{
}

class SomeAttribute : System.Attribute
{
    uint field1=1234567;
    public uint field2;            
}

      

Update:

This has been reported to Microsoft as a bug: https://connect.microsoft.com/VisualStudio/feedback/details/3130763/attibute-gethashcode-unstable-if-reflection-has-been-used



2 answers


This one is really tricky. First, let's take a look at the source code of the Attribute.GetHashCode method:

public override int GetHashCode()
{
    Type type = GetType();

    FieldInfo[] fields = type.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);
    Object vThis = null;

    for (int i = 0; i < fields.Length; i++)
    {
        // Visibility check and consistency check are not necessary.
        Object fieldValue = ((RtFieldInfo)fields[i]).UnsafeGetValue(this);

        // The hashcode of an array ignores the contents of the array, so it can produce 
        // different hashcodes for arrays with the same contents.
        // Since we do deep comparisons of arrays in Equals(), this means Equals and GetHashCode will
        // be inconsistent for arrays. Therefore, we ignore hashes of arrays.
        if (fieldValue != null && !fieldValue.GetType().IsArray)
            vThis = fieldValue;

        if (vThis != null)
            break;
    }

    if (vThis != null)
        return vThis.GetHashCode();

    return type.GetHashCode();
}

      

In a nutshell, here is what it does:

  • Enumerates the fields of your attribute
  • Finds the first field that is not an array and does not have a null value
  • Returns the hash code of that field

We can draw two conclusions at this point:

  • Only one field is taken into account when calculating the hash code of the attribute
  • The algorithm relies heavily on the order of the fields returned by Type.GetFields (since it takes the first field that matches the conditions)
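
To make the selection rule concrete, here is a minimal sketch that mirrors the loop above using only public reflection (FieldInfo.GetValue instead of the internal UnsafeGetValue). The names AttributeHashSketch, FirstHashableField and Sample are hypothetical and exist only for this illustration; they are not part of the framework.

```csharp
using System;
using System.Reflection;

// Hypothetical demo type: only Count can be picked, whatever order GetFields uses.
class Sample
{
    public int[] Numbers = { 1, 2, 3 }; // skipped: array values are ignored
    public string Label = null;         // skipped: null values are ignored
    public int Count = 5;               // the only field that qualifies
}

static class AttributeHashSketch
{
    // Hypothetical helper: returns the field whose value Attribute.GetHashCode
    // would hash, or null if no field qualifies (the type's hash is used then).
    public static FieldInfo FirstHashableField(object instance)
    {
        Type type = instance.GetType();
        FieldInfo[] fields = type.GetFields(
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic);

        foreach (FieldInfo field in fields)
        {
            object value = field.GetValue(instance);

            // Same condition as the framework loop: first non-null, non-array value wins.
            if (value != null && !value.GetType().IsArray)
                return field;
        }

        return null;
    }
}
```

Calling AttributeHashSketch.FirstHashableField(new Sample()) returns the Count field, because the other two fields are filtered out. When several fields qualify, as in SomeAttribute, the result depends entirely on the order in which GetFields returns them.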

Next, we can see that the order of the fields returned by Type.GetFields changes between the two versions of the code:

typeof(SomeClass).GetCustomAttributes(false);//without this line, GetHashCode behaves as expected
SomeAttribute tt = new SomeAttribute();
Console.WriteLine(tt.GetHashCode());//Prints 1234567
Console.WriteLine(tt.GetHashCode());//Prints 0
Console.WriteLine(tt.GetHashCode());//Prints 0

foreach (var field in new SomeAttribute().GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
{
    Console.WriteLine(field.Name);
}

      

If the first line is left uncommented, the code prints:

field2

field1

If the line is commented out, the code displays:

field1

field2

So this confirms that something is changing the order of the fields, which in turn produces different results from GetHashCode.

The following is even more interesting:

typeof(SomeClass).GetCustomAttributes(false);//without this line, GetHashCode behaves as expected
SomeAttribute tt = new SomeAttribute();
foreach (var field in new SomeAttribute().GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
{
    Console.WriteLine(field.Name);
}

Console.WriteLine(tt.GetHashCode());//Prints 0
Console.WriteLine(tt.GetHashCode());//Prints 0
Console.WriteLine(tt.GetHashCode());//Prints 0

foreach (var field in new SomeAttribute().GetType().GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic))
{
    Console.WriteLine(field.Name);
}

      

This code displays:



field1

field2

0

0

0

field2

field1

The question remains: why does the order of the fields change after the first call to GetFields? I believe it has to do with an internal cache inside the Type instance.

We can inspect the cache by evaluating this expression in the QuickWatch window:

System.Runtime.InteropServices.GCHandle.InternalGet(((System.RuntimeType)typeof(SomeAttribute)).m_cache) as RuntimeType.RuntimeTypeCache

At the very beginning of execution, the cache is empty (obviously). Then we execute:

typeof(SomeClass).GetCustomAttributes(false)

      

After this line, if we inspect the cache, it contains a single field: field2. Now this is interesting. Why this field? Because it is the one set by the attribute applied to SomeClass:

[SomeAttribute(field2 = 1)]

Then we execute the first GetHashCode and inspect the cache again: it now contains field2, then field1 (remember that the order matters). Subsequent calls to GetHashCode return 0 because of this field order.

Now, if we remove the line typeof(SomeClass).GetCustomAttributes(false) and inspect the cache after the first GetHashCode, we find field1, then field2.


Summarizing:

The Attribute hash code algorithm uses the value of the first suitable field it finds. It therefore depends heavily on the order of the fields returned by Type.GetFields. That method uses an internal cache for performance reasons.

There are two scenarios:

  • The scenario where you don't call typeof(SomeClass).GetCustomAttributes(false);

    Here, when GetFields is called, the cache is empty. It is populated with the attribute's fields in the order field1, field2. GetHashCode then finds field1 as the first field and returns 1234567.

  • The scenario where you do call typeof(SomeClass).GetCustomAttributes(false);

    Executing this line runs the attribute constructor for [SomeAttribute(field2 = 1)]. At this point, the metadata for field2 is put into the cache. Then you call GetHashCode, and the cache is completed. field2 is already there, so it isn't added again; field1 is added after it. The order in the cache is therefore field2, field1. GetHashCode then finds field2 as the first field and returns 0.

The only remaining surprise is: why does the first call to GetHashCode behave differently from the following ones? I haven't checked, but I believe it detects that the cache is incomplete and reads the fields in a different way. For subsequent calls, the cache is complete and the behavior is consistent.

To be honest, I believe this is a bug. The result of GetHashCode should be consistent over time. The implementation of Attribute.GetHashCode should therefore not rely on the order of the fields returned by Type.GetFields since, as we have seen, that order can change. This should be reported to Microsoft.



Great analysis by Kevin on this one. I think the framework implementation should use all fields and the attribute type to compute the hash code, and obviously produce the same hash code every time. In the meantime, here are two workarounds. I'm no expert at computing or combining hash codes, so I rely on the one provided for tuples.

class SomeAttribute : System.Attribute
{
    uint field1 = 1234567;
    public uint field2;

    public override int GetHashCode()
    {
        return (GetType(), field1, field2).GetHashCode();
    }
}
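
The tuple literal above needs C# 7.0 and the System.ValueTuple types. On an older toolchain the same idea can be sketched by combining the hashes by hand. This is only an illustration: the multiplier 397 is an arbitrary odd constant commonly used for this purpose, not something taken from the framework.

```csharp
using System;

class SomeAttribute : Attribute
{
    uint field1 = 1234567;
    public uint field2;

    public override int GetHashCode()
    {
        // unchecked: let the arithmetic wrap around instead of throwing on overflow.
        unchecked
        {
            // Every field and the type contribute, so field order in the
            // reflection cache no longer affects the result.
            int hash = GetType().GetHashCode();
            hash = (hash * 397) ^ field1.GetHashCode();
            hash = (hash * 397) ^ field2.GetHashCode();
            return hash;
        }
    }
}
```

Because the fields are read directly rather than through Type.GetFields, repeated calls on the same instance return the same value, and two instances with identical field values hash alike, which stays consistent with the field-by-field comparison in Attribute.Equals.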

      



Another solution, if you want each instance to be unique (for use in a dictionary), is to return the GetHashCode of a per-instance object.

class SomeAttribute : System.Attribute
{
    private object FixHashCodeBug = new Object();

    public override int GetHashCode()
    {
        return FixHashCodeBug.GetHashCode();
    }
}
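
A possible variant of this per-instance approach avoids the extra field by using RuntimeHelpers.GetHashCode, which returns an identity-based hash for any object. The sketch below assumes reference identity is really what you want, and it overrides Equals to match so that the two methods stay consistent with each other.

```csharp
using System;
using System.Runtime.CompilerServices;

class SomeAttribute : Attribute
{
    public uint field2;

    // Two attributes are equal only if they are the same instance.
    public override bool Equals(object obj)
    {
        return ReferenceEquals(this, obj);
    }

    // Identity-based hash: independent of field values and field order.
    public override int GetHashCode()
    {
        return RuntimeHelpers.GetHashCode(this);
    }
}
```

With this variant, two attributes that carry the same field values are treated as different dictionary keys, so it only fits cases where that is the intended behavior.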

      
