How exactly do hashlib hashers view their input?

The Python 2.7 documentation says about hashlib hashes:

hash.update(arg)

    Update the hash object with the string arg. [...]

      

But I've seen people feed it objects that are not strings, for example buffers and numpy ndarrays.

Given Python's duck typing, I'm not surprised that you can pass non-string arguments.

The question is: how do you know if a hasher is doing the right thing with an argument?

I can't imagine a hasher naively doing a shallow iteration over the argument, because that would probably fail for ndarrays with more than one dimension: a shallow iteration over such an array yields ndarrays with n-1 dimensions.



1 answer


update unpacks its argument using the s# format spec. This means the argument can be a string, a Unicode object, or any object that implements the buffer interface.

You can't define a buffer interface in pure Python, but C libraries like numpy can and do, which allows their objects to be passed to hash.update.
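A minimal sketch of this behavior, using only standard-library types so that it runs without numpy. It is written for Python 3, where update accepts any bytes-like object (the question is about 2.7, so treat this as an illustration rather than an exact reproduction). The helper name sha256_of is hypothetical.

```python
import hashlib
from array import array

payload = b"hello world"


def sha256_of(obj):
    """Hash any object that exposes its raw bytes through the buffer interface."""
    h = hashlib.sha256()
    h.update(obj)
    return h.hexdigest()


# Four different types wrapping the same underlying bytes.
digests = {
    "bytes": sha256_of(payload),
    "bytearray": sha256_of(bytearray(payload)),
    "memoryview": sha256_of(memoryview(payload)),
    "array": sha256_of(array("B", payload)),
}
all_equal = len(set(digests.values())) == 1

# A list has no buffer interface, so it is rejected instead of being iterated.
try:
    sha256_of([104, 101, 108])
    list_rejected = False
except TypeError:
    list_rejected = True

print(all_equal, list_rejected)
```

The hasher never iterates over its argument: it either gets a block of raw bytes through the buffer interface or raises TypeError.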



Things like multidimensional arrays work fine: at the C level, they are stored as a contiguous series of bytes.
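To illustrate the point without depending on numpy, here is a rough sketch that stands in for a 2x3 array with a flat, row-major array.array of doubles. The matrix values and variable names are made up for the example, and it assumes Python 3.

```python
import hashlib
from array import array

# A hypothetical 2x3 matrix of doubles, stored row-major as one flat buffer.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
flat = array("d", [value for row in rows for value in row])

# Hash the whole buffer in a single call.
whole = hashlib.sha256(flat).hexdigest()

# Hash the same data one row at a time, in storage order.
h = hashlib.sha256()
for row in rows:
    h.update(array("d", row))
by_rows = h.hexdigest()

same_digest = whole == by_rows
print(same_digest)
```

Because only the raw bytes are hashed, the shape is not part of the digest: two arrays with different dimensions but identical underlying bytes produce the same hash.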
