Hamming distance between two binary strings not working

I found an interesting algorithm for calculating the Hamming distance on this site:

def hamming2(x,y):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x) == len(y)
    count,z = 0,x^y
    while z:
        count += 1
        z &= z-1 # magic!
    return count

      

The thing is, this algorithm only works on integers, because it relies on the bitwise operators ^ and &. I am trying to compare two binary values that are stored as strings, for example:

'100010'
'101000'

      

How can I get them to work with this algorithm?

+3




4 answers


Implement it:

def hamming2(s1, s2):
    """Calculate the Hamming distance between two bit strings"""
    assert len(s1) == len(s2)
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

      



And test it:

assert hamming2("1010", "1111") == 2
assert hamming2("1111", "0000") == 4
assert hamming2("1111", "1111") == 0
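One caveat worth noting: `assert` statements are skipped when Python runs with `-O`, and `zip` stops at the end of the shorter input, so a length mismatch could then pass unnoticed. A minimal sketch of a variant that raises an exception instead (the name `hamming_checked` is hypothetical, not part of the answer above):

```python
def hamming_checked(s1, s2):
    """Count the positions at which two equal-length strings differ."""
    if len(s1) != len(s2):
        # Explicit check: unlike assert, this still runs under `python -O`.
        raise ValueError("inputs must have the same length")
    # True counts as 1 when summed, so this adds up the mismatches.
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))


# The two strings from the question differ in two positions.
print(hamming_checked("100010", "101000"))
```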

      

+14




If we stick with the original algorithm, we need to convert strings to integers in order to be able to use bitwise operators.

def hamming2(x_str, y_str):
    """Calculate the Hamming distance between two bit strings"""
    assert len(x_str) == len(y_str)
    x, y = int(x_str, 2), int(y_str, 2)  # '2' specifies we are reading a binary number
    count, z = 0, x ^ y
    while z:
        count += 1
        z &= z - 1  # magic!
    return count

      

Then we can call it like this:



print(hamming2('100010', '101000'))
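To see what the "magic" line does, here is a small sketch that records `z` before each pass of the loop. Subtracting 1 flips the lowest set bit and every zero below it, so `z & (z - 1)` clears exactly one set bit per iteration, and the number of iterations equals the number of differing bits. The helper name `clear_lowest_bits` is made up for illustration:

```python
def clear_lowest_bits(z):
    """Return the binary form of z before each `z &= z - 1` step."""
    steps = []
    while z:
        steps.append(format(z, "b"))
        # z - 1 turns the lowest set bit into 0 and the zeros below it into 1,
        # so the AND keeps everything except that lowest set bit.
        z &= z - 1
    return steps


# XOR of the two values from the question marks the differing bit positions.
z = int("100010", 2) ^ int("101000", 2)
steps = clear_lowest_bits(z)
print(steps)       # one entry per set bit in z
print(len(steps))  # same value the loop-based hamming2 returns
```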

      

While this algorithm is a neat novelty, converting the strings to integers first probably negates any speed advantage it might have. The answer @dlask posted is much more concise.
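If you do want to go through integers, the bit-counting loop can also be replaced by counting the "1" characters in the binary representation of the XOR. This is only a sketch, and I have not measured whether it is faster; the name `hamming_popcount` is hypothetical. On Python 3.10 and later, `int.bit_count()` could be used in place of `bin(...).count("1")`.

```python
def hamming_popcount(x_str, y_str):
    """Hamming distance of two equal-length binary strings via XOR."""
    if len(x_str) != len(y_str):
        raise ValueError("inputs must have the same length")
    # XOR sets a bit wherever the inputs differ; bin() yields e.g. '0b1010',
    # and the '0b' prefix contains no '1', so count("1") counts the set bits.
    return bin(int(x_str, 2) ^ int(y_str, 2)).count("1")


print(hamming_popcount("100010", "101000"))
```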

+5




This is what I use to calculate the Hamming distance.
It counts the number of differences between two strings of equal length.

def hamdist(str1, str2):
    diffs = 0
    for ch1, ch2 in zip(str1, str2):
        if ch1 != ch2:
            diffs += 1
    return diffs

      

+3




I think this explains the Hamming distance well:

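def hammingDist(s1, s2):
    bytesS1 = bytes(s1, encoding="ascii")
    bytesS2 = bytes(s2, encoding="ascii")
    diff = 0
    # Only the overlapping prefix is compared if the lengths differ
    for i in range(min(len(bytesS1), len(bytesS2))):
        # XOR of two bytes is nonzero exactly when the bytes differ
        if bytesS1[i] ^ bytesS2[i] != 0:
            diff += 1
    return diff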

      

0

