How does XOR really work, and what is its magic?
The question is maybe a bit misleading, but I didn't know how to ask it differently. There is a problem in hackerrank that goes like this:
Consider an array of integers where all but one of the integers occur in pairs. In other words, each element is entered exactly twice except for one unique element.
Given an array, find and print the unique element. [2,3,1,3,2] -> result is 1
I solved the problem as follows:
private static int lonelyInteger(int[] a) {
if(a==null)
return 0;
if(a.length<2)
return a.length;
Set<Integer> set = new HashSet<>();
for(int i : a){
if(set.contains(i)){
set.remove(i);
}else{
set.add(i);
}
}
return (Integer) set.toArray()[0];
}
However, it turned out that there is a clear solution to this problem:
private static int lonelyInteger(int[] a) {
int b = a[0];
for(int i = 1; i < a.length; i++){
b ^= a[i];
}
return b;
}
The problem is I donβt know WHY DOES IT WORK ?! I understand HOW it works, but I don't understand WHY IT WORKS? To understand that I made a small program to output the results of each step:
public class BitwiseOperator {
public static void main(String[] args) {
Scanner in = new Scanner(System.in);
int n = in.nextInt();
int[] a = new int[n];
int sum = 0;
in.nextLine();
String line = in.nextLine();
String[] numbers = line.split(" ");
for (int i = 0; i < numbers.length; i++) {
a[i] = Integer.valueOf(numbers[i]);
}
for (int i = 0; i < a.length; i++) {
binary(sum, a[i]);
sum ^= a[i];
binary(sum);
System.out.println();
System.out.println();
System.out.println();
}
System.out.println(sum);
}
private static void binary(int sum) {
System.out.println(String.format("%16s", Integer.toBinaryString(sum)).replace(' ', '0') + " ->" + sum);
}
private static void binary(int sum, int i) {
System.out.println(String.format("%16s", Integer.toBinaryString(sum)).replace(' ', '0') + " ->" + sum);
System.out.println("XOR");
System.out.println(String.format("%16s", Integer.toBinaryString(i)).replace(' ', '0') + " ->" + i);
System.out.println("--------");
}
}
If you enter the following input:
five
2 3 2 1 3
Output:
0000000000000000 ->0
XOR
0000000000000010 ->2
--------
0000000000000010 ->2
0000000000000010 ->2
XOR
0000000000000011 ->3
--------
0000000000000001 ->1
0000000000000001 ->1
XOR
0000000000000010 ->2
--------
0000000000000011 ->3
0000000000000011 ->3
XOR
0000000000000001 ->1
--------
0000000000000010 ->2
0000000000000010 ->2
XOR
0000000000000011 ->3
--------
0000000000000001 ->1
1
So the program actually works, BUT I really need to understand WHY?
source to share
The exact proof IMHO includes group theory (you can build an abelian group based on xor
):
-
xor
- group operation -
0
- this group0
-
A
is inverse (so anyA
is inverse to itself).
Of course we need to prove that (A xor B) xor C == A xor (B xor C)
Since A xor B == B xor A
we have an abelian group and therefore we can rearrange elements in any order:
A XOR B XOR C XOR A XOR B ==
(A XOR A) XOR (B XOR B) XOR C ==
C
In general:
A xor B xor ... xor Z ==
(A xor A) xor (B xor B) xor ... xor (distinct Item) ==
0 xor 0 xor ... xor (distinct Item) ==
distinct Item
source to share
Consider an array of integers where all but one of the integers fall into pairs. In other words, each element in it happens exactly twice, except for one unique element. Now imagine adding all these numbers.
What does it mean if this amount is even? This means that the number that appears once must be even. What does it mean if this amount is odd? This means that the number that appears once must be odd. Think about it until you figure it out.
Now imagine that instead of summing them up, we just tracked if the sum was even or odd. So if the first two numbers were 3 and 7, the sum would be 10, but we just remember that it was even. It will still work. The final answer will be even if the number that appears once is even and odd, if the number that appears once is odd. Think about it until you figure it out.
So how can we make it work for one bit of numbers. But we could also do this for all bits at the same time, keeping track of each bit position, whether the total was odd or even. When this is done, we will have whether the number that appeared once was odd or even for each bit position, and that's all we need. Since we are doing this in binary, the only odd number for this position is 1, and only one is 0. Think about it until you understand it.
source to share