Array with unique values: indexOf vs new Set
Suppose we have an array with repeating elements like:
var items = ['a', 'd', 'e', 'b', 'c', 'd', 'e', 'f', 'g', 'd', 'f', 'g', 'd', 'j', 'k', 'l', 'd', 'e', 'c', 'd', 'e', 'f', 'g', 'd','c', 'd', 'e', 'f', 'g', 'd'];
What a quick way to have a new array with values, given that you have to navigate through the elements (may not make sense, but this scenario).
Option 1:
var list = [];
items.forEach(function(item) {
if(list.indexOf(item) == -1)
list.push(item);
});
Option 2:
var list = [];
items.forEach(function(item) {
list.push(item);
});
list = Array.from(new Set(list));
Now I have done some testing with console.time and it shows that option 2 is 5x faster than option 1. But I'm not sure how trusted this console.time is. Any insight? Is indexOf
that what makes option 1 slower?
Fiddle: https://jsfiddle.net/q9opqvsm/
Edit: Another question: if option 2 is faster, I should change my code from option 1 => option 2. If not, why not?
source to share
"Is indexOf what makes option 1 slower?"
Yes. indexOf is O (n) meaning you have a loop inside your loop giving you a total O (n2). indexOf
is equivalent to doing something like this:
function indexOf(item, array) {
for (var i=0; i < array.length; i++) {
if (array[i] === item) {
return true;
}
}
return false;
}
As you can see, in the worst case (the element is not in the array yet) iterates over the entire array. Not. If you are looking for an array for a value, you must look at each element until you find it, or you run out of elements.
In option 2, finding a value in a set is O (1) and Array.from O (n), so you are O (n) in general.
Creating a set is somewhat equivalent to doing something like this (not actually it doesn't create a set, but an object, so it's not exactly the same):
function makeSet(array) {
var set = {};
for (var i=0; i < array.length; i++) {
if (set[array[i]] === undefined) { // indexing `set` is O(1)
set[array[i]] = true;
}
}
}
So that O(n)
's the whole thing . And creating an array from that is just a case of iterating over a set and loading it into an array, which is also O(n)
. So he is O(n)
in general.
Another question: if option 1 is faster, should I change my code from option 1 => option 2. If not, why not?
Option 1 is not faster, but if we pretend then the answer is that it depends. Option 1 will certainly not scale like Option 2, but that doesn't mean Option 1 might not be faster for small enough arrays (although I doubt it). In any case, this is a premature optimization. If your code is slow and you go through your code and identify this part as a bottleneck, then you should worry about it.
Edit:
Little typo, I meant if option 2 is faster. No bottlenecks
The same arguments apply for premature optimization. But personally, I would probably change that. This seems like a fairly low impact and, if possible, has a clearer intent with option 2.
Although - consider browser support for Set
. This is relatively new and not supported by older browsers. See here .
source to share