Array with unique values: indexOf vs new Set

Suppose we have an array with repeating elements like:

var items = ['a', 'd', 'e', 'b', 'c', 'd', 'e', 'f', 'g', 'd', 'f', 'g', 'd', 'j', 'k', 'l', 'd', 'e', 'c', 'd', 'e', 'f', 'g', 'd','c', 'd', 'e', 'f', 'g', 'd'];

      

What a quick way to have a new array with values, given that you have to navigate through the elements (may not make sense, but this scenario).

Option 1:

var list = [];
items.forEach(function(item) {
    if(list.indexOf(item) == -1)
      list.push(item);
});

      

Option 2:

var list = [];
items.forEach(function(item) {
    list.push(item);
});

list = Array.from(new Set(list));

      

Now I have done some testing with console.time and it shows that option 2 is 5x faster than option 1. But I'm not sure how trusted this console.time is. Any insight? Is indexOf

that what makes option 1 slower?

Fiddle: https://jsfiddle.net/q9opqvsm/

Edit: Another question: if option 2 is faster, I should change my code from option 1 => option 2. If not, why not?

+3


source to share


1 answer


"Is indexOf what makes option 1 slower?"

Yes. indexOf is O (n) meaning you have a loop inside your loop giving you a total O (n2). indexOf

is equivalent to doing something like this:

function indexOf(item, array) {
    for (var i=0; i < array.length; i++) {
        if (array[i] === item) {
            return true;
        }
    }
    return false;
}

      

As you can see, in the worst case (the element is not in the array yet) iterates over the entire array. Not. If you are looking for an array for a value, you must look at each element until you find it, or you run out of elements.

In option 2, finding a value in a set is O (1) and Array.from O (n), so you are O (n) in general.

Creating a set is somewhat equivalent to doing something like this (not actually it doesn't create a set, but an object, so it's not exactly the same):

function makeSet(array) {
    var set = {};
    for (var i=0; i < array.length; i++) {
        if (set[array[i]] === undefined) {   // indexing `set` is O(1)
            set[array[i]] = true;
        }
    }
}

      



So that O(n)

's the whole thing . And creating an array from that is just a case of iterating over a set and loading it into an array, which is also O(n)

. So he is O(n)

in general.

Another question: if option 1 is faster, should I change my code from option 1 => option 2. If not, why not?

Option 1 is not faster, but if we pretend then the answer is that it depends. Option 1 will certainly not scale like Option 2, but that doesn't mean Option 1 might not be faster for small enough arrays (although I doubt it). In any case, this is a premature optimization. If your code is slow and you go through your code and identify this part as a bottleneck, then you should worry about it.

Edit:

Little typo, I meant if option 2 is faster. No bottlenecks

The same arguments apply for premature optimization. But personally, I would probably change that. This seems like a fairly low impact and, if possible, has a clearer intent with option 2.

Although - consider browser support for Set

. This is relatively new and not supported by older browsers. See here .

+2


source







All Articles