ConcurrentHashMap.keySet () problem. RemoveAll ()

I am reading about ConcurrentHashMap

and checking the removeAll () implementation of my keyset. In the current implementation, JAVA iterates over the entire keyset data structure, even if the collection contains only one or no elements.

Actual implementation

public final boolean removeAll(Collection<?> c) { 
        if (c == null) throw new NullPointerException(); 
        boolean modified = false; 
        for (Iterator<E> it = iterator(); it.hasNext();) { 
            if (c.contains(it.next())) { 
                it.remove(); 
                modified = true; 
            } 
        } 
        return modified; 
    } 

      

Can someone tell me if this is intended by the JAVA developers or I am just thinking about it

+3


source to share


3 answers


Posting the expected implementation , but to be very frank, I am not the actual author of the code below, found this on a coding blog while reading about parallel hashmap so would like to share with everyone else.



public boolean removeAll(Collection<?> c) { 
        boolean modified = false; 

        if (size() > c.size()) { 
            for (Iterator<?> i = c.iterator(); i.hasNext(); ) 
                modified |= remove(i.next()); 
        } else { 
            for (Iterator<?> i = iterator(); i.hasNext(); ) { 
                if (c.contains(i.next())) { 
                    i.remove(); 
                    modified = true; 
                } 
            } 
        } 
        return modified; 
    }

      

+2


source


KeySet

in regular Map

(eg HashMap

) implements AbstractSet

. For OpenJDK 8, the source code shows this as a method removeAll

:

public boolean removeAll(Collection<?> c) {
    Objects.requireNonNull(c);
    boolean modified = false;

    if (size() > c.size()) {
        for (Iterator<?> i = c.iterator(); i.hasNext(); )
            modified |= remove(i.next());
    } else {
        for (Iterator<?> i = iterator(); i.hasNext(); ) {
            if (c.contains(i.next())) {
                i.remove();
                modified = true;
            }
        }
    }
    return modified;
}

      

As you can see, there is a check that has the most entries in the collection. If itself Set

is larger, then iterate over the given collection argument. Otherwise, it is done by elements itself Set

. Therefore, if there are no entries in the collection you are traversing, the actual loop is executed zero. The same if he himself Set

has no records.

However, for a ConcurrentHashMap

method it keySet()

returns an instance of an inner class KeySetView

that implements another inner class CollectionView

. The implementation removeAll

of what gets done is in line with the code you posted, which always iterates over the elements KeySetView

, not the given collections.



The reason for this is the likelihood that what Iterator

is returned by the views (keyset or recordset) allows concurrent access, reflecting the values ​​that are present at the time of the iterator query. From the Javadoc ConcurrentHashMap

:

Likewise, iterators and enumerations return items that reflect the state of the hash table at some point in time or after the iterator / enumeration was created. They don't throw ConcurrentModificationException. However, iterators are only meant to be used one thread at a time.

Thus, the method forces the use of the key view Iterator

itself to take care of consistency between concurrent activities on the view or map.

Note, however, that the above implementation for is AbstractSet

also not optimal. If the collection c

you supply as an argument is larger than the set, it c.contains(element)

is called for each element

in the set, but depending on the type of the collection, the method contains

may not be as efficient, For example, for ArrayList

contains

is executed in linear time, whereas having an object in the set will be detected at a constant time.

+2


source


Collection versus set

The implementation in question belongs to the class AbstractCollection

, and the implementation in the solution is from AbstractSet

, which has the form AbstractCollection

.

The performance gain is due to the fact that Collection

you cannot guarantee the uniqueness of the elements, so the call is size()

not enough to optimize the deletion. In Set

, however, uniqueness is guaranteed, so additional assumptions and performance improvements can be made, such as iterating with a smaller set when determining which elements to remove.

Highly specialized card and its sets

The story, however, is completely different in the sets and values ConcurrentHashMap

. This is because it ConcurrentHashMap

is highly specialized Map

with many assumptions and improvements for concurrency. This makes any general performance improvements such as in Set.removeAll()

, invalidated anymore (in light of the actual implementation).

ConcurrentHashMap

The iterator is loosely consistent and does a lot of magic behind the scenes. Possibly removeAll

on a keychain is the price paid for all other performance metrics (in terms of a concurrent access point).

Just follow iterator().remove()

on CocnurrentHashMap

down the rabbit hole ConcurrentHashMap.replaceNode()

to see how much logic lingers in the replaceNode () source code to place items from the iterator.

+2


source







All Articles