Java 8 ConcurrentHashMap merge vs computeIfAbsent
I'm working on exercises from the book "Java SE 8 for the Really Impatient" by Cay S. Horstmann. Two exercises ask for different implementations of the same algorithm: one uses merge, the other computeIfAbsent. I have implemented the program using merge but cannot figure out how to use computeIfAbsent to do the same. It seems to me that computeIfPresent would be a better fit, because merge only works if the key is present, as does computeIfPresent.
Description of the problem:
Write an application in which multiple threads read all words from a collection of files. Use a ConcurrentHashMap<String, Set<File>> to track in which files each word occurs. Use the merge method to update the map.
My code using merge:
public static Map<String, Set<File>> reverseIndexUsingMerge(final Path path)
        throws IOException {
    final ConcurrentHashMap<String, Set<File>> map = new ConcurrentHashMap<>();
    final BiConsumer<? super String, ? super Set<File>> action =
        (key, value) -> map.merge(key, value, (existingValue, newValue) -> {
            LOGGER.info("Received key: {}, existing value: {}, new value: {}.",
                    key, existingValue, newValue);
            newValue.addAll(existingValue);
            return newValue;
        });
    commonPool().invokeAll(
            find(path, 1,
                    (p, fileAttributes) -> fileAttributes.isRegularFile())
                    .map(p -> new ReverseIndex(p, action))
                    .collect(toList()));
    return unmodifiableMap(map);
}

private static class ReverseIndex implements Callable<Void> {
    private final Path p;
    private final BiConsumer<? super String, ? super Set<File>> action;
    private static final Pattern AROUND_WHITESPACE = compile("\\s");

    private ReverseIndex(final Path p,
            final BiConsumer<? super String, ? super Set<File>> action) {
        this.p = p;
        this.action = action;
    }

    @Override
    public Void call() throws Exception {
        reverseIndex().forEach(action);
        return null;
    }

    private Map<String, Set<File>> reverseIndex() {
        /* File stream needs to be closed. */
        try (Stream<String> lines = lines(p, UTF_8)) {
            return lines.flatMap(AROUND_WHITESPACE::splitAsStream)
                    .collect(
                            groupingBy(String::toString,
                                    mapping(word -> p.toFile(), toSet())));
        } catch (IOException e) {
            LOGGER.error("Something went wrong. Get the hell outta here.",
                    e);
            throw new UncheckedIOException(e);
        }
    }
}
Focus on what has to be done if the value is absent: you have to create a new Set for the missing entry. Of course, if you use an operation that guarantees atomicity only for the creation of the Set, the additions to the Set will happen concurrently, which requires a concurrent Set. You can use a ConcurrentHashMap to create a de facto ConcurrentHashSet (which does not exist in that form) by mapping to a fixed value, which is especially simple if you let Boolean.TRUE be the value that signals presence. ConcurrentHashMap.newKeySet() returns exactly such a set:
ConcurrentHashMap<String, Set<File>> map = new ConcurrentHashMap<>();
final BiConsumer<? super String, ? super Set<File>> action =
    (key, value) -> map.computeIfAbsent(key, x -> ConcurrentHashMap.newKeySet())
                       .addAll(value);
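To make the point about atomicity concrete, here is a minimal, self-contained sketch. It is not the exercise solution: the class name ConcurrentIndexSketch, the method index, the use of file names as plain strings, and the fixed pool of four threads are all assumptions made for illustration. It shows one task per "file" updating the shared map, and in main the hand-made equivalent of newKeySet() via Collections.newSetFromMap.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration class; names are not from the original post.
class ConcurrentIndexSketch {

    /** Builds word -> file names, with one task per entry of wordsByFile. */
    static Map<String, Set<String>> index(Map<String, List<String>> wordsByFile) {
        ConcurrentHashMap<String, Set<String>> map = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4); // assumed pool size
        for (Map.Entry<String, List<String>> e : wordsByFile.entrySet()) {
            pool.execute(() -> {
                for (String word : e.getValue()) {
                    // Atomic part: at most one Set is created per word.
                    // The add() runs after that atomic step, possibly
                    // concurrently with other threads, so the Set itself
                    // has to be thread-safe.
                    map.computeIfAbsent(word, k -> ConcurrentHashMap.newKeySet())
                       .add(e.getKey());
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return map;
    }

    public static void main(String[] args) {
        // The same kind of set built by hand: a ConcurrentHashMap whose
        // values are all Boolean.TRUE, viewed as a Set of its keys.
        Set<String> handMade =
                Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
        handMade.add("a.txt");
        System.out.println(handMade);

        Map<String, List<String>> input = new LinkedHashMap<>();
        input.put("a.txt", Arrays.asList("foo", "bar"));
        input.put("b.txt", Arrays.asList("foo"));
        System.out.println(index(input));
    }
}
```

If a plain HashSet were created inside computeIfAbsent instead, the creation would still be atomic, but the add() calls from different threads could corrupt the set.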
I have used computeIfAbsent and filtered the files by the ".txt" extension. The results are printed to the console. For IntelliJ IDEA: if the output is truncated, increase the "Override console cycle buffer size" value (File / Settings / Editor / General / Console).
Import list:
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;
My solution to this problem:
public static void printMap(Path path) {
    try (Stream<Path> stream = Files.walk(path)) {
        ConcurrentHashMap<String, Set<File>> map = new ConcurrentHashMap<>();
        stream.parallel()
                // skip directories and keep only .txt files (&& short-circuits)
                .filter(p -> !Files.isDirectory(p) && p.getFileName()
                        .toString()
                        .toLowerCase()
                        .endsWith(".txt"))
                .collect(Collectors.toList())
                .forEach(p -> {
                    // Files.lines keeps the file open, so close it via try-with-resources
                    try (Stream<String> lines = Files.lines(p, StandardCharsets.UTF_8)) {
                        lines.flatMap(s -> Arrays.stream(s.split("\\PL+")))
                                .filter(w -> w.length() > 0)
                                .map(String::toLowerCase)
                                .parallel()
                                .forEach(key -> {
                                    Set<File> tempSet = new HashSet<>();
                                    tempSet.add(p.toFile());
                                    map.computeIfAbsent(key, x -> ConcurrentHashMap.newKeySet())
                                            .addAll(tempSet);
                                });
                    } catch (IOException | UncheckedIOException e) {
                        // report unreadable files instead of silently ignoring them
                        System.err.println("Skipping " + p + ": " + e);
                    }
                });
        map.entrySet().stream()
                .sorted(Map.Entry.comparingByKey())
                .forEach(System.out::println);
    } catch (IOException e) {
        System.err.println("Cannot walk " + path + ": " + e);
    }
}
To call printMap():
public static void main(String[] args) {
    Path path = Paths.get(*somePathName*);
    printMap(path);
}
If you need to use merge, just replace
map.computeIfAbsent(key, x -> ConcurrentHashMap.newKeySet()).addAll(tempSet);
with
map.merge(key, tempSet, (oldSet, newSet) -> {oldSet.addAll(newSet); return oldSet;});
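The two variants can be compared side by side in a small sketch. The class and method names below (MergeVersusComputeIfAbsent, addWithMerge, addWithComputeIfAbsent) are made up for illustration, and file names are plain strings to keep it short. It also shows why computeIfPresent alone cannot build the index: it does nothing for a key that is not yet in the map, whereas merge stores the given value when the key is absent and only calls its function when the key is present.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration class; names are not from the original post.
class MergeVersusComputeIfAbsent {

    /** Adds file to the set stored under word, using merge. */
    static void addWithMerge(ConcurrentHashMap<String, Set<String>> map,
                             String word, String file) {
        Set<String> single = new HashSet<>();
        single.add(file);
        // Key absent: 'single' is stored as-is, the function is not called.
        // Key present: the function combines both sets; merge runs it atomically.
        map.merge(word, single, (oldSet, newSet) -> {
            oldSet.addAll(newSet);
            return oldSet;
        });
    }

    /** Adds file to the set stored under word, using computeIfAbsent. */
    static void addWithComputeIfAbsent(ConcurrentHashMap<String, Set<String>> map,
                                       String word, String file) {
        // Only the creation of the set is atomic; add() happens afterwards,
        // which is why a concurrent set is used here.
        map.computeIfAbsent(word, k -> ConcurrentHashMap.newKeySet()).add(file);
    }

    public static void main(String[] args) {
        ConcurrentHashMap<String, Set<String>> viaMerge = new ConcurrentHashMap<>();
        ConcurrentHashMap<String, Set<String>> viaCompute = new ConcurrentHashMap<>();
        for (String file : new String[] {"a.txt", "b.txt"}) {
            addWithMerge(viaMerge, "foo", file);
            addWithComputeIfAbsent(viaCompute, "foo", file);
        }
        System.out.println(viaMerge.equals(viaCompute)); // prints "true"

        // computeIfPresent leaves a missing key untouched and returns null.
        System.out.println(viaMerge.computeIfPresent("missing", (k, v) -> v)); // prints "null"
    }
}
```

In the merge variant the stored sets are plain HashSets that are only modified inside the merge function, so updates are safe; reading those sets while other threads are still merging is not, so read the result only after all tasks have finished.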