How can I remove all duplicated strings from a Java list?
For a given list, say [ "a", "a", "b", "c", "c" ],
I need [ "b" ]
as output (only the items that are not duplicated). Note that this is different from using the Set interface
for the job ...
I wrote the following code to do this in Java:
void unique(List<String> list) {
    Collections.sort(list);
    List<String> dup = new ArrayList<>();
    int i = 0, j = 0;
    for (String e : list) {
        i = list.indexOf(e);
        j = list.lastIndexOf(e);
        if (i != j && !dup.contains(e)) {
            dup.add(e);
        }
    }
    list.removeAll(dup);
}
It works ... but for a list of size 85320, it takes a few minutes to finish!
Best performance with a set:
String[] xs = { "a", "a", "b", "c", "c" };
Set<String> singles = new TreeSet<>();
Set<String> multiples = new TreeSet<>();
for (String x : xs) {
    if (!multiples.contains(x)) {
        if (singles.contains(x)) {
            singles.remove(x);
            multiples.add(x);
        } else {
            singles.add(x);
        }
    }
}
It's a single pass, and insert, remove, and contains are each O(log n) on a TreeSet. When the loop finishes, singles holds the non-duplicated items.
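If sorted output is not required, the same two-set idea can be sketched with hash-based sets, where add, remove, and contains take expected constant time. This is only an illustrative variant, not the code from the answer: the class name SinglesWithHashSets and the method singles are made up for the example, and LinkedHashSet is an assumption chosen so that the result keeps first-seen order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, for illustration only.
class SinglesWithHashSets {

    // Returns the strings that occur exactly once in xs, in first-seen order.
    static List<String> singles(List<String> xs) {
        Set<String> singles = new LinkedHashSet<>(); // seen exactly once so far
        Set<String> multiples = new HashSet<>();     // seen two or more times
        for (String x : xs) {
            if (multiples.contains(x)) {
                continue; // already known to be duplicated
            }
            // add() returns false when x was already in singles,
            // which means this is its second occurrence.
            if (!singles.add(x)) {
                singles.remove(x);
                multiples.add(x);
            }
        }
        return new ArrayList<>(singles);
    }

    public static void main(String[] args) {
        System.out.println(singles(Arrays.asList("a", "a", "b", "c", "c")));
    }
}
```

The input list is left unmodified; the caller gets a new list back.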
You can use a map. Do the following:
1. Create a map of type Map<String, Integer>.
2. For each element:
   check whether the string is already in the map;
   if yes, increment the value of that entry by 1;
   otherwise, add <current element, 1>.
3. Your output is the set of entries in the map whose value is 1.
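The steps above could be sketched roughly as follows. The class name CountSingles and the method singles are hypothetical, and the choice of LinkedHashMap (so that the output follows first-seen order) is an assumption; a plain HashMap works the same way if order does not matter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, for illustration only.
class CountSingles {

    // Returns the strings that occur exactly once in list.
    static List<String> singles(List<String> list) {
        // Steps 1 and 2: count the occurrences of every string.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String s : list) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old value.
            counts.merge(s, 1, Integer::sum);
        }
        // Step 3: keep only the keys whose count is 1.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() == 1) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(singles(Arrays.asList("a", "a", "b", "c", "c")));
    }
}
```

Each element is looked up once while counting and each distinct key is visited once afterwards, so the work grows roughly linearly with the list size.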
Given that you can sort the list, an efficient way to do this is to use a ListIterator
to iterate over adjacent elements:
List<String> singles = new ArrayList<>();
Collections.sort(list);
ListIterator<String> it = list.listIterator();
while (it.hasNext()) {
    String first = it.next();
    // Count the number of elements equal to first.
    int cnt = 1;
    while (it.hasNext()) {
        String next = it.next();
        if (!first.equals(next)) {
            // Step back so the outer loop starts at the next distinct element.
            it.previous();
            break;
        }
        ++cnt;
    }
    // If the run of equal elements is longer than 1, first is
    // duplicated. Otherwise it is a singleton, so add it
    // to the output.
    if (cnt == 1) {
        singles.add(first);
    }
}
return singles;
A ListIterator is more efficient than index-based access for lists that do not support random access, such as LinkedList.
You can use streams
to achieve this in a few simple steps, as shown below with inline comments:
// Find the distinct elements first
List<String> unique = list.stream().distinct().collect(Collectors.toList());
// List to collect the output
List<String> output = new ArrayList<>();
// Iterate over each distinct element
for (String element : unique) {
    // If the element is found only ONCE, add it to the output list
    if (list.stream().filter(e -> e.equals(element)).count() == 1) {
        output.add(element);
    }
}
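Note that the loop above rescans the whole list once per distinct element, which can get slow on a list as large as the one in the question. A possible single-pass alternative, still using streams, is to group and count in one collect call. This is only a sketch: the class name StreamSingles and the method singles are made up for the example, and LinkedHashMap is an assumption used to preserve first-seen order.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
class StreamSingles {

    // Returns the strings that occur exactly once in list.
    static List<String> singles(List<String> list) {
        // Count the occurrences of each string in one pass over the list.
        Map<String, Long> counts = list.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));
        // Keep only the keys that were counted exactly once.
        return counts.entrySet().stream()
                .filter(e -> e.getValue() == 1L)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(singles(Arrays.asList("a", "a", "b", "c", "c")));
    }
}
```

Collectors.groupingBy does not accept null keys, so this sketch assumes the list contains no null elements.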