How can I remove all duplicated strings from a Java list?

For a given list, say [ "a", "a", "b", "c", "c" ], I need [ "b" ] (only the non-duplicated items) as output. Note that this is different from using the Set interface for the job ...

I wrote the following code to do this in Java:

void unique(List<String> list) {
    Collections.sort(list);
    List<String> dup = new ArrayList<>();
    int i = 0, j = 0;

    for (String e : list) {
        i = list.indexOf(e);
        j = list.lastIndexOf(e);

        if (i != j && !dup.contains(e)) {
            dup.add(e);
        }
    }

    list.removeAll(dup);
}

      

It works ... but for a list of size 85320, it takes a few minutes to finish!

+3




5 answers


Best performance with a set:

    String[] xs = { "a", "a", "b", "c", "c" };

    Set<String> singles = new TreeSet<>();
    Set<String> multiples = new TreeSet<>();

    for (String x : xs) {
        if(!multiples.contains(x)){
            if(singles.contains(x)){
                singles.remove(x);
                multiples.add(x);
            }else{
                singles.add(x);
            }
        }
    }

      



It's a single pass over the input, and insert, remove, and contains each take O(log n) on a TreeSet, so the whole thing is O(n log n).
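As a rough sketch, the same singles/multiples idea can be wrapped in a method that returns a list. The class name SetSingles and the method name singles are made up for illustration, and the sets are swapped for hash-based ones (an assumption, not part of the answer above): a LinkedHashSet keeps the survivors in first-seen order, and hash lookups are expected O(1) instead of O(log n).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SetSingles {
    // Hypothetical helper: returns the elements that occur exactly once,
    // in the order they were first seen.
    static List<String> singles(List<String> list) {
        Set<String> singles = new LinkedHashSet<>(); // seen exactly once so far
        Set<String> multiples = new HashSet<>();     // seen two or more times

        for (String x : list) {
            if (multiples.contains(x)) {
                continue; // already known to be duplicated
            }
            // add() returns false when x was already in the set
            if (!singles.add(x)) {
                singles.remove(x);
                multiples.add(x);
            }
        }
        return new ArrayList<>(singles);
    }

    public static void main(String[] args) {
        System.out.println(singles(Arrays.asList("a", "a", "b", "c", "c")));
    }
}
```

Unlike the original unique(list), this does not sort or modify the input list.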

+5




Using Java 8 Streams:



return list.stream()
    .collect(Collectors.groupingBy(e -> e, Collectors.counting()))
    .entrySet()
    .stream()
    .filter(e -> e.getValue() == 1)
    .map(Map.Entry::getKey)
    .collect(Collectors.toList());
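The snippet above is only a return statement, and because groupingBy uses a HashMap by default, the result order is unspecified. A self-contained sketch that keeps first-seen order might look like the following; the class and method names are invented for illustration, and the LinkedHashMap factory is an addition that is not in the original answer.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class StreamSingles {
    // Hypothetical helper: counts occurrences per element, then keeps
    // only the elements whose count is exactly 1.
    static List<String> singles(List<String> list) {
        return list.stream()
            // LinkedHashMap preserves the order in which keys first appear
            .collect(Collectors.groupingBy(
                Function.identity(), LinkedHashMap::new, Collectors.counting()))
            .entrySet()
            .stream()
            .filter(e -> e.getValue() == 1)
            .map(Map.Entry::getKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(singles(Arrays.asList("a", "a", "b", "c", "c")));
    }
}
```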

      

+5




You can use a map. Do the following:

1. Create a map of type Map<String, Integer>.
2. For each element of the list:
       if the string is already a key in the map,
             increment the value of that entry by 1;
       otherwise add the entry <current element, 1>.
3. Your output is the keys of those map entries whose value is 1.
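The steps above could be sketched in Java roughly as follows. The names CountSingles and singles are hypothetical, and the second pass over the input (rather than over the map entries) is a choice made here so that the result keeps the input order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CountSingles {
    // Hypothetical helper implementing the counting-map steps.
    static List<String> singles(List<String> list) {
        // Steps 1 and 2: count how many times each string occurs.
        Map<String, Integer> counts = new HashMap<>();
        for (String s : list) {
            // merge() inserts 1 for a new key, or adds 1 to the existing count
            counts.merge(s, 1, Integer::sum);
        }

        // Step 3: keep the strings whose count is exactly 1.
        List<String> result = new ArrayList<>();
        for (String s : list) {
            if (counts.get(s) == 1) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(singles(Arrays.asList("a", "a", "b", "c", "c")));
    }
}
```

With a HashMap, both passes are expected linear time in the size of the list.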

      

0




Given that you can sort the list, an efficient way to do this is to use a ListIterator to walk over runs of adjacent equal elements:

List<String> dup = new ArrayList<>();
Collections.sort(list);
ListIterator<String> it = list.listIterator();
while (it.hasNext()) {
  String first = it.next();

  // Count the number of elements equal to first.
  int cnt = 1;
  while (it.hasNext()) {
    String next = it.next();
    if (!first.equals(next)) {
        it.previous();
        break;
    }
    ++cnt;
  }

  // If the run contained more than one element, first is
  // duplicated. Otherwise it is a singleton, so add it
  // to the output.
  if (cnt == 1) {
    dup.add(first);
  }
}

return dup;

      

A ListIterator is more efficient than index-based access for lists that do not support random access, for example LinkedList.

0




You can use streams to achieve this in simple steps, as shown below with inline comments:

//Find out unique elements first
List<String> unique = list.stream().distinct().collect(Collectors.toList());

//List to collect output list
List<String> output = new ArrayList<>();

//Iterate over each unique element
for(String element : unique) {

    //if element found only ONCE add to output list
    if(list.stream().filter(e -> e.equals(element)).count() == 1) {
        output.add(element);
    }
}

      

0

