Terminate a stream with iterator() or collect()?

Let's say I want to filter a list and return the filtered list, although an iterator would suffice too. Which of the following is preferred, and why: Stream.iterator() or Stream.collect(ListCollector)?



2 answers


There is a fundamental difference between Stream.iterator() and .collect(Collectors.toList()).iterator(). The latter processes all the elements of the stream in order to store them in a collection. In contrast, Stream.iterator() simply returns a wrapper around the Stream's Spliterator, which processes the elements lazily, like all other stream operations.

E.g., when you write

Iterator<String> it=IntStream.range(0, 100).mapToObj(i->{
    System.out.println("processing "+i);
    return String.valueOf(i);
}).iterator();

if(it.hasNext()) System.out.println("first: "+it.next());
if(it.hasNext()) System.out.println("second: "+it.next());
return;// I don’t care about the remaining values

      

it will print:

processing 0
first: 0
processing 1
second: 1

      



and

Iterator<String> it=IntStream.range(0, 100).mapToObj(i->{
    System.out.println("processing "+i);
    return String.valueOf(i);
}).collect(Collectors.toList()).iterator();

if(it.hasNext()) System.out.println("first: "+it.next());
if(it.hasNext()) System.out.println("second: "+it.next());
return;// I don’t care about the remaining values

      

will print

processing 0
processing 1
processing 2
processing 3
processing 4
processing 5
processing 6
processing 7
processing 8
processing 9
processing 10
...
processing 90
processing 91
processing 92
processing 93
processing 94
processing 95
processing 96
processing 97
processing 98
processing 99
first: 0
second: 1

      

So if you only need an Iterator, you shouldn't be forced to collect the values before iterating unless you have a good reason to do so (for example, if the source is a file, you may want to complete the operation before returning the iterator).
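The file case mentioned above can be sketched as follows. This is only an illustration under assumptions: the class name FileLines, the helper nonBlankLines and the blank-line filter are hypothetical and not part of the question. Files.lines keeps the file open until the stream is closed, so here the stream is collected inside try-with-resources and the iterator is taken from the finished list.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FileLines {
    // Hypothetical helper: iterator over the non-blank lines of a file.
    static Iterator<String> nonBlankLines(Path file) {
        // Collect eagerly while the file is open; the stream (and the file)
        // is closed before the caller starts iterating.
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(s -> !s.trim().isEmpty())
                        .collect(Collectors.toList())
                        .iterator();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Small self-check: writes a temporary file and reads it back.
    static List<String> demo() {
        try {
            Path tmp = Files.createTempFile("lines", ".txt");
            Files.write(tmp, Arrays.asList("a", "", "b"));
            List<String> out = new ArrayList<>();
            nonBlankLines(tmp).forEachRemaining(out::add);
            Files.delete(tmp); // clean up the temporary file
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Returning lines.iterator() directly from inside the try block would hand the caller an iterator over an already closed stream, which is one good reason to collect first in this situation.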



You didn't list it among your alternatives, but I would suggest that you consider returning the Stream itself to the caller. If you're thinking of returning an Iterator to the caller, a Stream is probably much more convenient. (Jean Logeart also suggested this in a comment.)

It looks like you have an internal collection of items (or whatever) and are using streams to filter it by some criteria that you don't want to expose to your caller. Given a filtered stream, there are several alternatives for what to return:

  • a List: stream.collect(toList())

  • an Iterator from the stream: stream.iterator()

  • an Iterator from the collected list: stream.collect(toList()).iterator()

  • the Stream itself: stream

Which is best depends on what the caller wants to do with the return value. If you know the caller always wants to keep all the filtered items, you might as well do the caller a service by collecting them into a list yourself.

However, the caller might want to do something different. Suppose the caller wants to find a specific item, or count the number of filtered items, or see if there are any filtered items at all. In these cases, collecting the items into a list is mostly wasted work.

Returning an Iterator from the filtered stream buys you laziness, in that no collection is created up front. However, working with an Iterator is potentially cumbersome for the caller.

If the caller needs a collection, it has to do something like this:

Iterator<Item> iter = getFilteredItems();
List<Item> result = new ArrayList<>();
iter.forEachRemaining(result::add);

      

If the caller wants to find a specific element, it gets worse:

Item foundItem = null;
while (iter.hasNext()) {
    Item current = iter.next();
    if (targetId.equals(current.getId())) {
        foundItem = current;
        break;
    }
}
if (foundItem != null) {
    // found it!
} else {
    // not found
}

      

Counting the number of filtered items can be done either by incrementing a counter in a while (iter.hasNext()) loop, or by incrementing an AtomicInteger inside forEachRemaining. Fortunately, testing whether there are any filtered items at all is pretty straightforward: just call iter.hasNext().
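Both counting approaches can be sketched like this. The class and method names are made up for illustration; only Iterator.forEachRemaining and AtomicInteger come from the JDK.

```java
import java.util.Iterator;
import java.util.concurrent.atomic.AtomicInteger;

class IteratorCounting {
    // Counts by advancing the iterator in an explicit loop.
    static <T> int countWithLoop(Iterator<T> iter) {
        int count = 0;
        while (iter.hasNext()) {
            iter.next(); // must advance, otherwise the loop never terminates
            count++;
        }
        return count;
    }

    // Counts inside forEachRemaining. A lambda cannot modify a local int,
    // so an AtomicInteger serves as a mutable holder.
    static <T> int countWithForEachRemaining(Iterator<T> iter) {
        AtomicInteger count = new AtomicInteger();
        iter.forEachRemaining(item -> count.incrementAndGet());
        return count.get();
    }
}
```

Either way, the iterator is exhausted afterwards, so the caller cannot reuse it to look at the items.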

Returning an Iterator over a collected list is the worst of both worlds. You pay the upfront cost of collecting the List even if the caller does not need it, and the caller may still have to do additional work to process the items, as shown above. Holger explained the differences well.



Finally, returning a Stream offers the efficiency of laziness and is probably the most flexible option. If you are already using streams to do the filtering, just return the stream:

Stream<Item> getFilteredItems() {
    return myInternalCollection.stream()
                               .filter(...);
}

      

If the caller wants a list of items, it's pretty simple:

List<Item> result = getFilteredItems().collect(toList());

      

If the caller wants to find a specific element, it's pretty simple too:

Optional<Item> item = getFilteredItems()
                        .filter(it -> targetId.equals(it.getId()))
                        .findAny();
if (item.isPresent()) {
    // found it!
} else {
    // not found
}

      

(The Optional class itself has a rich API that may let you avoid the isPresent() test.)
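As a rough sketch of what that can look like, here are ifPresent, map and orElse in use. The find method and the String items are hypothetical stand-ins for getFilteredItems() and Item.

```java
import java.util.Optional;
import java.util.stream.Stream;

class OptionalUsage {
    // Hypothetical lookup standing in for getFilteredItems().filter(...).findAny().
    static Optional<String> find(String target) {
        return Stream.of("alpha", "beta", "gamma")
                     .filter(target::equals)
                     .findAny();
    }

    static String describe(String target) {
        // map transforms the value if one is present; orElse supplies the fallback.
        return find(target).map(s -> "found " + s).orElse("not found");
    }

    public static void main(String[] args) {
        // ifPresent runs the action only when a value exists.
        find("beta").ifPresent(s -> System.out.println("found " + s));
        System.out.println(describe("delta"));
    }
}
```

With this style the "found" and "not found" branches are expressed through the Optional itself instead of an explicit if/else.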

Counting objects:

long count = getFilteredItems().count();

      

and checking for any filtered items:

boolean any = getFilteredItems().findAny().isPresent();

      

Returning a stream preserves laziness for as long as possible and gives the caller maximum flexibility.
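To make the laziness visible, here is a small sketch that counts how often the filter predicate runs. Everything in it (the class, the item list, the evens method) is invented for illustration, and it assumes a sequential stream.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    static final List<Integer> ITEMS = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

    // Returns the even items as a stream; 'examined' counts predicate calls.
    static Stream<Integer> evens(AtomicInteger examined) {
        return ITEMS.stream().filter(i -> {
            examined.incrementAndGet();
            return i % 2 == 0;
        });
    }

    // The caller only wants the first match: the pipeline stops early.
    static int examinedByFindFirst() {
        AtomicInteger examined = new AtomicInteger();
        evens(examined).findFirst();
        return examined.get();
    }

    // The caller wants everything: every source element is examined.
    static int examinedByCollect() {
        AtomicInteger examined = new AtomicInteger();
        evens(examined).collect(Collectors.toList());
        return examined.get();
    }
}
```

In this setup findFirst examines only the elements 1 and 2 before stopping, whereas collecting into a list runs the predicate on all eight elements. The same method serves both callers, and each pays only for what it uses.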
