Guide to Streams in Java (including after 8)

March 28, 2020

by — Posted in Core Java

Hello! This post provides an overview of the Streams API in Java. It introduces the concept of a stream, shows how to build pipelines, and explores common intermediate and terminal operations.

What is a stream in Java?

Streams were introduced in Java 8 and were extended in later releases. The documentation describes a stream as a sequence of elements supporting sequential and parallel aggregate operations. Please don’t be confused by the word “stream”: even before version 8, Java had InputStream and OutputStream, but these concepts have nothing in common with the hero of this post. The Streams API that was introduced in Java 8 is often described as an implementation of the monad pattern, a concept that was brought over from functional languages. There, monads stand for computations that are defined as a sequence of steps.

Let’s have a look at a simple case written in the traditional manner. Here we iterate through a list of names and print a name only if it is equal to Anna (ignoring case):

List<String> names = Arrays.asList("Anna", "Bob", "Carolina", "Denis", "Anna", "Jack", "Marketa", "Simon", "Anna");

for (String name: names){
    if (name.equalsIgnoreCase("Anna")){
        System.out.println(name);
    }
}

What we do here is find all Annas in our list and print them. This is a simple operation, but it nevertheless requires us to write a lot of code for such a ridiculously tiny case!

Take another code snippet:

List<String> names; // same names as before

names.stream().filter(e -> e.equalsIgnoreCase("Anna")).forEach(e -> System.out.println(e));

Same task, but now it takes only one line of code. What did we do here? We built a pipeline:

  1. Find all names equal to Anna
  2. Print each of them

This pipeline consists of an intermediate operation (filter()) and a terminal operation (forEach()), both of which we will look at later in this post.

Create streams

A stream is a programming abstraction, so it is not the same thing as a collection, even though we often create it from one. Developers who are just starting with functional Java often mix these concepts up, but we need to distinguish them. There are several ways to initialize streams.

From collections

This is the easiest and most obvious one. Java’s Collection interface has a built-in method stream() that returns a sequential Stream with this collection as its source. Take a look at the code snippet below:

final List<String> names = Arrays.asList("Alejandra", "Beatriz", "Carmen", "Dolores", "Juanita");
Stream<String> namesStream = names.stream();
assertThat(namesStream).isInstanceOf(Stream.class);

Generating streams

If you don’t have a collection of predefined data, you can generate data for a stream. This may be useful for experimenting with the Streams API methods. The generate method takes a Supplier that produces the elements, for example a random sequence. It returns an infinite sequential unordered stream, so it is normally combined with limit.

For sequences of consecutive numbers, IntStream and LongStream provide a special method range (DoubleStream has no such method). Take a look at the code snippet below:

IntStream rangeStream = IntStream.range(1, 10);
assertThat(rangeStream.toArray()).hasSize(9).contains(9).doesNotContain(10);

There are two methods, range and rangeClosed, which differ in whether the range contains the upper limit or not. rangeClosed returns a range that includes both limits, while range excludes the second value from the results.
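To make the generate method described above concrete, here is a minimal sketch next to range and rangeClosed. The class name, the method names and the counter-based Supplier are my own assumptions for illustration; they are not part of the original examples.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

class GenerateSketch {

    // generate() never ends on its own, so limit(n) cuts the infinite stream.
    static List<Integer> firstCounterValues(int n) {
        AtomicInteger counter = new AtomicInteger(); // hypothetical state behind the Supplier
        return Stream.generate(counter::incrementAndGet)
                .limit(n)
                .collect(Collectors.toList());
    }

    // range excludes the upper limit.
    static int[] halfOpen(int from, int to) {
        return IntStream.range(from, to).toArray();
    }

    // rangeClosed includes the upper limit.
    static int[] closed(int from, int to) {
        return IntStream.rangeClosed(from, to).toArray();
    }

    public static void main(String[] args) {
        System.out.println(firstCounterValues(3).size()); // 3
        System.out.println(halfOpen(1, 5).length);        // 4 (1, 2, 3, 4)
        System.out.println(closed(1, 5).length);          // 5 (1, 2, 3, 4, 5)
    }
}
```

Without limit, collecting a generated stream would never finish, which is why the two usually appear together.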

ofNullable

Another static method that is used to create streams is ofNullable. It allows us to create a stream containing a single element, or an empty one if the element is null. This method was introduced in Java 9.

Find the code below:

Person nullPerson = null;
Stream<Person> nullableStream = Stream.ofNullable(nullPerson);
assertThat(nullPerson).isNull();
assertThat(nullableStream).isInstanceOf(Stream.class);
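The assertions above only check the type of the stream. As a small sketch of the actual behavior, the code below counts the elements and shows one place where ofNullable tends to be handy: dropping nulls inside flatMap. The class and method names are assumptions of mine, and flatMap is used here only as a helper.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class OfNullableSketch {

    // 1 for a non-null value, 0 for null.
    static long elementCount(String value) {
        return Stream.ofNullable(value).count();
    }

    // Each null becomes an empty stream, so flatMap silently drops it.
    static List<String> withoutNulls(List<String> values) {
        return values.stream()
                .flatMap(Stream::ofNullable)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(elementCount("Anna")); // 1
        System.out.println(elementCount(null));   // 0
        System.out.println(withoutNulls(Arrays.asList("Anna", null, "Bob"))); // [Anna, Bob]
    }
}
```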

of

Another method for creating streams that is worth a look is of. There are two overloaded versions of this method:

  • of (T element)
  • of (T... elements)

In the first case, it returns a sequential Stream containing a single element T. In the second one, it returns a sequential ordered stream whose elements are the specified values. The second version uses varargs as an argument. This code snippet illustrates this method:

Stream<Point> pointsStream = Stream.of(new Point(1,2), new Point(5, -12), new Point(-9, 4), new Point(6, 1), new Point(0,0));
assertThat(pointsStream).isInstanceOf(Stream.class);

iterate

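The two-argument version of this method has been available since Java 8; Java 9 added an overload with an extra hasNext predicate that bounds the sequence. The basic iterate takes two parameters: an initial value (called the seed) and a UnaryOperator that produces the next element from the previous one. The method starts with the seed value and iteratively applies the given function to get the next element, so the resulting stream is infinite. Here is an example: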

Stream<Integer> iterateStream = Stream.iterate(0, i -> i + 2);
assertThat(iterateStream).isInstanceOf(Stream.class);
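Because the stream produced by the two-argument iterate is infinite, it has to be bounded before it is collected. The sketch below shows two ways to do that: limit, and the three-argument overload of iterate that was added in Java 9. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class IterateSketch {

    // Two-argument iterate is infinite; limit(n) keeps the first n elements.
    static List<Integer> firstEvenNumbers(int n) {
        return Stream.iterate(0, i -> i + 2)
                .limit(n)
                .collect(Collectors.toList());
    }

    // Java 9+: the middle argument is a hasNext predicate that ends the stream.
    static List<Integer> evenNumbersBelow(int bound) {
        return Stream.iterate(0, i -> i < bound, i -> i + 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(firstEvenNumbers(5));  // [0, 2, 4, 6, 8]
        System.out.println(evenNumbersBelow(10)); // [0, 2, 4, 6, 8]
    }
}
```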

Empty stream

Finally, we can always create an empty stream. We mentioned that the ofNullable method can return an empty stream, but there is another approach to get an explicitly empty stream. The empty method returns an empty sequential Stream:

Stream<Integer> emptyStream = Stream.empty();
assertThat(emptyStream.toArray()).isEmpty();

What about Builder?

We explored the static methods that are used to create streams. Besides them, there is another way to do it: use a Builder. Stream.Builder allows the creation of streams by generating elements individually and adding them to the builder, without temporary collections or buffers. Let’s have a look at it:

// 1. create builder
Stream.Builder<String> builder = Stream.builder();

// 2. create stream
Stream<String> nameStreamBuilder = builder.add("Alejandra").add("Beatriz").add("Carmen").add("Dolores").add("Juanita").build();

// contains same elements as the first example
assertThat(namesStream.collect(Collectors.toList())).containsExactlyElementsOf(nameStreamBuilder.collect(Collectors.toList()));

Builder is another approach to building streams. We initialize a Stream.Builder instance and then populate it with values using the add method. Finally, we transform the Builder into a Stream with the build method.

Assemble a pipeline

We took a broad introduction to the subject of stream creation and observed the key ways to do it. Now that we have a stream instance, we can assemble a pipeline in order to do something useful with the stream. From a technical point of view, a pipeline consists of a source (such as a Collection or a generator function), followed by zero or more intermediate operations and a terminal operation. The graph below represents the concept of a pipeline:

Graph 1. Stream Pipeline

In this section we briefly explore the role of intermediate and terminal operations and look at the most notable of them.

Intermediate operations

Intermediate operations return a new stream and are lazy. Their laziness means that the actual computation on the source data is performed only after the terminal operation is invoked, and source elements are consumed only as needed. We can chain multiple intermediate operations, as each returns a new Stream object. Take a look at the graph below:

Graph 2. Streams intermediate operations
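Laziness is easier to believe when you see it. The following sketch records what happens and when; the log list and the class name are my own devices for illustration. Nothing is filtered while the pipeline is being built, only once the terminal operation runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazinessSketch {

    // Returns a log of the order in which things actually happened.
    static List<String> run() {
        List<String> log = new ArrayList<>();

        Stream<String> pipeline = Stream.of("Anna", "Bob", "Alice")
                .filter(n -> {
                    log.add("filter " + n); // side effect only to make laziness visible
                    return n.startsWith("A");
                });
        log.add("pipeline built"); // no element has been filtered yet

        List<String> result = pipeline.collect(Collectors.toList()); // terminal operation
        log.add("result " + result);
        return log;
    }

    public static void main(String[] args) {
        // Prints "pipeline built" first, then the three filter calls, then the result.
        run().forEach(System.out::println);
    }
}
```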

Let’s now have a quick look at the most used intermediate operations.

Filter

At the beginning of the post we already used this operation in order to filter a collection and find matching names. In a nutshell, it returns a new stream consisting of the elements that match the given condition. This method accepts a predicate, which specifies the condition.

@Test
public void filterTest(){
    Stream<String> stream = getNames().stream();
    long result = stream.filter(n -> n.startsWith("A")).count();
    assertThat(result).isEqualTo(4);
}

Map

There are several map operations, and I decided to group them together under one header. Let’s start with the general map method. It returns a new stream consisting of the results of applying the mapper function to the elements of the stream. The example below uses the mapToInt variant to sum the lengths of all names:

@Test
public void mapTest(){
    Stream<String> stream = getNames().stream();
    int result = stream.mapToInt(n -> n.length()).sum();
    assertThat(result).isEqualTo(43);
}

There are several specific mapping methods:

  • mapToInt = produces an IntStream consisting of the results of applying the given mapper function
  • mapToDouble = produces a DoubleStream consisting of the results of applying the given mapper function
  • mapToLong = produces a LongStream consisting of the results of applying the given mapper function
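To show the difference between the general map and a primitive variant, here is a short sketch. The class name, method names and sample names are assumptions. map keeps us in a Stream of objects, while mapToInt switches to an IntStream that offers numeric operations such as sum.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapSketch {

    // map: Stream<String> -> Stream<String>
    static List<String> upperCase(List<String> names) {
        return names.stream()
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    // mapToInt: Stream<String> -> IntStream, which has sum()
    static int totalLength(List<String> names) {
        return names.stream()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "Bob");
        System.out.println(upperCase(names));   // [ANNA, BOB]
        System.out.println(totalLength(names)); // 7
    }
}
```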

Distinct

Another notable intermediate operation in the Java Streams API is distinct. It produces a stream of unique elements from the data. From a technical point of view, the distinct method relies on the equals method of the entities in order to detect duplicates. For ordered streams, the selection of distinct elements is stable, while for unordered ones, Java provides no stability guarantees.

@Test
public void distinctTest(){
    List<Integer> numbers = Arrays.asList(1, 1, 2, 3, 3, 4, 5, 5); 
    Stream<Integer> stream = numbers.stream();
    long result = stream.distinct().count();
    assertThat(result).isEqualTo(5);
}

If you want to apply distinct to your own custom entities, you have to override equals and hashCode so that duplicates can be detected. I advise you to read about overriding hashCode and equals.
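Here is a minimal sketch of what that looks like for a custom entity. The Point class below is a hypothetical one written for this example (it is not the Point used earlier in the post); without the two overrides, every instance would count as unique.

```java
import java.util.Arrays;
import java.util.Objects;

class DistinctSketch {

    // Hypothetical value class: two points are equal when their coordinates match.
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    static long countDistinct(Point... points) {
        return Arrays.stream(points).distinct().count();
    }

    public static void main(String[] args) {
        long unique = countDistinct(new Point(1, 2), new Point(1, 2), new Point(0, 0));
        System.out.println(unique); // 2
    }
}
```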

Sort

Sorting is another important task that we often have to perform with streams. The sorted method is an intermediate operation that provides a stream consisting of the elements of this stream, sorted according to natural order:

@Test
public void sortTest(){
    List<Integer> numbers = Arrays.asList(-9, -18, 0, 25, 4); 
    Stream<Integer> stream = numbers.stream();
    List<Integer> result = stream.sorted().collect(Collectors.toList());
    assertThat(result).containsAll(numbers).containsExactly(-18, -9, 0, 4, 25);
}

Again, this is how sorting works with numbers. Custom entities need to implement Comparable; otherwise, a ClassCastException may be thrown when the terminal operation executes.
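When natural order is not what you need, or the entity does not implement Comparable, sorted also has an overload that accepts a Comparator. Below is a small sketch that sorts names by length; the class name, method name and sample data are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class SortSketch {

    // sorted(Comparator) orders elements by the given rule instead of natural order.
    static List<String> byLength(List<String> names) {
        return names.stream()
                .sorted(Comparator.comparingInt(String::length))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Carolina", "Bob", "Anna", "Jack");
        // The sort is stable for ordered streams, so Anna stays ahead of Jack.
        System.out.println(byLength(names)); // [Bob, Anna, Jack, Carolina]
    }
}
```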

While

These two methods were added in Java 9: dropWhile and takeWhile. Both are intermediate operations that accept a predicate.

  • dropWhile = produces a stream consisting of the remaining elements of this stream after dropping the longest prefix of elements that match the given predicate.
  • takeWhile = produces a stream consisting of the longest prefix of elements taken from this stream that match the given predicate.

Note that both are meant for ordered streams; on an unordered stream the result is nondeterministic.

Take a look at the example code snippet below:

@Test
public void whileTest(){
    List<Integer> numbers = List.of(1,2,3,4,5,6,7,8); // ordered source: a Set has no defined encounter order
    Stream<Integer> stream = numbers.stream();
    Set<Integer> result = stream.takeWhile(x-> x < 5).collect(Collectors.toSet());
    assertThat(result).hasSize(4);
}
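To see how the two operations complement each other, and how they differ from filter, consider this sketch on an ordered list. The class name, method names and sample numbers are assumptions. Note the trailing 2: it matches the predicate but comes after the first non-matching element, so takeWhile does not take it and dropWhile does not drop it.

```java
import java.util.List;
import java.util.stream.Collectors;

class WhileSketch {

    // Longest prefix of elements smaller than 5.
    static List<Integer> takeSmall(List<Integer> numbers) {
        return numbers.stream()
                .takeWhile(x -> x < 5)
                .collect(Collectors.toList());
    }

    // Everything after that prefix.
    static List<Integer> dropSmall(List<Integer> numbers) {
        return numbers.stream()
                .dropWhile(x -> x < 5)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = List.of(1, 2, 3, 6, 2, 8);
        System.out.println(takeSmall(numbers)); // [1, 2, 3]
        System.out.println(dropSmall(numbers)); // [6, 2, 8]
    }
}
```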

Limit

The last intermediate operation that we will look at in this post is limit. It produces a stream consisting of the elements of this stream, truncated to be no longer than the specified length. This method accepts one argument, a long value that represents the maximum length.

@Test
public void limitTest(){
    Stream<Integer> stream = getNumbers().stream();
    Set<Integer> result = stream.sorted().limit(5).collect(Collectors.toSet());
    System.out.println(result);
    assertThat(result).contains(-75, -18, -9, -5, 0);
}

Terminal operations

The other group of operations is called terminal operations. In contrast to intermediate operations, only one terminal operation can be executed on a stream, because after it is performed, the stream pipeline is considered consumed and can no longer be used. Terminal operations produce a result (usually a Java object, like a collection or an Optional) or a side effect, not a stream:

Graph 3. Streams terminal operations
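The "can no longer be used" part is easy to verify. In the sketch below (the class and method names are assumptions), the second terminal operation on the same stream instance fails with an IllegalStateException.

```java
import java.util.stream.Collectors;
import java.util.stream.Stream;

class ReuseSketch {

    // Returns true if reusing a consumed stream is rejected.
    static boolean secondTerminalOperationFails() {
        Stream<String> stream = Stream.of("Anna", "Bob");
        stream.collect(Collectors.toList()); // first terminal operation consumes the stream
        try {
            stream.collect(Collectors.toList()); // second one on the same instance
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(secondTerminalOperationFails()); // true
    }
}
```

If you need the data twice, create a new stream from the source each time.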

There are several notable terminal operations that we will explore here.

For each

We already used this operation at the beginning of the post, where we compared two ways of printing matching names. This method accepts a Consumer function that defines an action to perform on each element of the stream:

@Test
public void forEachTest(){
    Stream<String> stream = getNames().stream();

    stream.filter(n -> !n.equalsIgnoreCase("Anna"))
            .map(n -> n.toUpperCase())
            .forEach(n -> System.out.println(n));
}

Note that for parallel stream pipelines, this operation does not guarantee to respect the encounter order of the stream, as doing so would sacrifice the benefit of parallelism. For any given element, the action may be performed at whatever time and in whatever thread the library chooses. If the action accesses shared state, it is responsible for providing the required synchronization.
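The following sketch illustrates both points on a parallel stream. It uses a synchronized list as the shared state, and it also uses forEachOrdered, a sibling of forEach that does respect encounter order; that method is not covered elsewhere in this post, so treat it as a side note. The class and method names are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ForEachSketch {

    // forEach on a parallel stream: every element is visited, but in no guaranteed order.
    static List<Integer> visitUnordered(List<Integer> numbers) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>()); // shared state, so synchronized
        numbers.parallelStream().forEach(seen::add);
        return new ArrayList<>(seen);
    }

    // forEachOrdered keeps the encounter order even on a parallel stream.
    static List<Integer> visitOrdered(List<Integer> numbers) {
        List<Integer> seen = Collections.synchronizedList(new ArrayList<>());
        numbers.parallelStream().forEachOrdered(seen::add);
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(visitUnordered(numbers)); // same elements, order may vary between runs
        System.out.println(visitOrdered(numbers));   // [1, 2, 3, 4, 5, 6, 7, 8]
    }
}
```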

Collect

The previous terminal operation returns nothing: it consumes data but does not give anything back. However, we often need to perform some stream operations on a collection and then get the changed collection back. In these situations we use the collect method. It performs a mutable reduction operation on the elements of this stream using a Collector.

@Test
public void collectTest(){
    Stream<String> stream = getNames().stream();
    List<Integer> result = stream.filter(n -> n.length() <= 4)
                    .map(n -> n.length()).collect(Collectors.toList());
    assertThat(result).hasSize(5);
}

In this code snippet we use the built-in Collectors class to collect the stream into a list. There are other useful methods that Java provides out of the box:

  • Collectors.toMap
  • Collectors.toSet
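Here is a brief sketch of these two collectors; the class name, method names and sample data are assumptions. One thing to keep in mind is that the two-argument toMap throws an IllegalStateException when two elements produce the same key, which is why the example removes duplicate names first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class CollectorsSketch {

    // toSet: duplicates collapse, here the distinct name lengths.
    static Set<Integer> lengths(List<String> names) {
        return names.stream()
                .map(String::length)
                .collect(Collectors.toSet());
    }

    // toMap: name -> length; distinct() avoids duplicate keys.
    static Map<String, Integer> lengthByName(List<String> names) {
        return names.stream()
                .distinct()
                .collect(Collectors.toMap(name -> name, String::length));
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "Bob", "Jack", "Anna");
        System.out.println(lengths(names).size());           // 2 (lengths 3 and 4)
        System.out.println(lengthByName(names).get("Anna")); // 4
    }
}
```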

Find

Finally, there are operations to find an element. They return an Optional, because a matching element may not be present:

  • findAny()
  • findFirst()

Neither of them takes any arguments, so you may ask a very reasonable question: how do they actually find data? These methods are typically used in combination with filter, which we described earlier.

Take a look at the example:

@Test
public void findTest(){
    Stream<String> stream = getNames().stream();
    Optional<String> result = stream.filter(n -> n.length() < 4).findFirst();
    assertThat(result).isPresent();
    assertThat(result.get()).isEqualToIgnoringCase("Bob");
}


What is the difference between these two methods? As the names imply, findFirst returns the first matching element in encounter order. In the example above that is Bob, as the assertion shows. findAny returns any matching element, which may or may not be the first one: the behavior of this operation is explicitly nondeterministic; it is free to select any element in the stream.
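A short sketch of both operations side by side may help; the class name, method names and sample data are assumptions. findFirst is predictable, while for findAny we can only rely on the result being some matching element. An empty Optional comes back when nothing matches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class FindSketch {

    // Always the first name shorter than four characters, in encounter order.
    static Optional<String> firstShortName(List<String> names) {
        return names.stream()
                .filter(n -> n.length() < 4)
                .findFirst();
    }

    // Some name shorter than four characters; which one is not specified.
    static Optional<String> anyShortName(List<String> names) {
        return names.parallelStream()
                .filter(n -> n.length() < 4)
                .findAny();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Anna", "Bob", "Carolina", "Tom");
        System.out.println(firstShortName(names).get());                   // Bob
        System.out.println(anyShortName(names).isPresent());               // true (Bob or Tom)
        System.out.println(firstShortName(Arrays.asList("Anna")).isPresent()); // false
    }
}
```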

Source code

You can find the full source code for this post in this GitHub repository. If you have questions regarding this post, don’t hesitate to contact me. Have a nice day!
