Distributed java.util.stream

Use Java 8 Stream API

The java.util.stream API was introduced in Java 8. It provides a functional pipeline approach to processing collections in Java. It is, however, limited to a single JVM.

With Hazelcast Jet, java.util.stream computations can be distributed across multiple JVMs for increased throughput.

Hazelcast Jet has an implementation of java.util.stream for Hazelcast IMDG IMap, JCache and IList. java.util.stream operations are mapped to a DAG and then executed, and the result returned to the user. The computation is distributed and parallelized. Jet benefits from data locality – if the data source and Jet are colocated, Jet will read local data partitions and process data in-place to avoid unnecessary shuffling.

Due to its simplicity, the Stream API is a great way to get started with Hazelcast Jet.

Map<String, Long> counts = DistributedStream
	.fromMap(lines)
	.flatMap(m -> Arrays.stream(PATTERN.split(m.getValue().toLowerCase())))
	.filter(w -> !w.isEmpty())
	.collect(DistributedCollectors.toIMap("counts", w -> w, w -> 1L, (left, right) -> left + right));
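For comparison, here is a minimal sketch of the same word count written against plain java.util.stream on a single JVM. It uses only the JDK. The class name LocalWordCount, the sample input, and the "\\W+" delimiter are assumptions made for illustration, since the page does not show how PATTERN or lines are defined.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class LocalWordCount {
    // Assumed delimiter: split on runs of non-word characters.
    private static final Pattern PATTERN = Pattern.compile("\\W+");

    // Counts word occurrences across all lines, entirely within one JVM.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.stream()
                // Break each line into lowercase words.
                .flatMap(line -> Arrays.stream(PATTERN.split(line.toLowerCase())))
                // Drop empty tokens produced by leading delimiters.
                .filter(w -> !w.isEmpty())
                // Start each word at 1 and sum the counts of duplicates.
                .collect(Collectors.toMap(w -> w, w -> 1L, (left, right) -> left + right));
    }

    public static void main(String[] args) {
        // Hypothetical sample input.
        List<String> lines = Arrays.asList("To be or not to be", "that is the question");
        System.out.println(countWords(lines));
    }
}
```

The flatMap, filter, and collect steps have the same shape as in the distributed snippet; the differences are the source (a local List rather than a distributed map) and the collector (Collectors.toMap rather than DistributedCollectors.toIMap).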

So your existing investment in java.util.stream code can be leveraged and its processing sped up.

Visit code samples for distributed java.util.stream.
