Fast Batch Processing

In batch processing, the complete dataset is assembled and available before a job is submitted. Although Hazelcast Jet is built on top of a streaming core, it is also well suited to building batch processing applications. To Hazelcast Jet, a batch dataset is simply a stream that ends once all of its data has been processed.

Hazelcast Jet comes with connectors for the Hazelcast IMDG distributed Map and List, the Hadoop Distributed File System (HDFS), and local data files (e.g. CSV or logs).

Hazelcast Jet can use Hazelcast Maps and Lists on the same cluster as data sources and sinks, which boosts performance through data locality. A Hazelcast IMap is partitioned across the cluster, so each Jet node can read from the Map efficiently by reading only its own local partitions.

The popular Java 8 java.util.stream API can be used to implement batch jobs as well. The computation runs in parallel, distributed across the cluster.
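
To make the java.util.stream style concrete, here is a minimal local sketch of a batch word count written against the plain JDK API only. It does not use Hazelcast Jet's own classes: the class name BatchWordCount, the method countWords, and the sample input are hypothetical, and the parallelism here is limited to the threads of a single JVM. On Jet, a pipeline of the same shape would presumably be applied to a distributed collection rather than a local List.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical example class; plain JDK only, not the Hazelcast Jet API.
public class BatchWordCount {

    // Counts word occurrences over a finite (batch) list of lines.
    // parallelStream() spreads the work over local threads only.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.parallelStream()
                // Split each line into lower-case words on non-word characters.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\W+")))
                // Drop empty tokens produced by leading separators.
                .filter(word -> !word.isEmpty())
                // Group identical words and count them; TreeMap keeps keys sorted.
                .collect(Collectors.groupingBy(
                        Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Jet processes batch data",
                "batch data is a finite stream");
        System.out.println(countWords(lines));
    }
}
```

In this sample input, "batch" and "data" each appear twice and every other word once. The input list is complete before the job starts and the stream ends when the last line has been consumed, which is the batch case described above.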
