FREE, online, self-paced, learning for Hazelcast at the Hazelcast Training Center. Learn More »
Don't miss the upcoming webinar: Building Real-Time Data Pipelines with a 3rd Generation Stream Processing Engine - sign up now!

Hazelcast Jet Features


Engineered for Performance

To achieve very high throughputs with consistent low latency, Hazelcast Jet uses a combination of distributed DAG computation model, in-memory, data locality, partition mapping affinity, SP/SC Queues and Green Threads.

See Hazelcast Jet Performance for explanation and benchmarks comparing Jet to other popular frameworks.

Low Latency End-to-End

The performance of Jet Core can be boosted by using embedded Hazelcast IMDG. Hazelcast IMDG provides elastic in-memory storage and is a great tool for publishing the results of the computation, or as a cache, for data sets to be used during the computation. Very low end-to-end latencies can be achieved this way.

Stream Processing

Streaming Core

Hazelcast Jet is built on top of a low latency streaming core. This refers to processing the incoming records as soon as possible as opposed to accumulating the records into micro-batches before processing.


To process unbounded, possibility infinite sequences of records, a tool is required to group individual records to finite frames in order to run the computation. Hazelcast Jet windows provides a tool to define the grouping.

Types of windows supported by Jet:

  • Fixed/tumbling – The stream is divided into same-length, non-overlapping chunks. Each event belongs to exactly one window.
  • Sliding – Windows have fixed length, but are separated by a sliding interval, so they can overlap.
  • Session – Windows have various sizes and are defined based on session identifiers contained in the records. Sessions are closed after a period of inactivity (timeout).

Event Time Processing

Hazelcast Jet allows you to classify records in a data stream based on the time stamp embedded in each record — the event time. Event time processing is a natural requirement as users are mostly interested in handling the data based on the time that the event originated (the event time). Event time processing is a first-class citizen in Jet.

For handling late events, there is a set of policies to determine whether the event is “still on time” or “late”, which results in the discarding of the latter.

Handling Back Pressure

In the streaming system, it is necessary to control the flow of messages. The consumer cannot be flooded by more messages than it can process in a fixed amount of time. On the other hand, the processors should not be left idle wasting resources.

Hazelcast Jet comes with a mechanism to handle back pressure. Every consumer keeps signaling to all the upstream producers how much free capacity it has. This information is naturally propagated upstream to keep the system balanced.

Batch Processing and ETL

Batches Disguised as Streams

Although Jet is based on a streaming core, it is often used on top of bounded, finite data sets (often referred to as batch tasks). Jet treats such data sets as if it were a stream that suddenly ends. Batches are processed, just as streams, in the same one-record-per-time streaming core.

Connectors for ETL

Hazelcast Jet includes connectors for Hadoop Distributed File System (HDFS), local data files (e.g., CSV or logs), and Hazelcast IMDG. All of these should be combined in one pipeline to take advantage of the benefits of each, e.g., reading large data sets from HDFS and using distributed in-memory caches of Hazelcast IMDG to enrich processed records.

Processing Hadoop Data

Hadoop Distributed File System (HDFS) is a common tool used for building data warehouses and data lakes. Hazelcast Jet can use HDFS as either data source or destination. If Jet and HDFS clusters are colocated, Jet benefits from the data locality and processes the data from the same node without sending them over the network.

Taking Advantage of Hazelcast In-Memory Data Grid

Hazelcast Jet is the distributed data processing tool that takes advantage of being integrated with the Hazelcast IMDG (In-Memory Data Grid) — an elastic in-memory storage.

Use IMDG for

  • Data ingestion prior to processing.
  • Connecting multiple Jet jobs using IMDG as an intermediate buffer to keep jobs loosely coupled.
  • Enriching processed events, cache remote data, e.g., fact tables from a database on Jet nodes making use of data locality.
  • Distributing Jet processed data with Hazelcast IMDG.
  • Running advanced data processing tasks on top of Hazelcast data structures.

Jet Embeds IMDG

The complete Hazelcast IMDG is embedded in Jet. So, all the services of IMDG are available to your Jet jobs without any additional deployment effort. As Hazelcast IMDG is the embedded, support structure for Jet, IMDG is fully controlled by Jet (start, shutdown, scaling etc.)

To isolate the processing from the storage, you can still make use of Jet reading from or writing to remote Hazelcast IMDG clusters.

Streaming from IMDG

In Jet a connector is included which allows the user to process streams of changes (Event Journal) of an IMap and ICache, enabling developers to stream process IMap/ICache data or to use Hazelcast IMDG as storage for data ingestion.

The Hazelcast Way

Hazelcast Jet was built by the same community as the Hazelcast IMDG. Jet and Hazelcast IMDG share what’s best about the community experience.


You will get the value from Hazelcast Jet in less than 15 minutes.

Add one dependency to your Maven project and start building — Jet is a small JAR with no dependencies. Adding a member (also called node) to a Jet cluster is as easy as running one script. The Pipeline API keeps all the complexity under the hood.

Try it!

Lightweight and Embeddable

Jet is not a server, it’s a library. It’s natural to embed it in your application to build a data processing microservice. Due to its lightweight, each job can be easily launched on its own cluster to maximize service isolation. This is in contrast to heavyweight data processing servers where the cluster, once started, hosts multiple tasks and tenants.


Hazelcast Jet cluster members (also called nodes) automatically join together to form a cluster. Discovery finds other Jet instances based on filters and provides their corresponding IP addresses. Cluster setup is simple.

The cloud discovery plugins can be used for easy setup and operations within any cloud environment.


Get professional support from the same people who built the software.

Open Source with Apache 2 License

Hazelcast Jet is open source and available under the Apache 2 license.



Hazelcast Jet is elastic — it is able to dynamically rescale to adapt to the workload changes.

When the cluster extends or shrinks, running jobs can be automatically replanned to make use of all available resources.

Fault Tolerance

Hazelcast Jet is able to tolerate faults such as network failure, split or node failure.

When there is a fault, Jet uses the latest state snapshot and automatically restarts all jobs that contain the failed member as a job participant from this snapshot.

Exactly-Once or At-Least Once

Hazelcast Jet supports distributed state snapshots. Snapshots are periodically created to back up the running state. Periodic snapshots are used as a consistent point of recovery for failures. Snapshots are also taken and used for an up-scaling.

For snapshot creation, exactly-once or at-least-once semantics can be used. This is a trade-off between correctness and performance. It is configured per job.

Resilient Snapshot Storage

Hazelcast Jet uses the distributed in-memory storage to store the snapshots. This storage is integral component of the Jet cluster, no further infrastructure is necessary. Data are stored in multiple replicas distributed across the cluster to increase the resiliency.


Pipeline API

Pipeline API is the primary API of Hazelcast Jet. The general shape of any data processing pipeline is drawFromSource -> transform -> drainToSink. Pipeline API follows this pattern.

The Jet library contains a set of Transforms covering standard data operations (map, filter, group, join). Also there are Source and Sink adapters availabe. The library can be extended by custom Transforms, Sources or Sinks.

Visit the code sample repository to learn how to use Pipeline API.

Core API

Core API is a Jet’s low-level API that directly exposes the computation engine’s raw features (DAGs, partitioning schemes, vertex parallelism, distributed vs. local edges, etc.).

The purpose of Core API is to serve as the infrastructure on top of which to build high-level DSLs and APIs that describe computation jobs.

Use Core API for low-level control over the data flow, for fine-tuning performance or for building DSLs.

Source and Sink Connectors

Hazelcast Jet contains a library of connectors enabling Hazelcast Jet jobs to read from and write to various sources and sinks.

Hazelcast IMDG

Hazelcast Jet comes with readers and writers for Hazelcast distributed data structures: IMap, ICache and IList. For IMap and ICache, Jet can be used to read and process the stream of changes (event journal).

Jet can use structures on the embedded Hazelcast IMDG cluster and make use of data locality — each processor will access the data locally.


Batch and streaming file readers and writers can be used for either reading static files or watching a directory for changes.


Socket readers and writers can read and write to simple text based sockets.


Kafka Streamer is used to consume items from one or more Apache Kafka topics. Kafka Writer is used to write output data into Apache Kafka.


Hazelcast Jet can use HDFS as either data source or destination. If Jet and HDFS clusters are colocated, then Jet benefits from the data locality and processes the data from the same node without sending them over the network.

Log Writer

Log Writer is a sink which logs all items at the INFO level.


Reads or writes the data from/to relational database or another source that supports the standard JDBC API. Supports parallel reading for partitioned sources.


Streams messages from/to a JMS queue or a JMS topic using a JMS Client on a classpath.


Jet provides a convenient programming interface allowing you to write your own connectors for both batch and streaming.

Ready for Cloud Deployment

Run Hazelcast Jet in the Cloud

Cloud developers can easily drop Hazelcast Jet into their applications. Jet can work in many cloud environments and can be easily extended via Cloud Discovery Plugins to more.

Docker for Hazelcast Jet

Hazelcast Jet includes container deployment options for Docker.

Hazelcast Jet for Pivotal Cloud Foundry

Cloud Foundry users are now able to provision and scale Hazelcast Jet clusters — adding and removing nodes seamlessly as required. After downloading the Hazelcast Jet for PCF tile from the Pivotal Network website, the PaaS Operator simply uploads the tile into Pivotal Ops Manager, enters any required configuration into the GUI, and clicks ‘Install’. After installation, the PaaS Operator need have no further interaction with the tile.

Try Jet in 5 Minutes Jet Use Cases

Hazelcast Jet

Main Menu