Hazelcast Jet Performance
Hazelcast Jet uses a combination of DAG computation model, in-memory, data locality, partition mapping affinity, SP/SC Queues and Green Threads to achieve very high performance. These key design decisions are explained below.
Word Count is the classic Big Data sample app and is often used to compare performance between systems. Jet 0.4 is faster than all other frameworks. See complete benchmark.
Streaming Word Count involves windowing and out-of-order data processing. The latency of Jet remains flat even under higher load. See complete benchmark.
DAG to Model Computations
Similar to recent Big Data frameworks, Hazelcast Jet uses the Directed Acyclic Graph (“DAG”) abstraction to model computations. However, Jet uses some novel approaches to achieve much higher speeds than existing engines.
In-Memory Data Locality
High speed and low latency are achieved by keeping both the computation and data storage in memory. We do this by combining Hazelcast Jet with the Hazelcast In-Memory Data Grid (“IMDG”) on the same servers. Depending on the use case, some or all of the data that Jet will process will be already in RAM and on the same machine as the computation – data locality.
Partition Mapping Affinity
Jet allows you to define an arbitrary object-to-partition mapping scheme on each edge. This allows reading in parallel with many threads from each partition and member and thus server. We can use this to harmonize and optimize throughput from other distributed data systems whether it be HDFS, Spark or Hazelcast – partition mapping affinity. Thus when performing DAG processing, local edges can be read and written locally without incurring a network call and without waiting.
Local edges are implemented with the most efficient kind of concurrent queue: Single-Producer, Single-Consumer (“SP/SC”) bounded queue. It employs wait-free algorithms on both sides and avoids volatile writes by using lazySet. All of these aspects are novel to Jet.
With Jet, the number of parallel instances of each vertex (called Processors) can be defined so that we can use all the cores even in the largest machines. With many cores and execution threads, the key to Hazelcast Jet performance is to smoothly coordinate these with cooperative multithreading. Hazelcast Jet uses “Green Threads” where cooperative processors run in a loop serviced by the same native thread. This leads to:
Practically zero cost of context switching. There is hardly any logic in the worker thread needed to hand over from one processor to the next.
(Next to) guaranteed core affinity. Processors don’t jump between threads and each thread is highly likely to remain pinned to a core. This means high CPU cache hit rate.
Insight into which processor is ready to run. We can inspect the processor’s input/output queues at very low cost to see whether it can make progress.
See the docs.
Jet uses Frames as the building blocks of the sliding windows (remember, that tumbling windows are just a special case of sliding windows). Frame covers a part of a stream of a size of a sliding step. When a record arrives, it is added to the respective frame. For each frame, just the rolling accumulator is stored instead of buffering all the items. When the window is closed, respective frames are combined and the computation is executed. This provides a trade-off between the smoothness of sliding and the cost of storage/computation.
See the docs.
There is a deduct function to optimize sliding window computations. When window slides, deduct just removes the trailing frame from the sliding window and adds the new one. This means two operations instead of recomputing the whole sliding window from underlying frames with every sliding step. This leads to flat capacity of Jet even with large windows.
See the docs.Try Jet in 5 Minutes Jet Use Cases