Hazelcast Jet Performance
Jet uses a combination of DAG computation model, in-memory, data locality, partition mapping affinity, SP/SC queues and Green Threads to achieve very high performance. These key design decisions are explained below.
DAG to Model Computations
Similar to recent Big Data frameworks, Hazelcast Jet uses the Directed Acyclic Graph (“DAG”) abstraction to model computations. However, Jet uses some novel approaches to achieve much higher speeds than existing engines.
In-Memory Data Locality
High speed and low latency are achieved by keeping both the computation and data storage in memory. We do this by combining Hazelcast Jet with the Hazelcast In-Memory Data Grid (“IMDG”) on the same servers. Depending on the use case, some or all of the data that Jet will process will be already in RAM and on the same machine as the computation – data locality.
Partition Mapping Affinity
Jet allows you to define an arbitrary object-to-partition mapping scheme on each edge. This allows reading in parallel with many threads from each partition and member and thus server. We can use this to harmonise and optimise throughput from other distributed data systems whether it be HDFS, Spark or Hazelcast – partition mapping affinity. Thus when performing DAG processing, local edges can be read and written locally without incurring a network call and without waiting.
Local edges are implemented with the most efficient kind of concurrent queue: Single-Producer, Single-Consumer (“SP/SC”) bounded queue. It employs wait-free algorithms on both sides and avoids volatile writes by using lazySet. All of these aspects are novel to Jet.
Vertexes are implemented by one or more instances of `Processor` on each member. Each vertex can specify how many of its processors will run per cluster member using the `localParallelism` property so that we can use all the cores even in the largest machines. With many cores and execution threads, the key to Hazelcast Jet performance is to smoothly coordinate these with cooperative multithreading is much lower context-switching cost and precise knowledge of the status of a processor’s input and output buffers, which determines its ability to make progress.
Finally Hazelcast Jet uses “Green Threads” to allow very high throughput where cooperative processors run in a loop serviced by the same native thread. The advantages of this approach are:
- Practically zero cost of context switching. There is hardly any logic in the worker thread needed to hand over from one processor to the next.
- (Next to) guaranteed core affinity. Processors don’t jump between threads and each thread is highly likely to remain pinned to a core. This means high CPU cache hit rate.
- Insight into which processor is ready to run. With native threads, a processor would have to use a blocking concurrent queue, or at best some busy-spin-then-backoff strategy, leaving it to the OS to decide when a thread that backed off has more work to do. We can inspect the processor’s input/output queues at very low cost to see whether it can make progress.
Word Count Benchmark
Word Count is the classic Big Data sample app and is often used to compare performance between systems. Jet 0.4 is faster than all other frameworks.Try Jet in 5 minutes Jet Use Cases