
Markov Chain Generator

Tags: Batch Processing, Aggregation, IMDG Storage, File Source, Pipeline API

This demo builds a Markov chain whose transition probabilities are derived from a supplied set of classic books.

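A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. In this demo the states are words, so the probability of the next word depends only on the current word.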

To build the Markov chain, Jet reads the books stored in text files and converts consecutive words into pairs. To compute the probability of finding word B after word A, one needs two counts: the number of pairs that have A as the first entry, and the number of those pairs that have B as the second entry. The probability is the second count divided by the first. Jet's aggregation features do the heavy lifting, and Jet runs the computation in parallel to make use of all available processor cores.
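The counting described above can be sketched in plain Java. This is a sequential illustration only and does not use the Jet Pipeline API. The class name `TransitionTableSketch`, the method `transitions`, and the tokenization rule (lowercase, split on anything that is not a letter or an apostrophe) are assumptions made for this example, not taken from the demo source.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the demo: counts consecutive word pairs
// sequentially and normalizes the counts into transition probabilities.
class TransitionTableSketch {

    // Returns: word A -> (word B -> probability that B directly follows A).
    static Map<String, Map<String, Double>> transitions(String text) {
        // Assumed tokenization: lowercase, split on non-letter, non-apostrophe runs.
        String[] words = text.toLowerCase().split("[^a-z']+");

        // Step 1: count each (previous, word) pair.
        Map<String, Map<String, Integer>> pairCounts = new HashMap<>();
        String previous = null;
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // split can yield a leading empty token
            }
            if (previous != null) {
                pairCounts.computeIfAbsent(previous, k -> new HashMap<>())
                          .merge(word, 1, Integer::sum);
            }
            previous = word;
        }

        // Step 2: divide each pair count by the total number of pairs
        // that share the same first word.
        Map<String, Map<String, Double>> result = new HashMap<>();
        for (Map.Entry<String, Map<String, Integer>> entry : pairCounts.entrySet()) {
            int total = 0;
            for (int count : entry.getValue().values()) {
                total += count;
            }
            Map<String, Double> probabilities = new HashMap<>();
            for (Map.Entry<String, Integer> next : entry.getValue().entrySet()) {
                probabilities.put(next.getKey(), next.getValue() / (double) total);
            }
            result.put(entry.getKey(), probabilities);
        }
        return result;
    }
}
```

For the input "the cat sat the cat ran the dog", the word "the" starts three pairs, two of which end in "cat", so the sketch assigns "cat" a probability of 2/3 after "the" and "dog" a probability of 1/3.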

The chain is then used to generate random sentences.
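One way to generate a sentence from such a chain is to start at a word and repeatedly pick a successor at random, weighted by its transition probability. The sketch below assumes the chain is available as a nested map from word to successor probabilities. The names `SentenceSketch`, `nextWord`, and `generate`, and the stopping rule (stop at a word limit or at a word with no successors), are hypothetical choices for this example and may differ from what the demo does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical helper, not part of the demo: walks a word-level Markov chain.
class SentenceSketch {

    // Picks a successor by cumulative-probability sampling.
    static String nextWord(Map<String, Double> successors, Random random) {
        double roll = random.nextDouble(); // uniform in [0, 1)
        double cumulative = 0.0;
        String last = null;
        for (Map.Entry<String, Double> entry : successors.entrySet()) {
            cumulative += entry.getValue();
            last = entry.getKey();
            if (roll < cumulative) {
                return last;
            }
        }
        // Fallback if rounding leaves the probabilities summing to slightly under 1.
        return last;
    }

    // Walks the chain from `start`, stopping after `maxWords` words
    // or when the current word has no recorded successors.
    static String generate(Map<String, Map<String, Double>> chain,
                           String start, int maxWords, Random random) {
        List<String> words = new ArrayList<>();
        String current = start;
        while (current != null && words.size() < maxWords) {
            words.add(current);
            Map<String, Double> successors = chain.get(current);
            current = (successors == null || successors.isEmpty())
                    ? null
                    : nextWord(successors, random);
        }
        return String.join(" ", words);
    }
}
```

Passing a seeded `Random` makes a run reproducible, which helps when comparing generated sentences across changes to the input books.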

DAG Visualization

Output
