Kappa Architecture: Bayesian Online-Learning Model


“Applying the Kappa architecture in the telco industry”, from Strata + Hadoop World London 2016, is a great read. A large number of companies don’t really have a data volume problem, but they do have “shorter time constraints, and an increasing need for accuracy” when trying to drive business transformation through the use of AI.

The first statement that stands out from this article is:

“Bayesian online-learning model to detect novelties”

From a corporation’s standpoint, there are many use cases involving customer and employee data sets that could benefit from novelty detection.
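To make the idea concrete, here is a minimal sketch of one way a Bayesian online-learning novelty detector could work. This is not the model from the article (which doesn’t give those details). It assumes a single numeric metric that is roughly Gaussian, keeps a Normal-Inverse-Gamma posterior over its unknown mean and variance, and flags a value as novel when it sits far from the posterior predictive centre. The class name, the prior values and the `threshold` of 4.0 are all hypothetical choices for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class OnlineGaussianNoveltyDetector:
    """Normal-Inverse-Gamma posterior over an unknown mean and variance.

    Illustrative only: priors and threshold are assumptions, not values
    taken from the article.
    """
    mu: float = 0.0         # prior guess for the mean
    kappa: float = 1.0      # pseudo-observations backing that guess
    alpha: float = 1.0      # shape of the variance prior
    beta: float = 1.0       # scale of the variance prior
    threshold: float = 4.0  # hypothetical cut-off, in predictive scale units

    def score(self, x: float) -> float:
        """Distance of x from the predictive centre, in predictive scale units.

        The posterior predictive is a Student-t with location mu and
        squared scale beta * (kappa + 1) / (alpha * kappa).
        """
        scale_sq = self.beta * (self.kappa + 1.0) / (self.alpha * self.kappa)
        return abs(x - self.mu) / math.sqrt(scale_sq)

    def update(self, x: float) -> None:
        """Conjugate update for one observation (beta uses the old mu and kappa)."""
        self.beta += self.kappa * (x - self.mu) ** 2 / (2.0 * (self.kappa + 1.0))
        self.mu = (self.kappa * self.mu + x) / (self.kappa + 1.0)
        self.kappa += 1.0
        self.alpha += 0.5

    def observe(self, x: float) -> bool:
        """Score x against what has been learned so far, then learn from it."""
        is_novel = self.score(x) > self.threshold
        self.update(x)
        return is_novel


# Usage: one event at a time, no batch retraining step.
detector = OnlineGaussianNoveltyDetector(mu=10.0)
for value in [10.1, 9.9, 10.0, 10.2, 9.8, 25.0]:
    if detector.observe(value):
        print(f"novelty: {value}")
```

The point of the online formulation is that each event updates the model in constant time, which fits a streaming pipeline. Here every value is learned from, including flagged ones, so a genuine shift in behaviour eventually becomes the new normal; skipping the update for flagged values is the other obvious design choice.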

The Kappa architecture is a nice twist on the Lambda architecture that I’ve blogged about previously. Streaming is the only way these days, especially with the evolution of Apache Spark Streaming – “batch processing is a subset of stream processing.”

“Immutable data sources” and raw data relate nicely to the ELT view of the world that I’ve blogged about elsewhere recently.
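A rough sketch of how those two ideas fit together, as I understand the Kappa approach: raw events go into an append-only log, a job is just a function folded over that log, and “batch” reprocessing is the same job replayed from offset 0. Everything here is hypothetical and in-memory; `ImmutableLog` is a stand-in for a topic in something like Kafka, not Kafka’s API, and the event fields are made up.

```python
from typing import Callable, Dict, Iterator, List

Event = Dict[str, object]
State = Dict[str, int]


class ImmutableLog:
    """In-memory stand-in for an append-only topic (hypothetical, not Kafka's API)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        # Store a copy so later changes by the caller cannot alter the log.
        self._events.append(dict(event))
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset: int = 0) -> Iterator[Event]:
        # Hand out copies so consumers cannot alter the log either.
        for event in self._events[offset:]:
            yield dict(event)


def run_job(log: ImmutableLog, step: Callable[[State, Event], State],
            state: State, offset: int = 0) -> State:
    """Fold a processing step over the log, starting at the given offset."""
    for event in log.read_from(offset):
        state = step(state, event)
    return state


def count_calls(state: State, event: Event) -> State:
    """Job v1: number of calls per user."""
    user = str(event["user"])
    return {**state, user: state.get(user, 0) + 1}


def total_seconds(state: State, event: Event) -> State:
    """Job v2: total call duration per user, derived from the same raw events."""
    user = str(event["user"])
    return {**state, user: state.get(user, 0) + int(event["seconds"])}


log = ImmutableLog()
log.append({"user": "a", "seconds": 30})
log.append({"user": "b", "seconds": 20})
live = run_job(log, count_calls, {})            # stream processing so far
log.append({"user": "a", "seconds": 15})
live = run_job(log, count_calls, live, offset=2)  # continue from where we left off
replayed = run_job(log, total_seconds, {})        # "batch" = replay from offset 0
```

Because the raw events are never changed, a new version of the logic (`total_seconds` here) needs no separate batch code path: it is simply run over the same log from the beginning.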

The article unfortunately doesn’t say which ML library was used for the Bayesian anomaly detector 😦

Interesting to see that Apache Flink was used and not Apache Spark. Nice to see Apache Kafka as the data conduit 🙂

 

~ by mdavey on June 8, 2016.
