Semantic Data Lake

Continuing on the data lake road, Kafka Connect looks very interesting as a way to feed a data lake.  This leads to “Prototype of Data Processing Infrastructure”, and the usage of RDF and CumulusRDF.  RDF is interesting since it has gained uptake due to SPARQL and the semantic web.

Further searching of the web yields a very interesting article in the data analytics space, “Advanced Real-Time Healthcare Analytics with Apache Spark”.  Thankfully the article provides an appropriate architecture diagram, which nicely uses Apache Kafka 🙂  More interesting is that it's using ontologies:

The architecture is hybrid and also includes a production rule engine and an ontology reasoner. This is done in order to leverage existing clinical domain knowledge available from evidence-based clinical practice guidelines (CPGs) and biomedical ontologies like SNOMED. This approach complements machine learning algorithms’ probabilistic approach to clinical decision making under uncertainty. The production rule system can translate CPGs into executable rules which are fully integrated with clinical processes (workflows) and events. Drools supports both forward and backward chaining as well as the modeling of business processes (clinical workflows) with the business process modeling notation (BPMN). There are patterns for integrating rules and processes.
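
To make the forward chaining idea in that quote a little more concrete, here is a minimal sketch in plain Python. It is not Drools and does not use any Drools API; the fact names and the two rules are made up purely for illustration and are not clinical guidance. The idea is just that rules keep firing on known facts until nothing new can be derived.

```python
def forward_chain(facts, rules):
    """Naive forward chaining.

    facts: iterable of fact names (strings).
    rules: list of (premises, conclusion) pairs, where premises is a
           frozenset of fact names that must all be known for the rule to fire.
    Returns the set of all facts derivable from the starting facts.
    """
    known = set(facts)
    changed = True
    while changed:  # loop until a full pass adds nothing new
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules, invented for this sketch only.
RULES = [
    (frozenset({"fever", "elevated_heart_rate"}), "possible_infection"),
    (frozenset({"possible_infection", "low_blood_pressure"}), "escalate_to_clinician"),
]

derived = forward_chain({"fever", "elevated_heart_rate", "low_blood_pressure"}, RULES)
```

The second rule only fires because the first one has already added its conclusion, which is the chaining part. A real rule engine would use something far smarter than repeated full passes, but the data-driven direction is the same.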

Interestingly, there is a W3C Machine Learning Schema Community Group – RDF etc. There’s also a list of projects on the Machine Learning and Ontology Engineering site.

Moving on, we find “Hadoop, Triple Stores, and the Semantic Data Lake”:

“What we’re investing most of our time in now is the semantic data lake, where we store data in a key value store in Hadoop [Hbase], but then index it with our graph database so that we can do these SPARQL queries,”
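
A rough way to picture that split between storage and index is the toy sketch below. It assumes nothing about the vendor's actual implementation: a plain dict stands in for the key-value store (HBase in the quote), a few hash indexes over (subject, predicate, object) triples stand in for the graph database, and `match` is a crude stand-in for a single SPARQL triple pattern. The class name, keys and codes are all hypothetical.

```python
from collections import defaultdict


class SemanticLakeSketch:
    """Toy model: raw records in a key-value store, plus a triple index on top."""

    def __init__(self):
        self.records = {}           # key -> raw record (stand-in for the key-value store)
        self.triples = set()        # every (subject, predicate, object) seen so far
        self.by_s = defaultdict(set)
        self.by_p = defaultdict(set)
        self.by_o = defaultdict(set)

    def put(self, key, record):
        """Store the raw record and index each field as a triple.

        Note: no update or delete handling; re-putting a key keeps old triples.
        """
        self.records[key] = record
        for predicate, obj in record.items():
            triple = (key, predicate, obj)
            self.triples.add(triple)
            self.by_s[key].add(triple)
            self.by_p[predicate].add(triple)
            self.by_o[obj].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return triples matching the pattern; None acts as a wildcard."""
        result = set(self.triples)
        for index, value in ((self.by_s, s), (self.by_p, p), (self.by_o, o)):
            if value is not None:
                result &= index.get(value, set())
        return result


lake = SemanticLakeSketch()
lake.put("patient:1", {"diagnosis": "code:A", "ward": "north"})
lake.put("patient:2", {"diagnosis": "code:A", "ward": "south"})
lake.put("patient:3", {"diagnosis": "code:B", "ward": "north"})

# Roughly: SELECT ?patient WHERE { ?patient :diagnosis "code:A" }
hits = lake.match(p="diagnosis", o="code:A")
patients = sorted(subject for subject, _, _ in hits)

# The index answers the question; the key-value store still holds the full record.
full_records = [lake.records[key] for key in patients]
```

The point of the design is that the index only has to answer "which keys?", while the bulk data stays in the cheap, scalable store and is fetched by key afterwards.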

~ by mdavey on April 28, 2016.
