SPARQL Data Platform

Given the various postings on SPARQL recently, I thought it worth noting down the various data platform options I’ve considered:

  1. For pure PoC’ing, MySQL using the file import facility in MySQLWorkbench, running D2RQ to provide SPARQL access.  Simple, and easy to set up.
  2. For more of a Hadoop platform, HBase with Apache Phoenix offering a JDBC driver, again allowing D2RQ to be used as the SPARQL access layer.
  3. Apache Marmotta, in many ways an improvement on Option 1 above, since it sits on top of standard database technology.
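Whichever option you pick, the client side looks the same: a query sent to a SPARQL endpoint over HTTP. A minimal sketch, assuming a D2RQ endpoint at its default local address (the URL is an assumption for illustration, not something from a real deployment):

```python
import urllib.parse

# Hypothetical local endpoint, e.g. D2RQ fronting MySQL (Option 1).
ENDPOINT = "http://localhost:2020/sparql"

# List ten triples, enough to confirm the relational data shows up as RDF.
query = """SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10"""

# Per the SPARQL Protocol, a GET request carries the query in the
# 'query' parameter; urlencode handles the percent-encoding.
url = ENDPOINT + "?" + urllib.parse.urlencode(
    {"query": query, "format": "application/sparql-results+json"}
)

# urllib.request.urlopen(url) would return the JSON result bindings;
# the actual request is left out so the sketch runs without a live endpoint.
print(url)
```

The same request works unchanged against Options 2 and 3, which is the point of standardising on SPARQL as the access layer.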

Option 1 is probably the quickest to move forwards with, once you’ve become annoyed with accessing corporate data that is spread across n systems, and you’re still in Machine Learning Discovery land 🙂  If you’ve used Apache Marmotta, or have time to set it up and learn the platform, Option 3 may be a better bet.

Option 2 is probably the production version, or at least a stab in the right direction, as it offers improvements on scaling, coupled with Hadoopness 🙂

Where’s all this going?  “SPARQL with R in less than 5 minutes” provides a quick and interesting read on the power of SPARQL.  If you’re building a data lake without a foundation (ontology), you may be missing a trick.
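Part of what makes the R post so quick is that SPARQL results arrive in the standard JSON results format, which flattens into tabular rows in a few lines of any language. A minimal Python sketch of that flattening, using a made-up result payload (the systems and counts below are invented for illustration):

```python
import json

# Invented example payload in the standard SPARQL Query Results JSON shape:
# "head" lists the variables, "bindings" holds one dict per solution.
payload = """
{
  "head": {"vars": ["system", "records"]},
  "results": {"bindings": [
    {"system":  {"type": "literal", "value": "CRM"},
     "records": {"type": "literal", "value": "1200"}},
    {"system":  {"type": "literal", "value": "Billing"},
     "records": {"type": "literal", "value": "3400"}}
  ]}
}
"""

results = json.loads(payload)
variables = results["head"]["vars"]

# Flatten each binding into a plain row: variable name -> bound value.
rows = [
    {v: b[v]["value"] for v in variables if v in b}
    for b in results["results"]["bindings"]
]
print(rows)
```

From there it is one step into a data frame, which is exactly the trick the R post is selling.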

Interested in anyone else’s options.

~ by mdavey on May 10, 2016.
