Data Lake: Which RDF Data Store?

A number of triple stores are listed here. GraphDB looks interesting, but I suspect the enterprise edition tips the scale towards an open source solution. In the world of big data, I suspect you really want SPARQL over Hadoop?

store data in a key value store in Hadoop [HBase], but then index it with our graph database so that we can do these SPARQL queries
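To make that quote concrete, here is a minimal sketch in plain Python. It assumes a dict standing in for HBase and a toy triple index standing in for the graph database; `TripleIndex`, `fetch`, the `ex:` predicates and the trade records are all made up for illustration, and the lookup only approximates a single SPARQL triple pattern rather than a real query engine.

```python
from collections import defaultdict

# Stand-in for the key-value store (HBase in the quote): row key -> raw record.
# The records are invented sample data.
kv_store = {
    "trade:1": {"instrument": "VOD.L", "qty": 100},
    "trade:2": {"instrument": "BP.L", "qty": 250},
}


class TripleIndex:
    """Toy graph index (hypothetical): holds (subject, predicate, object)
    triples whose subjects are row keys in the key-value store."""

    def __init__(self):
        self.by_predicate = defaultdict(set)

    def add(self, s, p, o):
        self.by_predicate[p].add((s, o))

    def subjects(self, p, o):
        """Answer the pattern `?s <p> <o>`, roughly one SPARQL triple pattern."""
        return sorted(s for s, obj in self.by_predicate.get(p, ()) if obj == o)


index = TripleIndex()
index.add("trade:1", "ex:tradedBy", "desk:rates")
index.add("trade:2", "ex:tradedBy", "desk:fx")
index.add("trade:1", "ex:bookedIn", "book:A")


def fetch(p, o):
    """Ask the index for matching row keys, then pull full records from the store."""
    return [kv_store[key] for key in index.subjects(p, o)]


print(fetch("ex:tradedBy", "desk:rates"))
```

The point of the split is that the bulky records stay in the key-value store, while the index only holds the relationships you want to query over.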

This is echoed by “Avoiding Three Common Pitfalls of Data Lakes”:

Smart data technologies substantially reduce the complexity of data lake implementations while accelerating the time-to-value they produce. The graph-based model and detailed descriptions of data elements they enable substantially enhance integration efforts, enabling business users to link data according to relevant attributes that provide pivotal context across data sources and business models. Resource Description Framework (RDF) graphs are designed to incorporate new data and sources without having to re-configure existing representations. The result is considerably decreased time to a more profound form of analytics, in which users can not only ask more questions more expediently than before, but also determine relationships and data context to issue ad-hoc queries for specific needs.

Although older, “Storing (and querying) RDF in NoSQL database managers” is still of interest:

The paper then describes the storage and querying of RDF using HBase with Jena for querying, HBase with Hive as the query engine (with Jena’s ARQ to parse the queries before converting them to HiveQL), CumulusRDF (Cassandra with Sesame), and Couchbase

Which leads to the following possible options:

  • RDF data store used to index your data lake
  • D2R on top of your data lake
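The second option can be pictured as a mapping layer that presents lake records as triples on demand, with no separate index to maintain. The sketch below is plain Python under stated assumptions: `MAPPING`, `row_to_triples`, `match`, the `ex:` predicates and the sample rows are all hypothetical, and the naive scan is only for illustration. D2R Server itself works against relational databases and rewrites SPARQL into SQL rather than scanning rows like this.

```python
# Hypothetical mapping: column name -> predicate (all names are made up).
MAPPING = {
    "instrument": "ex:instrument",
    "desk": "ex:tradedBy",
}


def row_to_triples(row_key, row, mapping=MAPPING):
    """Yield (subject, predicate, object) triples for the mapped columns of one row."""
    for column, predicate in mapping.items():
        if column in row:
            yield (row_key, predicate, row[column])


def match(rows, predicate, obj):
    """Scan the rows and return subjects satisfying `?s <predicate> <obj>`."""
    return sorted(
        s
        for key, row in rows.items()
        for s, p, o in row_to_triples(key, row)
        if p == predicate and o == obj
    )


# Invented sample data standing in for records in the lake.
lake = {
    "trade:1": {"instrument": "VOD.L", "desk": "rates", "qty": 100},
    "trade:2": {"instrument": "BP.L", "desk": "fx"},
}

print(match(lake, "ex:tradedBy", "fx"))
```

The trade-off against the first option is roughly freshness versus speed: nothing has to be kept in sync, but every query pays for the mapping at read time.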

Anyone got a view?

~ by mdavey on May 3, 2016.
