Data Science Algorithms Journey

As you venture down the road with H2O or similar tools, once you’ve got past installing the software, the next problem is more than likely going to be: what data should I start with, and what algorithms should I use on that data?

H2O is very easy to install – I went with what was on GitHub:

git clone

Followed by Recipe 1, although the docs:runGenerateRESTAPIDocs step failed 😦  Once built, running is as easy as:

java -jar build/h2o.jar

Which then brings us to Flow via:


Which finally brings us to algorithms and data.  H2O offers a high-level overview of data preparation, machine learning and model confidence.

Data and insight go hand in hand.  In many ways, once you have some data, you’ll then identify the additional data you want to help you gain further insight.  This posting on Public Safety offers a number of thoughts on tackling the problem.  For example, you can imagine being interested in looking at sales activity over time, married to the highs and lows of the stock market, and maybe also the duration of the sales cycle, and how it’s influenced by the stock market.
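To make the sales-versus-market idea a little more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the figures are made up, and the names `sales`, `market_close`, `join_on_date` and `pearson` are hypothetical helpers, not part of H2O. It simply lines up the two series by date and measures how closely they move together.

```python
from datetime import date

# Hypothetical daily sales totals, keyed by date (made-up numbers).
sales = {
    date(2016, 3, 7): 120.0,
    date(2016, 3, 8): 135.0,
    date(2016, 3, 9): 128.0,
    date(2016, 3, 10): 160.0,
}

# Hypothetical market index closes (also made up); note the extra day
# that has no matching sales figure.
market_close = {
    date(2016, 3, 7): 2000.0,
    date(2016, 3, 8): 2010.0,
    date(2016, 3, 9): 2005.0,
    date(2016, 3, 10): 2030.0,
    date(2016, 3, 11): 2040.0,
}


def join_on_date(left, right):
    """Return (date, left_value, right_value) rows for dates present in both."""
    shared = sorted(left.keys() & right.keys())
    return [(d, left[d], right[d]) for d in shared]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rows = join_on_date(sales, market_close)
r = pearson([row[1] for row in rows], [row[2] for row in rows])
print(f"{len(rows)} matched days, correlation = {r:.3f}")
```

With real data you would probably load both series from files and hand the joined table to a tool such as H2O, but the join-then-compare step would look much the same.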

Generalized Low Rank Models look particularly interesting, especially this video.

If you have sales data, then following a similar path to the Customer Intelligence use case is probably an interesting starting point.  It’s a shame a sample data set isn’t provided for the use cases.

~ by mdavey on March 14, 2016.
