MPP: Big Data in Real Time through In-Memory Technology and Analytics
Reading Harnessing the Power of Big Data in Real Time through In-Memory Technology and Analytics on the train led to the following thoughts. Page 93 of the PDF covers finance and modelling environments (I assume risk?). With the push for more systems to become Near Real-Time (NRT), in-memory is clearly the chosen path. The question then becomes which product or open source solution will help along that path. GridGain is one such technology with offerings in this area. RBS, as blogged about previously, has gone down the Oracle Coherence route in the Big Data trade world – Operational Data Cache.
In the case of GridGain as a Big Data product, the product evolution has created three products – Compute Grid, Data Grid and Big Data (effectively a combination of the two other products). It’s also clear that the death of Hadoop is oversold in Big Data land.
Turning to GigaSpaces, “XAP 9.0 – Geared for Real-Time Big Data Stream Processing” touches on an important point that doesn’t often get talked about, the movement of data:
Once you’ve stored data in memory across multiple nodes, it’s important to achieve locality of data and processing. If you process an incoming event on one node, and need to read and update data on other nodes, your processing latency and scalability will be very limited. Achieving locality requires a solid routing mechanism, that will allow you to send your events to the node most relevant to them, i.e. the one that contains the data that is needed for their processing
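To make the locality point concrete, here is a minimal sketch of key-based routing in Python. It is not the XAP API or any vendor's implementation; the Node and Router classes, the "account" routing key and the event fields are all hypothetical names chosen for illustration. The idea is simply that the same key always hashes to the same node, so an event is processed where its data already lives.

```python
from collections import defaultdict
from zlib import crc32


class Node:
    """Hypothetical in-memory partition holding state for the keys routed to it."""

    def __init__(self, name):
        self.name = name
        self.positions = defaultdict(int)  # account -> running position

    def process(self, event):
        # Read and update touch local data only; no cross-node call is needed.
        self.positions[event["account"]] += event["quantity"]
        return self.positions[event["account"]]


class Router:
    """Hypothetical router: sends each event to the node that owns its key."""

    def __init__(self, nodes):
        self.nodes = nodes

    def node_for(self, key):
        # crc32 is stable across processes, unlike Python's built-in hash().
        return self.nodes[crc32(key.encode("utf-8")) % len(self.nodes)]

    def dispatch(self, event):
        # The account id is the assumed routing key for this sketch.
        return self.node_for(event["account"]).process(event)


nodes = [Node(f"node-{i}") for i in range(3)]
router = Router(nodes)
router.dispatch({"account": "ACC-1", "quantity": 100})
total = router.dispatch({"account": "ACC-1", "quantity": -40})
print(total)  # 60
```

Both events for ACC-1 land on the same node, so the second one sees the state left by the first. A plain modulo over the node count reshuffles most keys when a node is added or removed; real data grids typically use a fixed partition table or consistent hashing to limit that, which this sketch leaves out.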
Kognitio is another product that offers the ability to run in-memory analytics on Big Data. Interesting to see “Credit and risk management” mentioned, but with little detail. Anyone used Kognitio?