A couple of interesting articles worth a read:
Big Data is one of those buzzwords that, unless applied to specific business solutions, is often in the stratosphere on detail. In the financial arena there has always been a large amount of data in the house. Let’s start with a few real-world data insight solutions which, hopefully, the usual suspect sell-side organisations are working on too.
Straight Through Processing (STP) Efficiency
Architecting an STP solution usually comes about due to the need to remove numerous manual steps in a workflow that is traversed frequently within an organisation. An example may be RFQ pricing. Manually pricing an RFQ might be acceptable for a low throughput of requests, but as the request rate rises, hiring more salespeople at some point no longer scales, given the head count cost and the possibility of human mis-keying. Thus at some point RFQs need to be automated. Although automating the RFQ fixes one problem, it is hard to identify which areas should have budget allocated to improve processing, or to reduce/remove manual interaction, without understanding how the trade flows through post-trade, which steps a support team has to undertake manually, and how the latency of the flow varies with volume. Big data processing of the trade cycle could provide that insight to the business and direct the development budget.
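To make the trade-cycle idea concrete, here is a minimal sketch of the kind of question such processing could answer: how long does each post-trade step take, and how often is it touched manually? The event layout, step names and numbers are all hypothetical, and a real flow would be read from the organisation's own trade stores rather than an in-memory list.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trade lifecycle events:
# (trade_id, step, seconds since the RFQ arrived, handled manually?)
EVENTS = [
    ("T1", "rfq_priced", 0.0, False),
    ("T1", "booked", 2.0, False),
    ("T1", "confirmed", 62.0, True),
    ("T2", "rfq_priced", 0.0, False),
    ("T2", "booked", 4.0, False),
    ("T2", "confirmed", 124.0, True),
]


def step_stats(events):
    """Return {step: (mean seconds to reach the step, share handled manually)}."""
    by_trade = defaultdict(list)
    for trade_id, step, ts, manual in events:
        by_trade[trade_id].append((ts, step, manual))

    durations = defaultdict(list)
    manual_flags = defaultdict(list)
    for steps in by_trade.values():
        steps.sort()  # order each trade's events by timestamp
        # Pair each event with its predecessor to get the time spent per step.
        for (prev_ts, _, _), (ts, step, manual) in zip(steps, steps[1:]):
            durations[step].append(ts - prev_ts)
            manual_flags[step].append(manual)

    return {
        step: (mean(durations[step]), sum(manual_flags[step]) / len(manual_flags[step]))
        for step in durations
    }


STATS = step_stats(EVENTS)
```

In this made-up sample the slow, fully manual step is confirmation, which is exactly the sort of signal that would point budget at one part of the flow rather than another.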
Helping Salespeople Improve the Client Experience
Salespeople want clients to increase their trading frequency and volume – obviously. Salespeople can help to increase a client’s trading by understanding the client better, and thus providing the client with better direction and opportunity. If you think about a client’s interaction from a trading perspective, it’s either via voice or through an electronic trading platform. Further, there is pre-trade interaction: what a client reads influences the client’s decision process.
If the sell-side organisation could bring together everything about all clients, from their past trading activity, to the research and trade ideas the client has read, married to the voice discussions had with the bank, it should be possible to begin to understand the client better and to direct the client down paths that the client possibly hadn’t considered. Examples include suggesting new products because of the type of material the client had read over the last month, or improving the pricing to entice the client to accept an RFQ, rather than consistently seeing the client reject an RFQ or allow it to time out even though the client had attempted to trade consistently over the last n hours/days/weeks.
This brings us to the concept of a salesperson’s insight dashboard. Such a dashboard would deliver alerts in real-time to direct sales in how to interact with the client – either through suggested reading material, telephone calls to interact with and better understand the client, or simply by tuning the electronic trading application (think Single Bank Platform (SBP)) in real-time to enhance the user’s experience with the sell-side, delivering a stickier connection.
Apache Spark, Databricks and other such platforms can aid in delivering the big data components to empower your existing application, specifically by delivering a stream of data that leverages machine learning capabilities.
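As a rough illustration of one such alert, the sketch below flags clients whose most recent RFQs have all been rejected or allowed to time out. The outcome labels, the client names and the threshold of three misses are assumptions made for the example; a real dashboard would consume a stream of events rather than a list.

```python
from collections import Counter

# Hypothetical RFQ outcomes per client, oldest first.
RFQ_HISTORY = [
    ("client_a", "rejected"),
    ("client_a", "timed_out"),
    ("client_a", "rejected"),
    ("client_b", "accepted"),
    ("client_b", "rejected"),
]


def clients_to_alert(history, min_misses=3):
    """Return clients whose current run of non-accepted RFQs is at least min_misses."""
    misses = Counter()
    for client, outcome in history:
        if outcome == "accepted":
            misses[client] = 0  # an accepted RFQ resets the run
        else:
            misses[client] += 1  # rejected or timed out extends the run
    return sorted(client for client, run in misses.items() if run >= min_misses)


ALERTS = clients_to_alert(RFQ_HISTORY)
```

Here only client_a would surface on the dashboard, prompting sales to pick up the phone or revisit the pricing.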
Transaction Cost Analysis (TCA)
ITG has recently disclosed the work it’s doing around TCA and big data. Equities offers the ability to aggregate centralised quote data feeds or actionable quote feeds, which can form the basis of TCA and offer insight into trading costs. Equiduct MarketViewer is one such visualisation from an equities perspective that offers TCA features through data aggregation.
In the case of FX and institutional orders, ITG has leveraged dealers and new electronic communications networks to aid in constructing an order book, which I suspect would offer a similar view to the Equiduct MarketViewer. Leveraging both historical and real-time data ingestion, big data should be able to deliver appropriate answers to questions of where an order should be submitted, and the cost implications – pre-trade analysis.
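One common way to express trading cost is slippage against the mid price at order arrival, in basis points. The sketch below uses that measure to rank venues on past fills. It is an illustrative simplification, not ITG's or Equiduct's method, and the venue names and fills are invented.

```python
from statistics import mean

# Hypothetical historical fills per venue: (side, executed price, mid at arrival).
HISTORICAL_FILLS = {
    "venue_x": [("buy", 1.1002, 1.1000), ("sell", 1.0999, 1.1000)],
    "venue_y": [("buy", 1.1001, 1.1000), ("sell", 1.1000, 1.1000)],
}


def slippage_bps(side, exec_price, arrival_mid):
    """Cost versus arrival mid in basis points; positive means worse than mid."""
    sign = 1 if side == "buy" else -1
    return sign * (exec_price - arrival_mid) / arrival_mid * 10_000


def cheapest_venue(fills_by_venue):
    """Pick the venue with the lowest average historical slippage."""
    return min(
        fills_by_venue,
        key=lambda venue: mean(slippage_bps(*fill) for fill in fills_by_venue[venue]),
    )


BEST_VENUE = cheapest_venue(HISTORICAL_FILLS)
```

A real pre-trade model would also condition on order size, time of day and liquidity, but the shape of the question is the same: given history, where is this order likely to cost least?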
All of the above leads to a need to identify the data sources within an organisation, which then leads to a question of data quality prior to ingestion into the big data platform. Identification is probably the easier of the two. Data quality may mean that legacy applications need to be improved to resolve data generation issues, or that pre-processing to validate quality needs to occur before ingestion. There is also the issue of data holes, some of which might be filled by inference, whilst other holes may just need to be filled using classic software engineering.
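A pre-ingestion quality gate might look something like the sketch below: records that fail validation are set aside with their reasons, and one kind of hole (a missing notional) is filled by inference from price and quantity. The field names and rules are assumptions chosen for the example.

```python
# Hypothetical required fields for a trade record.
REQUIRED = ("trade_id", "client", "price", "quantity")


def validate(record):
    """Return a list of quality problems; an empty list means the record is clean."""
    problems = [f"missing {field}" for field in REQUIRED if record.get(field) is None]
    price = record.get("price")
    if price is not None and price <= 0:
        problems.append("non-positive price")
    return problems


def infer_notional(record):
    """Fill a missing notional from price * quantity; leave the record alone otherwise."""
    if record.get("notional") is None:
        return {**record, "notional": record["price"] * record["quantity"]}
    return record


def split_for_ingestion(records):
    """Separate clean records (with holes filled) from rejected ones."""
    clean, rejected = [], []
    for record in records:
        problems = validate(record)
        if problems:
            rejected.append((record, problems))
        else:
            clean.append(infer_notional(record))
    return clean, rejected


CLEAN, REJECTED = split_for_ingestion([
    {"trade_id": "T1", "client": "c1", "price": 2.0, "quantity": 10, "notional": None},
    {"trade_id": "T2", "client": None, "price": -1.0, "quantity": 5},
])
```

The rejected pile is where the "classic software engineering" comes in: those records point back at the upstream applications that need fixing.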
The VoltDB blog has a number of interesting articles on Corporate Data Architecture. The Fast Data stack is somewhat different from the lambda architecture that has been discussed on various blogs in the context of real-time big data. Clearly Fast Data is built around the VoltDB database and reflects VoltDB’s view of the world. However, Summingbird and other such stacks also have merit, as do Samza and Kafka.
Let’s assume I want to serve knowledge and analytics from ingesting web site click data, pricing, execution and trade data, to power a salesperson’s trading cockpit. What should/could I use?
I’m back to coding, but this time in native Neo4j. The aim of modelling teams, and the timelines of teams, is to be able to ask the system “What is the best team for x?”. Which leads to a recent article on Moneyball.
Billy Beane analyzed the data on players’ performance and used the models’ outputs to select players
Food for thought
Ian’s talk on TDD: Where Did It All Go Wrong? is worth a watch. The problem Ian outlines is, I suspect, known to a large percentage of development teams.
Avoid testing implementation details, test behaviors
– A test-case per class approach fails to capture the ethos of TDD. Adding a new class is not the trigger for writing tests. The trigger is implementing a requirement.
– Test outside-in (though I would recommend using ports and adapters and making the ‘outside’ the port), writing tests to cover the use cases (scenarios, examples, GWTs etc.)
– Only write tests to cover the implementation details when you need to better understand the refactoring of the simple implementation we start with.
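A small sketch of what testing behaviour at the port could look like, in Python's unittest. The quote_rfq function, the InMemoryRates adapter and the numbers are all hypothetical; the point is that the test states a requirement (the quote straddles mid by the requested spread) and would survive any refactoring of how the quote is computed.

```python
import unittest


class InMemoryRates:
    """Hypothetical adapter standing in for a real rates source."""

    def __init__(self, rates):
        self._rates = rates

    def mid(self, pair):
        return self._rates[pair]


def quote_rfq(pair, spread, rates):
    """The port under test: return (bid, ask) around the current mid."""
    mid = rates.mid(pair)
    half = spread / 2
    return (mid - half, mid + half)


class QuoteRfqBehaviour(unittest.TestCase):
    """Tests the requirement, not the classes that happen to implement it."""

    def test_quote_straddles_mid_by_the_spread(self):
        rates = InMemoryRates({"EURUSD": 1.1000})
        bid, ask = quote_rfq("EURUSD", 0.0002, rates)
        self.assertLess(bid, ask)
        self.assertAlmostEqual(ask - bid, 0.0002)
        self.assertAlmostEqual((bid + ask) / 2, 1.1000)
```

Saved in a file, this can be run with `python -m unittest`. Note there is no test for InMemoryRates itself; it is exercised only through the behaviour that needs it.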
iMogital states quite simply that Scrum/Kanban will not tell your developers how to develop. XP has 12 practices derived from software engineering best practices – TDD, Pair Programming and Coding Standards being a few. CodeCentric touch on the same issue. I suspect continuous integration should really be continuous delivery these days. Net out, if your development teams are not familiar with XP, maybe it’s finally time to help them get educated, and move down the XP road.
InfoQ has a recent article on engineering practices, offering the following equation:
Scrum + Extreme Programming (XP) = Agile
One of the key call-outs from the InfoQ article is that good engineers don’t drop best practices in high stress situations. It’s unfortunate that often when there is a time crunch, testing is the first thing to get dropped. Engineers often perceive that, due to time pressure, feature complete at any cost is acceptable. Unfortunately, often the team hasn’t asked the hard question back to management/stakeholders – you often can’t have your cake and eat it.
SAFe’s overview of the agile release train is worth reading. Release trains aren’t new; they have been around for years in both agile and non-agile organisations. Release trains aid in a consistent push to production, and work around certain black-out dates when production can’t be touched e.g. Christmas, end of month etc. Sometimes certain train dates are also aligned to key business deliverables which, although maybe anti-agile in some ways, are relevant from a commercial perspective.
One item that is often not well discussed around agile iterations is pre-iteration dependencies. Specifically, these occur around interface boundaries and the user interface (particularly if it’s complex). Complex interactions can clearly be broken into smaller stories, but sometimes there are dependencies and assets (business workflows etc.) that may be required in preparation prior to entering the development iteration.