The Build/Deploy Tax

•October 27, 2016 • 1 Comment

Projects often forget about the build/deploy tax.  These days it's thrown under the DevOps banner, and often left to the appointed DevOps individual to wade through and resolve whilst the rest of the team streams ahead adding new features to the product.

Unfortunately, build/deployment deserves the same treatment as a testing strategy: decide it at the start of a project, and evolve it along the road.

Specifically, the build/deploy tax is the time (waste) it costs the team from “git commit” to deployment of the application to an environment, e.g. development, UAT, integration, production.  This tax is often measured in terms of time wasted “waiting for the build/deployment process to complete”.

There are many factors that influence the build/deploy tax, a few of which are listed below:

  • Jenkins (CD server) hardware
  • Proximity of build assets, e.g. locations of the source code repository, Nexus, npm, Docker registry, and test/prod environments
  • Structure of source code repository and asset deployment

Often the above issues are hotly debated by the team, IT support, and finance.  Examples include:

  • Should source code live in a single repository, or in multiple repositories (e.g. one per microservice)?
  • Monolithic deployment, or deployment of only the services that have changed?
  • Leverage cloud services (e.g. DockerHub), or run everything within a separate environment (e.g. AWS)?

Unfortunately, there often isn’t a simple solution to many of these problems.  AWS may be right for running all your build/deployment infrastructure, but the last mile to Prod may still be “painful” due to Prod not being in AWS.  Maybe this is an acceptable cost?
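One way to keep the tax honest is simply to measure it: take the commit timestamp and the deploy-completion timestamp, and track the difference per release. A minimal sketch (the function name and the timestamps are illustrative, not from any particular tool):

```python
from datetime import datetime

def build_deploy_tax_minutes(commit_iso: str, deploy_iso: str) -> float:
    """Minutes elapsed from `git commit` to the deployment completing.

    Timestamps are ISO-8601 strings, e.g. from
    `git log -1 --format=%cI` and your CD server's deploy log.
    """
    commit = datetime.fromisoformat(commit_iso)
    deploy = datetime.fromisoformat(deploy_iso)
    return (deploy - commit).total_seconds() / 60.0

# A commit at 09:00 that finishes deploying at 09:42 costs 42 minutes.
print(build_deploy_tax_minutes("2016-10-27T09:00:00+00:00",
                               "2016-10-27T09:42:00+00:00"))
```

Chart that number over a sprint and the tax stops being invisible.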

Commodity Smart Contracts

•October 24, 2016 • Leave a Comment

Great to see another step forwards in real-world usage of blockchain: “Aussie Bank’s 7000-Mile Blockchain Experiment Could Change Trade”.

With this experiment, Commonwealth Bank of Australia, Wells Fargo and Brighann Cotton are aiming to reduce the mountain of paper used to track commodity deliveries around the world.

As port staff scan the bales, an update to an electronic contract will be triggered, transferring ownership of the goods and authorizing the release of payment. The deceptively-simple sounding process is only possible because digital-ledger technology encrypts and stores the parameters of the contract, ensuring all parties are working off the same synchronized version, which cannot be unilaterally altered or tampered with.

It’s also interesting to see Maersk and the IT University of Copenhagen undertaking an experiment with blockchain in the bills of lading space.

Software costs that companies don’t want to pay for

•October 19, 2016 • 2 Comments

Software costs money.  I realise this is obvious, but it’s more interesting to look at what companies are prepared to pay for, and what they are not.

Most companies are happy to pay for UI/UX work: they see what they get, and they perceive they are getting value because of the visual nature of the deliverable.

Similarly, paying for the creation of documents seems acceptable to most companies.

What companies don’t seem interested in paying for is:

  1. Continuous delivery pipeline creation (e.g. Jenkins pipeline), and the ongoing maintenance costs (tuning etc.)
  2. Server side architecture
  3. Integration
  4. Testing/performance
  5. Exploratory costs before work is started
  6. Refactoring to allow teams to maintain velocity

The first item above is shocking, because without a fast cycle time, companies are actually paying every day, or maybe even every hour.  That will impact velocity, which the company will pay for in some shape or form.
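For context, “pipeline creation” need not be a huge up-front cost. A minimal declarative Jenkinsfile sketch, where the stage names and shell commands are placeholders rather than any particular project’s setup:

```groovy
// Illustrative only: a minimal Jenkins declarative pipeline.
// The build tool and deploy script are placeholders.
pipeline {
    agent any
    stages {
        stage('Build')  { steps { sh './gradlew build' } }
        stage('Test')   { steps { sh './gradlew test' } }
        stage('Deploy') { steps { sh './deploy.sh uat' } }
    }
}
```

The ongoing cost is keeping stages like these fast as the codebase grows.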

Disappointing 😦

Data Science into Production

•October 19, 2016 • Leave a Comment

“What is hardcore data science—in practice?” is a great article on moving data science into production.  Figure 5 (image from the article, courtesy of Mikio Braun) nicely captures the two worlds, including the exploratory data science world, where “done” is a model that can be used in production.


The article is spot on with regards to the conclusion: don’t silo data scientists and engineers.  My preference is to have them all sitting together as part of one team 🙂

No QAs

•October 18, 2016 • Leave a Comment

Interesting article from a Scrum Master at Sky, “So we’re going “No QAs”. How do we get the devs to do enough testing?”.

I find it hard not to agree with a number of points in the article.  QAs, in my mind, are a get-out-of-jail card for engineering to throw code over the wall and run to the next story.  Further, it’s a great excuse to avoid writing tests, which a lot of engineers find boring.  Finally, it’s a great reason why TDD, and BDD (my preference), are largely ignored on a lot of projects, allowing engineers to assume the role of a “hacker”.

dedicated QAs allows developers to abdicate the responsibility for quality to someone else

“Hacker” engineers are what a lot of engineers are.  They don’t design the solution, or consider how to get to “done”.  They write the code, and refactor until, through basic minimal manual testing, the solution does the right thing.  They then write a few tests, commit, and in their minds they are done.  No QAs forces the engineering team to own responsibility and eat their own cut corners.

Without QAs, the devs have a vested interest in speeding up testing by automation (which pure testers don’t; why would they put themselves out of a job?)

The QA world has gone through various re-branding activities in the last few years, most recently the “Check and Test” re-brand.  At the end of the day, a lot of defects are either due to the engineering solution and a lack of appropriate engineering tests (unit, integration, etc.), or due to the problem not being fully defined during analysis/discussion of the story.

QAs usually want to hold up a build until every possible test has been run.  Unfortunately, this isn’t the real world anymore.  It’s not uncommon to make n releases a day, hour, or week.  Further, business demands can mean that you need to release with defects, as long as the defects aren’t business critical.  We should also remember that the world isn’t perfect, and with today’s technology we are never going to release defect-free software.  What is actually more important is having a continuous deployment pipeline that is efficient, so that if we find defects, we can release to resolve the issue.

Which brings us to defects.  Defects are defects when the acceptance criteria/tests fail, or the UI doesn’t look like the UX.  Are defects really defects when the story didn’t provide the appropriate information?  Or are they new features that emerged, and that continue to refine the solution?  QAs, in my view, are hung up on defect tracking.  Maybe some QAs should consider looking for work in Business Analysis land, or becoming engineers who can improve the standard of code delivered to production.  At the end of the day, the solution is only worth something when it’s in production and being used.  Until that time, it’s waste, with no payback.

Data Analytics Platform Reading

•September 29, 2016 • Leave a Comment

A few interesting articles recently worth a read, specifically around the (fast) data platform space:

  • A Guided Way To Manage Data In Motion For Streaming Applications
  • Swisscom Q/A On Choosing Scala And Spark For New Streaming Data Platform – particularly interesting, as it’s a real-world application.  Nice to see Kafka in the stack :), and good to see User Experience (UX) and data privacy touched on by the article
  • The Next Generation Data Science Toolkit
  • Spark ML Data Pipelines – provides the usual stats on data cleanup, plus a nice list of tools that readers may find useful, e.g. ActiveClean.  Word2Vec seems to be in a lot of conversations these days.  I’ve not used BlinkDB.
  • Why data is the new coal
  • At the bleeding edge of AI: Quantum grocery picking and transfer learning
  • PaddlePaddle – text classification system overview
  • The Barclays Data Science Hackathon: Using Apache Spark and Scala for Rapid Prototyping
  • DevOps and Big Data: Rapid Prototyping for Data Science and Analytics

Do you pass the “Joel Test” for Data Science?

•September 22, 2016 • Leave a Comment

Interested to know how many teams score well on the “Joel Test” for Data Science.

The number one killer question for any project is “Can new hires get set up in the environment to run analyses on their first day?”.  Most projects I’ve seen fail on this.  Docker may be your friend here 🙂
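One hedged sketch of what the Docker route might look like: bake the team’s analysis toolchain into a single image, so day one is one `docker run`.  The base image and package choices below are illustrative, not a recommendation for any particular stack:

```dockerfile
# Illustrative only: one image containing the team's analysis toolchain.
FROM python:3.5
RUN pip install jupyter pandas scikit-learn
WORKDIR /work
CMD ["jupyter", "notebook", "--ip=0.0.0.0", "--no-browser"]
```

A new hire then pulls the image and runs analyses immediately, instead of spending their first week installing dependencies.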

“Can predictive models be deployed to production without custom engineering or infrastructure work?” is a little ambiguous, but hits the nail on the head with regards to “Done” and getting into production to achieve a Return on Investment (ROI).