20 April 2020
This blog looks at the challenge of taking machine learning (ML) models from the data science notebook running on a laptop to integrating ML as a robust part of a live service. The Challenge: Productionising a machine learning model is partly an organisational and cultural challenge. The area is still extremely new and even experts…
14 June 2017
Tom Swann blogs about a recent project he worked on and how the development team communicated about design.
Conference highlights from San Jose
06 April 2016
Strata and Hadoop World is the world’s biggest and best conference on all aspects of the data economy. I had the pleasure of attending this year’s event in San Jose, and below you can find my thoughts on the major conference themes. Hadoop continues to mature: With each passing year, it becomes more difficult to pin down…
Data Science at Scale
01 July 2015
Apache Spark is one of the biggest reasons that data analytics is such an exciting area of work for technologists such as myself right now. It’s hugely popular, with the most active community of any open source big data project currently in development. So, this post is an overview of Spark, the problems that it solves…
Algorithms are not the point
23 February 2015
In this post, I’m turning my attention to the often misunderstood term “Data Science” and what it actually means to businesses today in a practical sense. I would argue that the lack of a clear understanding of the business applications of advanced analytical techniques is one of the biggest obstacles to their widespread adoption. From a delivery perspective there is the need for data…
30 September 2014
In recent years Hadoop has proven to be a disruptive technology in the world of data storage and processing. Its rise to prominence as an open source platform for performing “big data” analytics has shaken up how many companies think about the ways in which they can transform and derive value from their most important data assets…