Data Science Archives - Lies, Damned Lies...

The Great SQL bot Bake Off: Comparing the big LLM beasts on SQL code generation

July 25, 2023

Reading Time: 9 minutes

A side-effect of all the time I spend breathing the rarified alpine air of the CDO community is that my SQL skills have become rather rusty. So I’ve been intrigued by the idea of using the code-generation capabilities of tools like ChatGPT and Bard to write SQL for me. But how good is the current crop of LLMs at creating SQL code that not only works, but generates the insight you’re actually looking for? I decided to find out.

Nasty, brutish and short: The life of the modern CDO

May 26, 2021 (updated October 26, 2021)

Reading Time: 6 minutes

The 2010s were a big decade for Chief Data Officers: from a standing start ten years ago, CDO has risen to become an indispensable C-suite role, with almost two-thirds of Fortune 500 organizations hiring one.

But the role of CDO, especially outside the US, is still poorly defined, and CDOs are frequently not set up for success within their organizations. Is the job a poisoned chalice?

Demystifying Data Science, Part V: AutoML

March 30, 2020 (updated April 20, 2020)

Reading Time: 7 minutes

As we’ve established earlier in this post series, Data Science is a process, with quite a lot of repetitive elements. Many Data Science projects involve a familiar set of tasks to identify, clean and prepare data, before finding the best model for the scenario at hand. And despite the mystique around the whole profession, many Data Scientists spend a lot of time complaining about all this repetitive work. But any repetitive process is ripe for automation, and Data Science is no exception. Enter the field of “AutoML”.

Demystifying Data Science, Part IV: Models and Machine Learning

July 2, 2019 (updated April 20, 2020)

Reading Time: 9 minutes

As I mentioned in my first post in this series, the central purpose of Data Science is to find patterns in data and use these patterns to make useful predictions about the future. It’s this predictive part of Data Science which gives the discipline its mystique. Data Scientists actually spend only a relatively small fraction of their time on this area compared to the more workaday activities of loading, cleaning and understanding the data, but it’s the step of building predictive models which unlocks the value hidden within the data.

Demystifying Data Science, Part III: Data Wrangling

October 7, 2018 (updated October 8, 2018)

Reading Time: 6 minutes

Ask any Data Scientist and they will tell you that the process of ‘wrangling’ (loading, understanding and preparing) data represents the lion’s share of their workload – often as much as 80%. However, that number is not as alarming as it may at first seem. To understand why, let me tell you about my living room.

Demystifying Data Science, Part II: Data Science vs Analytics

August 8, 2018

Reading Time: 4 minutes

Ask an Analyst, particularly a Digital Analyst, how they’d like to develop their career, and they are quite likely to tell you that they want to get into Data Science. But in fact the two disciplines (if they can even be described as separate disciplines) overlap considerably – some would even say completely. So what is the difference between Analytics and Data Science?

Demystifying Data Science, Part I: What is Data Science?

July 3, 2018 (updated August 20, 2020)

Reading Time: 5 minutes

There’s a lot of buzz about Data Science these days, and especially its super-cool subfield, Machine Learning. Data Scientists have become the unicorns of the tech industry, commanding astronomical salaries and an equal amount of awe (and envy) to go with them. Partly as a result of this, the field has developed something of a mystical aura – the sense that not only is it complex, it’s too complex to explain to mere mortals, such as managers or business stakeholders.

It’s true that mastery of Data Science involves many complex and specialized activities, but it’s by no means impossible for a non-Data Scientist to build a good understanding of the main building blocks of the field, and how they fit together.