As we established earlier in this series, Data Science is a process with a lot of repetitive elements. Many Data Science projects involve a familiar set of tasks: identifying, cleaning and preparing data before finding the best model for the scenario at hand. And despite the mystique around the profession, many Data Scientists spend a lot of time complaining about all this repetitive work. But any repetitive process is ripe for automation, and Data Science is no exception. Enter the field of “AutoML”.
A side effect of all the time I spend breathing the rarefied alpine air of the CDO community is that my SQL skills have become rather rusty. So I’ve been intrigued by the idea of using the code-generation capabilities of tools like ChatGPT and Bard to write SQL for me. But how good is the current crop of LLMs at creating SQL code that not only works, but also generates the insight you’re actually looking for? I decided to find out.