As we’ve established earlier in this post series, Data Science is a process, with quite a lot of repetitive elements. Many Data Science projects involve a familiar set of tasks to identify, clean and prepare data, before finding the best model for the scenario at hand. And despite the mystique around the whole profession, many Data Scientists spend a lot of time complaining about all this repetitive work.
As I mentioned in my first post in this series, the central purpose of Data Science is to find patterns in data and use these patterns to make useful predictions about the future. It’s this predictive part of Data Science which gives the discipline its mystique; even though Data Scientists actually only spend a relatively small fraction of their time on this area compared to the more workaday activities of loading, cleaning and understanding the data, it’s the step of building predictive models which unlocks the value hidden within the data.