Learn about Cox Automotive UK's data science initiatives and the different types of data scientists. Discover the essential skills required to excel in the field.
Data Science in the Wild David Asboth & Shaun McGirr
About Cox Automotive UK Our mission: to transform the way the world buys, sells, and owns vehicles. With data.
About Cox Automotive UK • Valuation • Stock Monitoring & Alerts • “Data as a Service”
Data Solutions Structure • Data Engineers • Business Intelligence • Product Development • Valuations team • Data Science
Different Types of Data Science [Figure: speedometer graphic labelled Type B, Danger Zone, Type A] Icons made by Smashicons from https://www.flaticon.com, licensed by Creative Commons BY 3.0
Data Science Job Descriptions Type A • Data is a given • Focus is on new knowledge • Questions are clear • Measurable success (target) • Interpretability may or may not matter • Similar to a Kaggle competition Type B • Data is messy • Focus on helping decision making • Questions are ambiguous • Success is undefined • Interpretability is often important
Data Science Skills Type A • PhD in a numerical science • Knowledge of wide range of cutting edge machine learning • Deep mathematical understanding • Is a researcher at heart Type B • Experience with “real” data • Knows a few algorithms well • Understanding of business projects • Focused on pragmatic outcomes
Could you do your job just as well if your dataset was unlabelled?
Example 1: “How many blue cars will we sell tomorrow?”
Example 1: “How many blue cars will we sell tomorrow?” Type A Data Scientist: • Extract data from “previous sales” dataset • SELECT * FROM Sales WHERE colour = 'blue' etc. • Use machine learning to predict future • Regression problem? • Time series modelling? • Done
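The Type A steps above can be sketched in a few lines of Python. This is only an illustration, not the speakers' method: the rows are made up, the `(sale_date, colour)` layout is an assumed shape for the Sales data, and `daily_counts` and `naive_forecast` are hypothetical helper names. A moving average stands in for whatever regression or time series model would really be used.

```python
from collections import Counter
from datetime import date

# Made-up rows in an assumed (sale_date, colour) layout; not real sales data.
sales = [
    (date(2024, 1, 1), "blue"),
    (date(2024, 1, 1), "red"),
    (date(2024, 1, 1), "blue"),
    (date(2024, 1, 2), "blue"),
    (date(2024, 1, 3), "blue"),
    (date(2024, 1, 3), "blue"),
    (date(2024, 1, 3), "blue"),
]


def daily_counts(rows, colour):
    """Count sales per day where the colour matches exactly, like the slide's query."""
    return Counter(day for day, c in rows if c == colour)


def naive_forecast(counts, window=3):
    """Predict tomorrow as the mean of the most recent `window` days that have sales."""
    recent = [counts[d] for d in sorted(counts)[-window:]]
    return sum(recent) / len(recent)


blue_per_day = daily_counts(sales, "blue")
forecast = naive_forecast(blue_per_day)
print(forecast)
```

Note that days with no blue sales never appear in the counter, so this baseline silently skips them. That is the kind of detail the "data is a given" view tends to miss.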
Example: “How many blue cars will we sell tomorrow?” Type B Data Scientist: • Extract data from “previous sales” dataset • Oh…. • Let’s start by answering another question: How many blue cars did we sell yesterday? That depends…
Example: “How many blue cars did we sell yesterday?” Depends what you mean by: • Car • Blue • Out of 6,974 unique colours, 1,339 contain “blue” including: • Digital Blue • Blue Ambition • Blue/Green/Silver • Danish Blue • Yesterday • Sell
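A small Python sketch shows how much the answer depends on the definition of "blue". The first four colour names come from the slide; "Blue" and "Red" are added purely for illustration, and the three matching functions are hypothetical definitions, not the ones used at Cox Automotive.

```python
# First four names are from the slide; "Blue" and "Red" are illustrative additions.
colours = [
    "Digital Blue",
    "Blue Ambition",
    "Blue/Green/Silver",
    "Danish Blue",
    "Blue",
    "Red",
]


def is_exactly_blue(colour):
    """Strictest definition: the colour is the single word 'blue'."""
    return colour.strip().lower() == "blue"


def mentions_blue(colour):
    """Loosest definition: 'blue' appears anywhere in the name."""
    return "blue" in colour.lower()


def is_single_colour_blue(colour):
    """Middle ground: mentions blue and is not a multi-colour name."""
    return mentions_blue(colour) and "/" not in colour


counts = {
    "exact": sum(is_exactly_blue(c) for c in colours),
    "mentions": sum(mentions_blue(c) for c in colours),
    "single_colour": sum(is_single_colour_blue(c) for c in colours),
}
print(counts)
```

Each definition is defensible, and each gives a different count for the same rows, so the choice has to be agreed with whoever asked the question.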
Example: “How many blue cars will we sell tomorrow?” Type B Data Scientist: • Create (rather than extract data from) the “previous sales” dataset • Try to use machine learning to predict future • Iterate dataset as required • Done?
Example 2: Detecting Bots
Example 2: Bot Detection • You need an accurate count for the number of unique visitors • An estimated 80% of traffic is bots (spiders etc.) • Build a classifier to detect bots
Example 2: Bot Detection Type A: • Take data (which is a given) • Analyse the two classes • Build a classifier • Done • Company will use algorithm to count bots more accurately
Example 2: Bot Detection Type B: • Understand current manual process & business reasons for detecting bots • Take what data we can find • Classes labelled manually based on business assumptions • Relevant features are unknown and/or need to be calculated • Analyse the two classes • Build a classifier • Present to the stakeholders & work together on integration • Help them change how they do things using our findings • Done?
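The "features need to be calculated" and "labelled manually based on business assumptions" steps might look roughly like the sketch below. Everything here is assumed for illustration: the log layout, the made-up rows, the feature names, the `ExampleSpider` user agent, and the 0.5 requests-per-second threshold are not taken from the talk.

```python
from collections import defaultdict

# Hypothetical raw log rows: (visitor_id, user_agent, seconds_since_start).
requests = [
    ("a", "Mozilla/5.0", 0),
    ("a", "Mozilla/5.0", 40),
    ("b", "ExampleSpider/1.0", 0),
    ("b", "ExampleSpider/1.0", 1),
    ("b", "ExampleSpider/1.0", 2),
    ("b", "ExampleSpider/1.0", 3),
]


def build_features(rows):
    """Aggregate raw requests into one feature record per visitor."""
    times_by_visitor = defaultdict(list)
    agent_by_visitor = {}
    for visitor, agent, t in rows:
        times_by_visitor[visitor].append(t)
        agent_by_visitor[visitor] = agent
    features = {}
    for visitor, times in times_by_visitor.items():
        span = max(times) - min(times)
        features[visitor] = {
            "n_requests": len(times),
            # Guard against a zero-length span for single-request visitors.
            "requests_per_second": len(times) / max(span, 1),
            "agent_mentions_spider": "spider" in agent_by_visitor[visitor].lower(),
        }
    return features


def label_by_rule(f, max_rate=0.5):
    """Assumed business rule: a self-declared spider or a very fast visitor is a bot."""
    return f["agent_mentions_spider"] or f["requests_per_second"] > max_rate


features = build_features(requests)
labels = {visitor: label_by_rule(f) for visitor, f in features.items()}
human_visitors = sum(not is_bot for is_bot in labels.values())
print(labels, human_visitors)
```

Labels produced this way encode the business assumptions directly, so any classifier trained on them inherits those assumptions. That is one reason the stakeholder conversation comes before the modelling.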
Speedometer: Revisited [Figure: speedometer graphic labelled Type B and Type A]
Summary of skills to be a good Type B Data Scientist • Experience with dirty data • Statistics • Care about the wider context (the business and the data generating process) • Presentation and people skills
Final Thoughts If you: • Care about getting things done even in a messy world • Are excited by helping people make better decisions • Fancy yourself as an amateur philosopher Then there is a world of data science out there for you!
Questions? david.asboth@coxauto.co.uk shaun.mcgirr@coxauto.co.uk