Data Scientist vs. Data Engineer: What's the Difference?

DataEdge 2019

Kay Ashaolu

Goal: Provide accurate DNA matching and ethnicity results to users

My role as a Data Engineer

  • Built infrastructure to retrieve DNA info from the lab
  • Engineered a data pipeline that executes algorithms designed by Data Scientists at scale
  • Delivered results to downstream systems for display to users on the Ancestry website

Extract, Transform, Load
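
The three stages above can be sketched roughly as follows. This is a minimal illustration, not the pipeline described in the talk: the record fields (`kit_id`, `markers`), the in-memory list standing in for the lab feed, and the dictionary standing in for the downstream store are all hypothetical.

```python
def extract(lab_feed):
    """Extract: yield raw records from the (stand-in) lab feed."""
    for record in lab_feed:
        yield record


def transform(records, algorithm):
    """Transform: apply an algorithm supplied by the data scientists to each record."""
    for record in records:
        yield {"kit_id": record["kit_id"], "result": algorithm(record["markers"])}


def load(results, store):
    """Load: write each result to the (stand-in) downstream store; return the count."""
    count = 0
    for item in results:
        store[item["kit_id"]] = item["result"]
        count += 1
    return count


def run_pipeline(lab_feed, algorithm, store):
    """Chain the three stages; generators keep only one record in flight at a time."""
    return load(transform(extract(lab_feed), algorithm), store)
```

The algorithm is passed in as a parameter, which mirrors the split described in this talk: the engineer owns the pipeline, the scientist owns the function that runs inside it.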

The Data Scientist's role

  • Selected the data from users' DNA kits that is best suited for DNA matching and ethnicity results
  • Cleaned the data from the lab to ensure it is highly usable
  • Explored the data to find new insights
  • Found patterns to model the data based on those insights
  • Used the models to refine algorithms that inform new knowledge

Obtain, Scrub, Explore, Model, Interpret
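
As a toy walk-through of the five stages, assuming made-up numeric readings in which `None` marks a failed lab read, and a deliberately trivial "model" (the mean). Every name and value here is hypothetical and only illustrates how the stages hand off to one another.

```python
from statistics import mean


def obtain():
    """Obtain: fetch raw readings (hard-coded, hypothetical values)."""
    return [4.0, None, 6.0, 5.0, None, 9.0]


def scrub(raw):
    """Scrub: drop readings that failed in the lab."""
    return [x for x in raw if x is not None]


def explore(clean):
    """Explore: compute simple summary statistics."""
    return {"n": len(clean), "min": min(clean), "max": max(clean)}


def model(clean):
    """Model: the simplest possible model, a constant equal to the mean."""
    return mean(clean)


def interpret(summary, typical):
    """Interpret: turn the numbers into a statement a person can act on."""
    return f"{summary['n']} usable reads; typical value {typical:.1f}"


clean = scrub(obtain())
report = interpret(explore(clean), model(clean))
print(report)
```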

Division of Labor

Data Scientist

Focus on obtaining the right data, exploring that data, and building models and algorithms that help inform new insights

Data Engineer

Focus on implementing models and algorithms developed from insights to run at scale in a timely, robust, and efficient manner

Another Example: Kekoexchange

Tackling the mentorship problem

Goal: a better mentorship program for all

Scientist

Research says effective mentorship relationships are:

  • Short coaching relationships
  • Specific to a single, measurable goal
  • Paired under a common domain
  • Within a trusted network

Engineer

Build an application that implements the research insights:

  • User interface to gather people's domains of expertise and domains of growth
  • Data pipeline to pair people together
  • Resources to educate on finding a measurable goal
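
One way the pairing step could look, sketched under assumptions: the `Member` type, its field names, and the greedy first-match strategy are all hypothetical and are not taken from Kekoexchange. The sketch pairs a mentee with the first available mentor whose expertise overlaps the mentee's growth domains, which reflects the "paired under a common domain" insight.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    name: str
    expertise: frozenset  # domains this person can coach in
    growth: frozenset     # domains this person wants to grow in


def pair_members(members):
    """Greedy pairing: each mentee gets the first unused mentor sharing a domain.

    Returns a list of (mentor_name, mentee_name, domain) tuples.
    """
    pairs = []
    used_mentors = set()
    for mentee in members:
        for mentor in members:
            if mentor.name == mentee.name or mentor.name in used_mentors:
                continue
            shared = mentee.growth & mentor.expertise
            if shared:
                # Pick one domain deterministically so each pairing has a single focus.
                pairs.append((mentor.name, mentee.name, sorted(shared)[0]))
                used_mentors.add(mentor.name)
                break
    return pairs
```

Greedy matching is only the simplest choice and can leave people unpaired who a smarter matching would have placed. Which strategy works best is a research question, while running it reliably for every member is the engineering one.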

Separate Research From Implementation

Questions?