Data Scientist vs Data Engineer: What's The Difference?

DataEdge 2019

Kay Ashaolu

Goal: Provide accurate DNA matching and ethnicity results to users

My role as a Data Engineer

  • Built infrastructure to retrieve DNA info from the lab
  • Engineered a data pipeline that executes algorithms designed by Data Scientists at scale
  • Delivered results to downstream systems for display to users on the Ancestry website

Extract Transform Load
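
As a rough illustration of the Extract, Transform, Load pattern named above, here is a minimal sketch. The CSV layout, the `call_rate` field, the 0.95 threshold, and the in-memory dict standing in for a downstream store are all assumptions made for this example, not details of the pipeline described in the talk.

```python
import csv
import io

# Hypothetical raw lab export: one row per DNA kit (illustrative data only).
RAW_LAB_CSV = """kit_id,call_rate
K1,0.99
K2,0.80
K3,0.97
"""


def extract(raw_text):
    """Extract: parse rows out of the raw lab export."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows, min_call_rate=0.95):
    """Transform: convert types and keep kits above an assumed quality threshold."""
    cleaned = []
    for row in rows:
        call_rate = float(row["call_rate"])
        if call_rate >= min_call_rate:
            cleaned.append({"kit_id": row["kit_id"], "call_rate": call_rate})
    return cleaned


def load(rows, store):
    """Load: write rows to a downstream store (a dict stands in for a database)."""
    for row in rows:
        store[row["kit_id"]] = row
    return store


def run_etl(raw_text):
    """Run the three stages in order and return the populated store."""
    return load(transform(extract(raw_text)), {})


results = run_etl(RAW_LAB_CSV)
print(sorted(results))
```

Keeping each stage a separate function means any one of them can be swapped out (a different source, a stricter filter, a real database) without touching the others.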

The Data Scientist's role

  • Responsible for selecting the data from users' DNA kits that is best suited for DNA matching and ethnicity results
  • Found ways to clean the data from the lab to ensure it is highly usable
  • Explored the data to find new insights
  • Found patterns to model the data based on those insights
  • Used models to refine algorithms that inform knowledge

Obtain Scrub Explore Model Interpret
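
The five steps above can be sketched as a chain of small functions. This is a toy illustration only: the readings are made up, and the "model" is simply the mean of the cleaned values, chosen as a stand-in for whatever modeling a Data Scientist would actually do.

```python
from statistics import mean


def obtain():
    """Obtain: hypothetical readings; None marks a missing value."""
    return [2.0, None, 4.0, 6.0, None, 8.0]


def scrub(values):
    """Scrub: drop missing readings."""
    return [v for v in values if v is not None]


def explore(values):
    """Explore: summarize the cleaned data."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


def model(values):
    """Model: a trivial stand-in that uses the mean as a baseline predictor."""
    return mean(values)


def interpret(baseline, summary):
    """Interpret: turn the numbers into a statement a person can act on."""
    return (
        f"Baseline {baseline:.1f} from {summary['count']} clean readings "
        f"(range {summary['min']}-{summary['max']})"
    )


clean = scrub(obtain())
summary = explore(clean)
baseline = model(clean)
print(interpret(baseline, summary))
```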

Division of Labor

Data Scientist

Focus on obtaining the right data, exploring that data, and building models and algorithms that help inform new insights

Data Engineer

Focus on implementing models and algorithms developed from insights to run at scale in a timely, robust, and efficient manner

Another Example: Kekoexchange

Tackling the mentorship problem

Goal: a better mentorship program for all

Scientist

Research says effective mentorship relationships are:

  • Short coaching relationships
  • Specific to a single, measurable goal
  • Paired under a common domain
  • Within a trusted network

Engineer

Build an application that implements research insights:

  • User interface to gather people's domains of expertise and domains of growth
  • Data pipeline to pair people together
  • Resources to educate on finding a measurable goal
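
The pairing step of such a data pipeline could be sketched as below. This is a hypothetical greedy matcher, not Kekoexchange's actual implementation: the `Member` fields, the one-mentee-per-mentor rule, and the sample members are all assumptions made to illustrate pairing under a common domain within a trusted network.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    network: str  # the trusted network the member belongs to
    expertise: set = field(default_factory=set)  # domains they can coach in
    growth: set = field(default_factory=set)  # domains they want to grow in


def pair_members(members):
    """Greedily pair each mentee with the first free mentor who is in the
    same network and has expertise in one of the mentee's growth domains."""
    pairs = []
    busy = set()  # names of mentors who already have a mentee
    for mentee in members:
        for domain in sorted(mentee.growth):
            mentor = next(
                (
                    m
                    for m in members
                    if m.name != mentee.name
                    and m.name not in busy
                    and m.network == mentee.network
                    and domain in m.expertise
                ),
                None,
            )
            if mentor is not None:
                pairs.append((mentor.name, mentee.name, domain))
                busy.add(mentor.name)
                break  # one domain per mentee keeps the relationship focused
    return pairs


members = [
    Member("Ada", "acme", {"sql"}, {"management"}),
    Member("Ben", "acme", {"management"}, {"sql"}),
    Member("Cy", "other", {"sql"}, {"sql"}),
]
print(pair_members(members))
```

A greedy pass is the simplest option; a production matcher would more likely score candidate pairs and solve an assignment problem, but the inputs (expertise, growth, network) would stay the same.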

Separate Research From Implementation

Questions?

Data Scientist vs Data Engineer - What's the Difference?

By kayashaolu

In this talk, we will demystify the perceived interchangeability of Data Scientists and Data Engineers. The two roles are distinct, and both are critical to the success of any Big Data project. However, because the two fields share a limited set of skills, organizations and companies at times assign Data Scientists and Data Engineers the same tasks. This practice can add significant risk to a project. We will go over a few examples of how Data Scientists and Data Engineers work together to build a product.