Introduction to Machine Learning

For Designers, Engineers & Product Managers

by: Preston Parry

Today's Tour

  • What is ML?
    • ML is a huge field: where are we?
  • Regression Problems
  • Classification Problems
  • How do I need to structure my problem?
    • Data size
  • What technical knowledge do I need?
  • Next Steps

Introductions!

  • Name
  • Group
  • Corny Icebreaker
    • If you had to be an animal, what animal would you NOT be?
  • ML Background
  • Ideas for using ML?

What is ML?

  • Advanced pattern recognition in datasets
  • Applying learned historical patterns to new data to get predictions
  • Bank lender analogy
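
A minimal sketch of the bank lender analogy, assuming made-up loan history and a hypothetical "credit band" column: learn how often past loans in each band were repaid, then apply that pattern to a new applicant.

```python
from collections import defaultdict

# Made-up historical loans: (credit band, was the loan repaid?)
history = [
    ("low", False), ("low", False), ("low", True),
    ("medium", True), ("medium", False), ("medium", True), ("medium", True),
    ("high", True), ("high", True), ("high", True),
]


def learn_repayment_rates(loans):
    """Learn the pattern: the share of past loans repaid in each band."""
    repaid = defaultdict(int)
    total = defaultdict(int)
    for band, was_repaid in loans:
        total[band] += 1
        repaid[band] += int(was_repaid)
    return {band: repaid[band] / total[band] for band in total}


def predict_repayment(rates, band):
    """Apply the learned pattern to a new applicant in a known band."""
    return rates[band]


rates = learn_repayment_rates(history)
print(predict_repayment(rates, "medium"))  # 0.75
```

Real models learn from many columns at once, but the shape is the same: history in, pattern learned, prediction out.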

ML use cases

  • Recommendations (see Mitchell & Padigender's latest article, and Raghav's creative use of Store2Vec)
  • Search
  • Image recognition
  • Autonomous vehicles
  • Playing video games
  • Customer segmentation
  • Outlier detection for signal processing (windfarms + factories + jet engines)
  • Speech processing
  • Unsupervised Learning
  • Deep Learning
  • NLP (Natural Language Processing)

Our use case: Analytics + Predictions

  • Analytics: understanding the complex relationships in our datasets to make better business decisions
  • Predictions: Imagine you had a chance to see into the future and know something about an order
  • Focused on standard classification & regression problems

Regression Problems

  • Predicting a numerical value
  • Predicting how much a house sold for
  • Values that can be measured in dollars or seconds or action counts or any other normal number from negative one billion to the number of stars in the universe
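
A rough illustration of a regression problem, assuming made-up house sales and a single hypothetical feature (square feet). It fits a straight line by ordinary least squares, which is only one of many regression techniques.

```python
# Made-up toy data: (square feet, sale price in dollars)
sales = [(1000, 200_000), (1500, 290_000), (2000, 410_000), (2500, 500_000)]


def fit_line(points):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sales)

# The prediction is a number, not a category.
predicted_price = slope * 1800 + intercept
print(round(predicted_price))  # 360200
```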

Classification Problems

  • Predicting a category (even a simple yes/no), or a probability
  • Predicting whether or not a house did sell
  • Anytime the outcome can be described as one of X things, you're categorizing the outcomes
  • Examples:
    • Types of support tickets
    • Cuisine category
    • Borrower category (subprime, average, prime, etc.)
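
A small sketch of a classification problem, assuming made-up listings and two hypothetical features. It uses a k-nearest-neighbors vote (one simple technique among many) to return both a probability and a yes/no category.

```python
# Made-up toy data: (days on market, price cut in percent, did it sell?)
listings = [
    (10, 0, True), (15, 2, True), (20, 5, True), (30, 5, True),
    (60, 0, False), (75, 1, False), (90, 0, False), (45, 0, False),
]


def predict_sold(history, days, price_cut, k=3):
    """Return (probability of selling, yes/no label) from the k most similar listings."""
    by_distance = sorted(
        history,
        key=lambda row: (row[0] - days) ** 2 + (row[1] - price_cut) ** 2,
    )
    neighbors = by_distance[:k]
    probability = sum(1 for _, _, sold in neighbors if sold) / k
    return probability, probability >= 0.5


probability, label = predict_sold(listings, days=25, price_cut=4)
print(probability, label)  # 1.0 True
```

The two features are on different scales here; a real project would normalize them first.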

Overlap

  • How much did the house sell for vs. Did the house sell for more than asking?
  • How long is the order going to take vs. Did the order arrive on time or not?
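
The same historical rows can feed either kind of problem; only the target changes. A quick sketch, assuming made-up houses with hypothetical "asking" and "sold_for" fields:

```python
# Made-up toy data: one dict per house.
houses = [
    {"asking": 300_000, "sold_for": 310_000},
    {"asking": 450_000, "sold_for": 440_000},
    {"asking": 250_000, "sold_for": 250_000},
]


def regression_target(house):
    """How much did the house sell for? (a number)"""
    return house["sold_for"]


def classification_target(house):
    """Did the house sell for more than asking? (yes/no)"""
    return house["sold_for"] > house["asking"]


regression_targets = [regression_target(h) for h in houses]
classification_targets = [classification_target(h) for h in houses]

print(regression_targets)      # [310000, 440000, 250000]
print(classification_targets)  # [True, False, False]
```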

Structuring Problems for ML

  • Tabular data (data like a SQL table)
  • Each row represents a thing that you will want a prediction for
  • Each column represents some (hopefully) useful information about that thing you want a prediction on
  • Historical data for training a model
  • Must have the actual outcome (the value you want to predict) for each historical row
    • You need the correct answer to train the machine
  • Ideally tens of thousands of rows or more
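
A sketch of what "structured for ML" can look like in code, assuming made-up order rows and a hypothetical output column named delivery_minutes. Rows without a known answer are dropped because they cannot be used for training.

```python
# Hypothetical name for the column we want to predict.
OUTPUT_COLUMN = "delivery_minutes"

# Made-up rows: each dict is one order, each key is one column.
rows = [
    {"distance_km": 2.1, "items": 3, "hour": 18, "delivery_minutes": 34},
    {"distance_km": 5.4, "items": 1, "hour": 12, "delivery_minutes": 41},
    {"distance_km": 1.0, "items": 6, "hour": 20, "delivery_minutes": None},
]


def split_features_and_target(table, output_column):
    """Separate the known answers from the columns used to predict them."""
    features, targets = [], []
    for row in table:
        if row.get(output_column) is None:
            continue  # no correct answer, so nothing to learn from
        features.append({k: v for k, v in row.items() if k != output_column})
        targets.append(row[output_column])
    return features, targets


features, targets = split_features_and_target(rows, OUTPUT_COLUMN)
print(len(features), targets)  # 2 [34, 41]
```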

ML as an engineer

  • ML as an API
    • Conjurer vs. Scribe
    • Just another library
  • Define problem
  • Structure problem
  • Clean and filter and gather data
  • Feature engineering: largest time investment
  • Debugging
  • Engineering best practices
  • Consult experts (or the docs!) for the ML stuff you don't know yet
  • We'll cover the process in more depth next week
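
To make "feature engineering" concrete, here is a minimal sketch that turns a raw order record into columns a model can use more easily. The field names (placed_at, items, distance_km) and the derived features are assumptions for illustration only.

```python
from datetime import datetime


def engineer_features(order):
    """Derive model-friendly columns from a raw order record."""
    placed = datetime.fromisoformat(order["placed_at"])
    return {
        "hour_of_day": placed.hour,
        "day_of_week": placed.weekday(),  # Monday is 0, Sunday is 6
        "is_weekend": placed.weekday() >= 5,
        "items_per_km": order["items"] / order["distance_km"],
    }


example = engineer_features(
    {"placed_at": "2017-03-04T18:30:00", "items": 4, "distance_km": 2.0}
)
print(example["hour_of_day"], example["is_weekend"])  # 18 True
```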

Technical skills needed

  • Basic Python
  • auto_ml
  • Using interfaces and APIs

Next Steps

  • Set up your ML environment
    • Data access
  • Think through problems you're working on where ML can be useful
  • Office hours: come chat!
  • Write a SQL query, toss the results into auto_ml
  • More training
  • More office hours to define, debug, and iterate on your ML projects
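
A sketch of the "write a SQL query" step, using Python's built-in sqlite3 with an in-memory database as a stand-in for a real data warehouse. The orders table and its columns are made up; the resulting rows are the kind of tabular data you would then hand to a library such as auto_ml (check its docs for the exact training calls).

```python
import sqlite3

# In-memory stand-in for a real database; the table and values are made up.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row convert cleanly to a dict
conn.execute(
    "CREATE TABLE orders (distance_km REAL, items INTEGER, delivery_minutes INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(2.1, 3, 34), (5.4, 1, 41), (1.0, 6, None)],
)

# Keep only historical rows where the correct answer is known.
query = """
    SELECT distance_km, items, delivery_minutes
    FROM orders
    WHERE delivery_minutes IS NOT NULL
"""
training_rows = [dict(row) for row in conn.execute(query)]
conn.close()

print(len(training_rows))  # 2
```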

Future Training Sessions

  • Steps in the standard ML process
  • Live workshop with real data
  • ML Best Practices & Debugging