Snorkel
Weak Supervision for NLP Tasks
HK ML Meetup
July 2020
Overview
- Problem Space
- Weak Supervision
- Snorkel
- Demo
- Lessons Learned
- Questions (time permitting)
Use ML for Cost and Consistency
[Workflow: Regulations → Standardize → Legal Review & Enhance → Final Product]
Weak Supervision
- Problem: For supervised ML, collecting labels can be extremely costly and/or prohibitively time-consuming
- Can we somehow encode guidelines for labeling data, and rapidly apply them to large amounts of unlabeled data?
- Potential Solution: Weak Supervision - "noisy, limited, or imprecise sources are used to provide supervision signal for labeling large amounts of training data in a supervised learning setting"
Weak Supervision
Image Credit: Weak Supervision: A New Programming Paradigm for Machine Learning, Alex Ratner, http://ai.stanford.edu/blog/weak-supervision/
Weak Supervision
- In short: collect a bunch of "noisy" labels using low cost shortcuts and then sort out the problems that arise with this approach later (e.g., conflicts, overlaps)
- How to create noisy labels:
- Encode domain knowledge from experts as a labeling "rule" (see the sketch after this list)
- Collect labels from crowdworkers (e.g., Amazon Mechanical Turk) or other non-experts
- Use related information (e.g., knowledge bases) and some knowledge transfer to label
- Use specialized models for sub-tasks
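As a quick illustration of the first bullet, a labeling rule can be as simple as a keyword check that either emits a noisy label or abstains. This is only a sketch; the label constants and cue phrases below are made up for illustration.

```python
# Minimal sketch: encode an expert's rule of thumb as a function that
# returns a noisy label or abstains. Constants and cue phrases are illustrative.
POSITIVE, NEGATIVE, ABSTAIN = 1, 0, -1

NEGATIVE_CUES = {"waste of money", "refund", "broken"}  # hypothetical cue phrases

def label_by_cue_words(text: str) -> int:
    """Return a noisy label based on simple keyword matching."""
    lowered = text.lower()
    if any(cue in lowered for cue in NEGATIVE_CUES):
        return NEGATIVE
    return ABSTAIN  # no opinion; other weak sources may cover this example
```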
Snorkel
- Python library with a suite of tools to assist with weak supervision tasks; mostly focused on NLP
- Started by Alex Ratner while at Stanford University; it has since grown into a very active open-source project
- Used in industry[1] to great effect
Snorkel
- High-level process (see the code sketch below):[1]
- Incorporate domain knowledge into labeling functions
- Resolve overlaps and conflicts with a label model
- Use weighted labels to train final model
Image Credit: Weak Supervision: A New Programming Paradigm for Machine Learning, Alex Ratner, http://ai.stanford.edu/blog/weak-supervision/
[1]: Check out this talk for a much more in depth explanation: https://www.datacouncil.ai/talks/accelerating-machine-learning-with-training-data-management and also this blog: http://ai.stanford.edu/blog/weak-supervision/
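A minimal sketch of these three steps using Snorkel's labeling API (snorkel >= 0.9). The DataFrame `df_train` and the two labeling functions are illustrative assumptions, not part of the library.

```python
import re

from snorkel.labeling import PandasLFApplier, labeling_function
from snorkel.labeling.model import LabelModel

POSITIVE, NEGATIVE, ABSTAIN = 1, 0, -1

@labeling_function()
def lf_contains_great(x):
    # hypothetical keyword rule for positive reviews
    return POSITIVE if "great" in x.text.lower() else ABSTAIN

@labeling_function()
def lf_low_score_mention(x):
    # hypothetical rule: an explicit low rating like "2/10" in the review text
    return NEGATIVE if re.search(r"\b[0-4]/10\b", x.text) else ABSTAIN

lfs = [lf_contains_great, lf_low_score_mention]

# 1. Incorporate domain knowledge: apply LFs to unlabeled data -> label matrix L
applier = PandasLFApplier(lfs=lfs)
L_train = applier.apply(df=df_train)  # df_train: assumed pandas DataFrame with a `text` column

# 2. Resolve overlaps and conflicts with a label model
label_model = LabelModel(cardinality=2, verbose=True)
label_model.fit(L_train=L_train, n_epochs=500, seed=123)

# 3. Use the probabilistic (weighted) labels to train the final model
probs_train = label_model.predict_proba(L=L_train)
```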
Snorkel: Demo[1]
- Problem: Classic NLP task of IMDB movie review sentiment; given the review text, determine whether the review is positive or negative
- Twist:
- Let's assume we start with only 1,000 labels, which we will use as the test set
- We will use Snorkel to create the rest of our labeled data
[1]: There are much more comprehensive tutorials on the Snorkel website: https://www.snorkel.org/use-cases/. This demo is meant to be a very cursory introduction to the functionality; if you would like to learn more check out the docs.
Snorkel: Demo
- Create a labeling function
- Apply labeling function to unlabeled data
- Iterate on labeling functions
- Create a label model to resolve overlaps and conflicts[1]
- Filter out any rows that no labeling function covered (they carry no signal)
- Train the classification model, with or without probability-weighted labels (see the sketch after this list)
- Follow along with the code here
[1]: How this is accomplished is quite interesting. For a detailed view check out section 4 of: Training Complex Models with Multi-Task Weak Supervision
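A sketch of the filtering and training steps above, continuing from the `df_train` / `L_train` / `probs_train` names in the earlier sketch. The vectorizer and classifier choices here are illustrative, not necessarily what the demo notebook uses.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from snorkel.labeling import filter_unlabeled_dataframe
from snorkel.utils import probs_to_preds

# Drop rows that no labeling function covered (the label model has no signal there)
df_filtered, probs_filtered = filter_unlabeled_dataframe(
    X=df_train, y=probs_train, L=L_train
)

# Option A: convert the probabilistic output to hard labels
preds_filtered = probs_to_preds(probs=probs_filtered)

# Train a simple discriminative model on the weakly labeled data
vectorizer = CountVectorizer(ngram_range=(1, 2))
X_train = vectorizer.fit_transform(df_filtered.text.tolist())
clf = LogisticRegression(C=1e3, solver="liblinear")
clf.fit(X=X_train, y=preds_filtered)

# Option B (not shown): keep probs_filtered and train a model that accepts
# probability-weighted labels, e.g. a neural network with a soft cross-entropy loss.
```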
Snorkel: Demo
- Other features (sketches below)
- spaCy integration: can use NER and PoS tools to help build labels
- Transformation functions: can create data augmentation functions to enhance data (e.g., synonym replacement)
- Slice-based learning: focus on subsets of the data / specific subproblems and weight their importance
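Minimal sketches of these three features using Snorkel's public decorators; the specific rules and names are made up for illustration.

```python
from snorkel.augmentation import transformation_function
from snorkel.labeling import labeling_function
from snorkel.preprocess.nlp import SpacyPreprocessor
from snorkel.slicing import slicing_function

POSITIVE, ABSTAIN = 1, -1

# spaCy integration: preprocess text into a Doc so LFs can use entities / PoS tags
spacy = SpacyPreprocessor(text_field="text", doc_field="doc", memoize=True)

@labeling_function(pre=[spacy])
def lf_person_performance(x):
    # hypothetical rule: a named person plus the word "performance" suggests positive
    has_person = any(ent.label_ == "PERSON" for ent in x.doc.ents)
    return POSITIVE if has_person and "performance" in x.text.lower() else ABSTAIN

# Transformation function: a crude stand-in for synonym replacement
@transformation_function()
def tf_movie_to_film(x):
    if "movie" in x.text:
        x.text = x.text.replace("movie", "film")
        return x
    return None  # None means the transformation did not apply

# Slicing function: tag a subset of the data to monitor or upweight
@slicing_function()
def short_reviews(x):
    return len(x.text.split()) < 50
```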
Lessons Learned
- Overall: 👍 recommended; worth at least exploring if you have high-cost labeling scenarios
- Potential to be useful in low-data scenarios, for establishing baselines, and for small performance boosts on existing models
- Great for gaining a deeper understanding of a new problem space and/or new data
- Can use other models as labeling functions; can combine signals
- Can be used to pull new data modes into existing models (e.g., caption text for images)
- Works well for multi-task / ancillary tasks
- Works well in conjunction with active learning
Lessons Learned
- Return on time investment has high variance; it is not a slam dunk, and getting to a useful output usually requires many iterative cycles
- Performance gains will depend on the size of your unlabeled data, the quality of your labeling functions, and your ability to incorporate weighted labels
- Need to do some accounting for sub-class scenarios; don't skew the distributions with homogeneous labeling functions. Ideally, LFs are (see the analysis sketch below):
- many (more than 20 is good) and diverse
- mostly correct (50%+ accuracy) and conditionally independent
- Doesn't work well with tasks such as NER, where you need context
- Best to use in conjunction with other orthogonal methods
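One way to check the "many, diverse, mostly correct" guidance is Snorkel's built-in LF analysis, which reports coverage, overlaps, and conflicts per labeling function, plus empirical accuracy if you pass a small labeled dev set. A sketch reusing `L_train` and `lfs` from the earlier sketches; `L_dev` / `Y_dev` are assumed to exist.

```python
# Sketch: inspect labeling-function quality before trusting the label model output.
from snorkel.labeling import LFAnalysis

# Coverage, overlaps, and conflicts per LF (no ground truth needed)
print(LFAnalysis(L=L_train, lfs=lfs).lf_summary())

# With a small hand-labeled dev set, also get empirical accuracy per LF
# print(LFAnalysis(L=L_dev, lfs=lfs).lf_summary(Y=Y_dev))
```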
Summary
- Weak supervision can be a useful tool in your ML toolkit, helping to lower the cost and reduce the time needed to collect labeled data
- Snorkel is a well-engineered, open-source library that will help with the nuts and bolts of collecting noisy labels and augmenting your training data
- You will get the most return on your time in scenarios where the problem space is new/novel, where expert knowledge is scarce / costly, or where there are large volumes of unlabeled data
Questions