Role Playing
Annotation workshop

Introductions
- Who are we?
- What is this workshop about?
- Why is this a great workshop for you?
Agenda
- Exercise 1
- Discussion
- Exercise 2
Form a team of 3 participants and a Game Master
Clockwise from Game Master:
- The Client
- NLP engineer
- Annotator
Team up
slack #role-playing-annotation-workshop
Game dynamic
- Your Game Master knows what's supposed to happen - ask them for help if you're lost
- The slides have context and scripts for you
- All the timers are links you can click into to help you keep track of time
- If there is no timer, read and play through the script, then move to the next slide
Please don't click ahead
Business categories exercise
- What was the most frustrating part of the process for you?
- What kind of communication difficulties have you experienced?
- How did it feel being a team member in this exercise?
- Have you experienced a similar dynamic in your project?
- Has anyone asked you if you were doing AI?
COME OUT OF CHARACTER AND DISCUSS IN TEAM
Whole Room Discussion
Is there something you discussed in your group that you'd like to share with everyone?
Annotators
Who should do the annotation?
Clients
What should the NLP engineer do if you want the impossible?
"Just give the data to the algorithm and it will give us insights."
NLP Engineer
How would you solve the problem if you could start over?
Lessons learned
Scoping Summary
- Business involvement in annotation to negotiate scope
- NLP engineer's responsibility to flag unrealistic expectations
- Split problem into two separate streams: Text Categorisation and Information Extraction
Next: talk about tools
Annotation UI

Use regex to generate postcode suggestions
{"label":"POSTCODE","pattern":[{"IS_DIGIT":true,"LENGTH":{"==":5}}]}
Use pre-trained model to generate Business Name, Owner suggestions
Business names are very similar to the ORG entity type in spaCy's pre-trained Named Entity Recognition (NER) models. Let's use it!
{"label":"BUSINESS_NAME","pattern":[{"ENT_TYPE":"ORG"}]}
Sometimes geographical names are also part of the name
{"label":"BUSINESS_NAME","pattern":[{"ENT_TYPE":"GPE"}]}
Business owner is a particular case of a person
{"label":"BUSINESS_OWNER_PERSON","pattern":[{"ENT_TYPE":"PERSON"}]}
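The five-digit postcode rule above can also be sketched with plain regex from the Python standard library. This is a minimal illustration, not the workshop's own tooling: the function name `suggest_postcodes`, the output dictionary layout, and the example address are all hypothetical, and the sketch assumes a postcode is exactly five digits, as in the POSTCODE pattern.

```python
import re

# Assumption: a postcode is a standalone run of exactly five digits,
# mirroring the POSTCODE token pattern shown earlier.
POSTCODE_RE = re.compile(r"\b\d{5}\b")


def suggest_postcodes(text):
    """Return suggested POSTCODE spans with character offsets (hypothetical helper)."""
    return [
        {"start": m.start(), "end": m.end(), "label": "POSTCODE", "text": m.group()}
        for m in POSTCODE_RE.finditer(text)
    ]


# Made-up address, used only for illustration.
example = "Acme Bakery, 12 Example St, New York, NY 10001"
for span in suggest_postcodes(example):
    print(span)
```

Suggestions like these are only a starting point: an annotator would still accept or reject each one in the annotation UI.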
Tools summary
- Regex is the best AI
- Choose pre-trained models and generic labels
- Use an annotation UI to collect golden labels
Play it again, Sam…
Clockwise:
Wasn't it better the 2nd time round?
- Client getting stuck in
- Cross-team communication
- More flexibility on scope
- More intuitive & efficient tools
Resources
We hope you enjoyed our "Python Workshop for Humans"!
NYC annotation workshop
By Agata Sumowska