Role Playing

Annotation workshop

Introductions

  • Who are we?
  • What is this workshop about?
  • Why is this a great workshop for you?

Agenda

  • Exercise 1
  • Discussion
  • Exercise 2

Form a team of 3 participants and a Game Master

Clockwise from Game Master:

  • The Client
  • NLP engineer
  • Annotator

Team up

Slack: #role-playing-annotation-workshop

nlrp.tech

Game dynamic

  • Your Game Master knows what's supposed to happen - ask them for help if you're lost

  • The slides have context and scripts for you
  • All the timers are links: click one to help you keep track of time
  • If there is no timer, read and act out the script, then move to the next slide

Please don't click ahead

Business categories exercise

  • What was the most frustrating part of the process for you?
  • What kind of communication difficulties have you experienced?
  • How did it feel being a team member in this exercise?
  • Have you experienced a similar dynamic in your project?
  • Has anyone asked you if you were doing AI?

COME OUT OF CHARACTER AND DISCUSS IN TEAM

Whole Room Discussion

Is there something you discussed in your group that you'd like to share with everyone?

Annotators

Who should do the annotation?

Clients

What should the NLP engineer do if you want the impossible?

 

"Just give the data to the algorithm and it will give us insights."

NLP Engineer

 

How would you solve the problem if you could start over?

 

 

Lessons learned

Scoping Summary

  • Business involvement in annotation to negotiate scope
  • NLP engineer's responsibility to flag unrealistic expectations
  • Split the problem into two separate streams: Text Categorisation and Information Extraction

 

Next: talk about tools

Annotation UI

Use regex to generate postcode suggestions

{"label":"POSTCODE","pattern":[{"IS_DIGIT":true,"LENGTH":{"==":5}}]}
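
The pattern above is a token-level rule: it matches a single token made of exactly five digits. A rough character-level equivalent with Python's standard `re` module might look like the sketch below. The function name `suggest_postcodes` and the example address are hypothetical, and the regex assumes a postcode is a standalone run of five digits, so it is only an approximation of how the token pattern behaves.

```python
import re

# Assumption: a postcode is a run of exactly five digits with no digit
# directly before or after it, mirroring IS_DIGIT with LENGTH == 5.
POSTCODE_RE = re.compile(r"(?<!\d)\d{5}(?!\d)")


def suggest_postcodes(text):
    """Return (start, end, label) suggestions for an annotator to accept or reject."""
    return [(m.start(), m.end(), "POSTCODE") for m in POSTCODE_RE.finditer(text)]


# Hypothetical example text, not taken from the workshop data.
example = "Joe's Pizza, 7 Carmine St, New York, NY 10014"
for start, end, label in suggest_postcodes(example):
    print(label, example[start:end])
```

These are suggestions only: the annotator still confirms or rejects each one in the annotation UI.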
 

Use a pre-trained model to generate Business Name and Owner suggestions

Business names closely resemble the ORG entity type in spaCy's pre-trained Named Entity Recognition (NER) models. Let's use it!

{"label":"BUSINESS_NAME","pattern":[{"ENT_TYPE":"ORG"}]}

Sometimes a geographical name is also part of the business name:
{"label":"BUSINESS_NAME","pattern":[{"ENT_TYPE":"GPE"}]}

A business owner is a particular case of a person:
{"label":"BUSINESS_OWNER_PERSON","pattern":[{"ENT_TYPE":"PERSON"}]}
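
The three patterns above amount to a mapping from generic NER labels to project labels. A minimal sketch of that mapping in plain Python follows. The names `GENERIC_TO_PROJECT` and `to_suggestions` and the sample predictions are hypothetical; in practice the (text, label) pairs would come from a pre-trained model, for example spaCy's `doc.ents`.

```python
# Generic NER label -> project label, following the three patterns above.
GENERIC_TO_PROJECT = {
    "ORG": "BUSINESS_NAME",
    "GPE": "BUSINESS_NAME",
    "PERSON": "BUSINESS_OWNER_PERSON",
}


def to_suggestions(entities):
    """Map (text, generic_label) pairs to (text, project_label) suggestions.

    Entities whose label is not in the mapping are dropped.
    """
    return [
        (text, GENERIC_TO_PROJECT[label])
        for text, label in entities
        if label in GENERIC_TO_PROJECT
    ]


# Hypothetical model output; with spaCy this could be built as
# [(ent.text, ent.label_) for ent in doc.ents].
predicted = [
    ("Joe's Pizza", "ORG"),
    ("Brooklyn", "GPE"),
    ("Joe Smith", "PERSON"),
    ("2019", "DATE"),
]
print(to_suggestions(predicted))
```

Not every PERSON is a business owner and not every GPE belongs to a business name, so the mapped labels are a starting point for the annotator rather than final answers.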

Tools summary

  • Regex is the best AI
  • Choose pre-trained models and generic labels
  • Use an annotation UI to collect golden labels

Play it again, Sam…

Wasn't it better the 2nd time round?

  • Client getting stuck in
  • Cross-team communication
  • More flexibility on scope
  • More intuitive & efficient tools

Resources

We hope you enjoyed our "Python Workshop for Humans"!

 

NYC annotation workshop

By Agata Sumowska
