Bots to Combat Casual Sexism at Work

Representation of Women in Tech

  • According to the National Center for Women & Information Technology (NCWIT), 25% of the computing workforce was female in 2015.

  • Hiring across technical roles: The imbalance is reflected in company diversity reports. Microsoft, for example, boasts 29% female workers across its staff, but in technical positions only 17% are women. Of Google’s senior management and executive officer team, 17 are men and only three are women. Men make up 83% of Google’s engineering staff, and Apple’s technical team is 80% male.

The Stats say it all!

Sexism at the Workplace: The uncomfortable truth

  • Four in ten women in the United States have faced gender discrimination in their workplace. Sexism at the workplace is a global problem.
  • Casual sexism: what is that?
  • The masculine skew allows casual sexism to go unchecked, exemplified by ill-advised comments from certain chief executives. It’s so subtle that you’re often left wondering if it really happened, and when you try to talk about it, you sound like a crazy person because it’s so small, but it’s death by a thousand paper cuts.

Solution?

Global Level:

  • Create awareness in your company. 
  • Conduct Meetups and talks that center around women and their contributions

Personal Level: 

  • Join Women in Tech organizations
  • Discuss this with other female co-workers
  • Use your strengths: technology as a weapon

Building Bots to Combat Sexism

  • Bot Flow
  • Get Data
  • Bot Flow-Preprocessing the data
  • Understanding Workplace Conversations: A Brief Introduction to Word Embeddings
  • Model Training
  • Integrating with the Slack Platform
  • Future Work
  • Toxic Comment on Reddit
  • How can you contribute?

Bot Flow

Get Data

  • Deep Pavlov Data-Slack bot
  • Wrote a few of our own examples of sexist remarks that are commonly used in the workplace
  • Toxic Comment Classification Challenge Kaggle: Link

Bot Flow-Preprocessing the data

  • Removing stopwords
  • Removing punctuation
  • Tokenising the sentences
  • Converting the tokens into numbers
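
The four preprocessing steps can be sketched in a few lines of plain Python. This is an illustration only, not the project's actual code: the stopword list, the function names (`preprocess`, `build_vocab`, `encode`) and the choice to reserve id 0 for unknown words are all assumptions made for this example.

```python
import string

# Hypothetical mini stopword list; a real project would likely use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "that"}

def preprocess(sentence):
    """Lowercase, strip punctuation, tokenise on whitespace, drop stopwords."""
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in cleaned.split() if tok not in STOPWORDS]

def build_vocab(token_lists):
    """Map each distinct token to an integer id, starting at 1 (0 = unknown)."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab

def encode(tokens, vocab):
    """Convert a token list into a list of integer ids."""
    return [vocab.get(tok, 0) for tok in tokens]

sentences = ["Girls are like that.", "Can you review the pull request?"]
tokenised = [preprocess(s) for s in sentences]
vocab = build_vocab(tokenised)
encoded = [encode(tokens, vocab) for tokens in tokenised]
print(tokenised)
print(encoded)
```

Note that stopword removal can discard words that carry the sexist meaning in short remarks, so the stopword list is worth reviewing against the data.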

Understanding Workplace Conversations: A Brief Introduction to Word Embeddings

  • Capturing conversations: 24 percent of conversations in workplaces are written communications, so they can be analyzed and monitored
  • Word embeddings: words converted into numbers (vectors)
  • Humans can deal with text quite intuitively, whereas computers cannot
  • A computer can match two strings and tell you whether they are the same or not. But how do you make a computer understand that “Apple” in “Apple is a tasty fruit” is a fruit that can be eaten and not a company?
  • Conversion of the sentences to a dictionary

  • Frequency-based word embeddings
    • Count Vector: the number of occurrences of a word in a single document
    • TF-IDF Vector: also takes the entire corpus into account
    • Co-Occurrence Vector: similar words tend to occur together and have similar contexts, for example: Apple is a fruit. Mango is a fruit.
  • Prediction-based word embeddings
    • Language Model
    • Word2Vec: A toolkit that allows the seamless training and use of pre-trained embeddings
      • CBOW
      • Skip-gram
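
To make the three frequency-based representations concrete, here is a small pure-Python sketch over a toy corpus. The corpus, the function names, and the particular TF-IDF variant (term frequency times log of N over document frequency) are assumptions for illustration; libraries use several slightly different weightings.

```python
import math
from collections import Counter

# Toy corpus (hypothetical), already tokenised.
docs = [
    "apple is a fruit".split(),
    "mango is a fruit".split(),
    "apple makes phones".split(),
]
vocab = sorted({word for doc in docs for word in doc})

def count_vector(doc):
    """Occurrences of each vocabulary word in a single document."""
    counts = Counter(doc)
    return [counts[word] for word in vocab]

def tfidf_vector(doc):
    """Term frequency weighted by how rare the word is across the corpus."""
    counts = Counter(doc)
    vector = []
    for word in vocab:
        tf = counts[word] / len(doc)
        df = sum(1 for d in docs if word in d)
        vector.append(tf * math.log(len(docs) / df))
    return vector

def cooccurrence(window=1):
    """Count how often two words appear within `window` positions of each other."""
    pairs = Counter()
    for doc in docs:
        for i, word in enumerate(doc):
            for j in range(max(0, i - window), min(len(doc), i + window + 1)):
                if j != i:
                    pairs[(word, doc[j])] += 1
    return pairs

print(vocab)
print(count_vector(docs[0]))
print(tfidf_vector(docs[2]))
print(cooccurrence()[("a", "fruit")])
```

In this corpus "makes" appears in only one document, so it gets a higher TF-IDF weight in the third document than "apple", which appears in two.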

  • Language Modelling: Bengio's work in 2003 consists of a one-hidden-layer feed-forward neural network that predicts the next word in a sequence

  • Word2Vec: In 2013, Mikolov et al. proposed two architectures for learning word embeddings that are computationally less expensive than previous models

    • CBOW: Continuous Bag of Words uses the surrounding words to predict the center word

    • Skip-gram: uses the center word to predict the surrounding words

  • TSNE Plots using Tensorflow: Link
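
The difference between CBOW and skip-gram is easiest to see in the training examples each one generates from the same sentence. The sketch below only builds those (input, target) pairs and does not train anything; the function names and the window size of 1 are assumptions for illustration.

```python
def context_of(tokens, i, window):
    """Words within `window` positions of index i, excluding the word itself."""
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    return [tokens[j] for j in range(lo, hi) if j != i]

def skipgram_pairs(tokens, window=1):
    """Skip-gram: the center word is the input, each surrounding word a target."""
    return [(center, ctx)
            for i, center in enumerate(tokens)
            for ctx in context_of(tokens, i, window)]

def cbow_pairs(tokens, window=1):
    """CBOW: the surrounding words are the input, the center word the target."""
    return [(context_of(tokens, i, window), center)
            for i, center in enumerate(tokens)]

tokens = "apple is a fruit".split()
print(skipgram_pairs(tokens))
print(cbow_pairs(tokens))
```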

Model Training

Input → Word2Vec → LSTM → FCN → Output
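
The slide names only the stages of the pipeline, so the following is a rough forward-pass sketch under several assumptions: "FCN" is read here as a single fully connected output unit, the weights are random stand-ins for trained parameters, biases are omitted, and all dimensions and names are hypothetical. It is written in plain Python to show how data moves through the stages, not as the project's model.

```python
import math
import random

random.seed(0)
VOCAB_SIZE, EMBED_DIM, HIDDEN_DIM = 10, 4, 3  # hypothetical sizes

def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Random stand-ins for learned parameters.
embeddings = rand_matrix(VOCAB_SIZE, EMBED_DIM)  # plays the role of Word2Vec vectors
gates = {name: rand_matrix(HIDDEN_DIM, EMBED_DIM + HIDDEN_DIM) for name in "ifoc"}
fc_weights = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]

def lstm_step(x, h, c):
    """One LSTM time step (biases omitted for brevity)."""
    z = x + h  # list concatenation: input vector followed by previous hidden state
    i = [sigmoid(v) for v in matvec(gates["i"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(gates["f"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(gates["o"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(gates["c"], z)]  # candidate cell state
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c

def predict(token_ids):
    """Input ids -> embeddings -> LSTM -> fully connected unit -> score in (0, 1)."""
    h, c = [0.0] * HIDDEN_DIM, [0.0] * HIDDEN_DIM
    for token_id in token_ids:
        h, c = lstm_step(embeddings[token_id], h, c)
    return sigmoid(sum(w * v for w, v in zip(fc_weights, h)))

score = predict([1, 2, 7])
print(score)
```

With untrained weights the score is meaningless; after training, a value near 1 would be read as "likely sexist" and a value near 0 as "likely fine".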

Integrating with the Slack Platform

  • Function using Slack RTM API: Parses a list of events coming from the Slack RTM API to find bot commands. If a bot command is found, this function returns a tuple of command and channel.
  • Documentation for using bots on slack: Link
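
A function matching that description might look like the sketch below. It works on plain dictionaries shaped like Slack RTM message events, so it runs without a Slack connection; the bot user id, the helper `parse_direct_mention`, and the rule that a command is a message starting with an @-mention of the bot are assumptions for this example.

```python
import re

BOT_ID = "U0BOT1234"  # hypothetical bot user id
MENTION_REGEX = r"^<@([A-Z0-9]+)>\s*(.*)"  # "<@USERID> rest of the message"

def parse_direct_mention(text):
    """Return (mentioned user id, remaining text), or (None, None) if no mention."""
    match = re.match(MENTION_REGEX, text)
    if match is None:
        return None, None
    return match.group(1), match.group(2).strip()

def parse_bot_commands(slack_events, bot_id=BOT_ID):
    """Scan a list of events; return (command, channel) for the first bot command."""
    for event in slack_events:
        # Only plain user messages: skip typing events, edits, joins, and so on.
        if event.get("type") == "message" and "subtype" not in event:
            user_id, message = parse_direct_mention(event.get("text", ""))
            if user_id == bot_id:
                return message, event["channel"]
    return None, None

events = [
    {"type": "user_typing", "channel": "C42"},
    {"type": "message", "text": "<@U0BOT1234> check this", "channel": "C42"},
]
print(parse_bot_commands(events))
```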

Future Work

  • Logging to catch repeat offenders: The bot will log details of repeat offenders and send a message to the workspace admin. Regular offenders will also be shown an incremental 'testosterone level' value.
  • Moving it into more workplace conversation apps, like Microsoft Teams and Google Hangouts
  • Audio Analysis: If the data gathered is significant enough, we can move this to audio/speech recognition of sexist comments. For example, this bot could be integrated with existing virtual assistants to give out warnings if a sexist remark is heard in a meeting.
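
The repeat-offender logging could start from something as small as a per-user counter. The sketch below is one possible shape, not a planned design: the class name, the threshold of three flagged messages, and collecting admin notifications in a list (instead of sending them through Slack) are all assumptions.

```python
from collections import Counter

class OffenderLog:
    """Counts flagged messages per user and queues a note for the workspace admin."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # hypothetical cut-off for "repeat offender"
        self.counts = Counter()
        self.admin_messages = []    # stand-in for messages sent to the admin

    def record(self, user_id):
        """Register one flagged message; return the user's running count."""
        self.counts[user_id] += 1
        count = self.counts[user_id]
        if count >= self.threshold:
            self.admin_messages.append(
                f"User {user_id} has {count} flagged messages"
            )
        return count

log = OffenderLog(threshold=2)
log.record("U1")
level = log.record("U1")
print(level, log.admin_messages)
```

The running count returned by `record` is the kind of value that could be shown back to the user as the incremental level mentioned above.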

Reddit Bot to Analyse Toxic Data

  • The objective of the bot: Raise awareness of the toxic remarks in day-to-day forums used by people
  • Easily available toxic comment data
  • Used the same preprocessing steps
  • Used a classification model instead of LSTM
  • Future work: similar bots that analyze the toxic nature of other social media platforms, such as Twitter and forum channels

How can you contribute?

  • Beginner: Use the code on GitHub for toxic comments and adapt it for Reddit channels

  • Create Analysis of your reports: What should people learn from it?

  • Please feel free to push any sort of data or remark you find suitable for this project. I believe that sexism in workplace conversations is a global problem, and I hope that this bot will be a step towards eliminating it. I therefore need crowdsourced data to improve the model and make it more usable.

  • Crowdsourcing data

  • Unfortunately, there are no proper datasets for casual sexist remarks at workplaces. The available datasets that I have come across are extremely vulgar/obscene, whereas sexist remarks in workplaces are often subtle or use common phrases like "Girls are like that".

Acknowledgments

  • Soham Chatterjee: for contributing and helping with this project
  • Adam Shamsudeen: for helping me research data and finding the Deep Pavlov data

  • Saama Technologies

Contact Details

  • Feedback: a link to give me feedback: https://tinyurl.com/bots-to-combat-sexism
  • I would love to hear from you, and your responses will be anonymous!
  • You can get a link to the slides right after filling in the feedback form!
  • Contact us:
  • Archana Iyer:
    • varchanaiyer139@gmail.com

By Archana Iyer
