Hate Speech Detection in an Indian Context
Anjali Bhavan
What is Hate Speech?
- Multiple definitions by authors
- Widely accepted definitions?
- Protected categories covered under hate speech
- What factors are considered while defining hate speech?
Why this topic for my thesis?
- Research is mostly West-centric: the majority of academic publishing and industrial advances
- Colonialism in AI
- India-specific issues when it comes to hate speech detection
Issues with Western focus in ML R&D
- Most popular internet applications developed in US/Europe
- narrow intended audience
- lack of resources for other cultures/countries
- similar situation in academic research
- Perspective matters in hate speech - who gets a say?
- Depends on who holds power in the global landscape
- Easy to dictate terms when you're the default arbiter of everything
- Highly subjective nature of hate speech
India-specific issues
- Consequences of Western orientation:
- significant population without English-medium education
- other factors of discrimination
- Casteist hate speech/caste as a protected category
- What is caste? Why does it matter?
- Sociopolitical landscape: rise in communal violence, hate crimes
- Code-mixed language/multilinguality
Other issues
- Instances of racist bias in hate speech models: recent work demonstrates bias against African-American English
- Lack of non-English data: serious setback
What I hope to do
- Study how hate speech models built with a US-centric idea of hate and racism function in a non-Anglophone context
- Specifically, how do they perform on Indian data?
- Evaluation/auditing: running state-of-the-art hate speech detection models on Indian data and analyzing performance
- A thorough literature review
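A minimal sketch of what the evaluation/auditing step could look like, assuming model predictions and gold labels are already available for each example along with a group tag (for instance the language or dialect of the post). The function name `audit_by_group`, the group names, and the toy records are all hypothetical placeholders for illustration, not results from any real model or dataset.

```python
from collections import defaultdict


def audit_by_group(records):
    """Compute per-group metrics for a binary hate speech classifier.

    records: iterable of (group, gold, pred) tuples, where gold and pred
    are 1 (hateful) or 0 (not hateful). Group tags are an assumption of
    this sketch; any grouping of interest could be used.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, gold, pred in records:
        if gold == 1 and pred == 1:
            key = "tp"
        elif gold == 0 and pred == 1:
            key = "fp"
        elif gold == 0 and pred == 0:
            key = "tn"
        else:
            key = "fn"
        counts[group][key] += 1

    report = {}
    for group, c in counts.items():
        predicted_pos = c["tp"] + c["fp"]
        actual_pos = c["tp"] + c["fn"]
        actual_neg = c["fp"] + c["tn"]
        precision = c["tp"] / predicted_pos if predicted_pos else 0.0
        recall = c["tp"] / actual_pos if actual_pos else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        # False positive rate: share of benign posts flagged as hateful.
        fpr = c["fp"] / actual_neg if actual_neg else 0.0
        report[group] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "false_positive_rate": fpr,
        }
    return report


# Toy, made-up records purely to show the call shape.
toy_records = [
    ("english", 1, 1), ("english", 1, 1),
    ("english", 0, 0), ("english", 0, 0),
    ("code_mixed", 1, 0), ("code_mixed", 1, 1),
    ("code_mixed", 0, 1), ("code_mixed", 0, 0),
]
report = audit_by_group(toy_records)
for group, metrics in report.items():
    print(group, metrics)
```

Reporting metrics per group rather than as a single aggregate is what makes gaps visible: a model can look strong overall while flagging benign posts in one language or dialect far more often than in another.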
What I hope to do (cont.)
- Write about categories in hate speech: extreme speech, dangerous speech, fear speech etc.
- (Misc.) A commentary on caste in computing (particularly casteist speech) and how it manifests on social media: linguistic markers etc.
- Some more focus on WhatsApp and its part in spreading inflammatory, hateful content and instigating communal violence in India
Some takeaways
- A conversation about hate speech is a conversation about society.
- It is also a conversation about the concentration of power - who gets to express their sentiments, who gets to air their hateful opinions
- Subjectivity: a lot of nuances to hate speech, which we can barely begin to capture