Hate Speech Detection in an Indian Context
Anjali Bhavan
What is Hate Speech?
- Multiple definitions by authors
- Widely accepted definitions?
- Protected categories covered under hate speech
- What factors are considered while defining hate speech?
Why this topic for my thesis?
- Research is mostly West-centric: the majority of academic publishing and industrial advances
- Colonialism in AI
- India-specific issues when it comes to hate speech detection
Issues with Western focus in ML R&D
- Most popular internet applications developed in US/Europe
- narrow intended audience
- lack of resources for other cultures/countries
- similar situation in academic research
- Perspective matters in hate speech - who gets a say?
- Depends on who holds power in the global landscape
- Easy to dictate terms when you're the default arbiter of everything
- Highly subjective nature of hate speech
India-specific issues
- Consequences of Western orientation:
- significant population without English-medium education
- other factors of discrimination
- Casteist hate speech/caste as a protected category
- What is caste? Why does it matter?
- Sociopolitical landscape: rise in communal violence, hate crimes
- Code-mixed language/multilingualism
Other issues
- Instances of racist bias in hate speech models: recent work demonstrating bias against African-American English in models
- Lack of non-English data: serious setback
What I hope to do
- Study how hate speech models built with a US-centric idea of hate and racism function in a non-Anglophone context
- Specifically, how do they perform on Indian data?
- Evaluation/auditing: running state-of-the-art hate speech detection models on Indian data and analyzing performance
- A thorough literature review
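A minimal sketch of what the evaluation/auditing step could look like, assuming model predictions and gold labels are already available as (group, gold, predicted) triples. The function name `per_group_metrics`, the group names, and the toy records are hypothetical and not taken from any existing dataset or model; the sketch only shows how performance can be broken down by language group so that gaps (for example, a higher false positive rate on code-mixed text) become visible.

```python
from collections import defaultdict


def per_group_metrics(records):
    """Compute recall and false positive rate for each group.

    records: iterable of (group, gold_label, predicted_label),
    where a label of 1 means "hateful" and 0 means "not hateful".
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, gold, pred in records:
        if gold == 1 and pred == 1:
            key = "tp"
        elif gold == 0 and pred == 1:
            key = "fp"
        elif gold == 0 and pred == 0:
            key = "tn"
        else:
            key = "fn"
        counts[group][key] += 1

    metrics = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        metrics[group] = {
            # None marks a rate that is undefined for this group.
            "recall": c["tp"] / positives if positives else None,
            "false_positive_rate": c["fp"] / negatives if negatives else None,
            "n": positives + negatives,
        }
    return metrics


# Hypothetical toy records: (group, gold label, model prediction).
toy_records = [
    ("english", 1, 1),
    ("english", 0, 0),
    ("english", 0, 0),
    ("english", 1, 1),
    ("code_mixed", 1, 0),
    ("code_mixed", 0, 1),
    ("code_mixed", 1, 1),
    ("code_mixed", 0, 0),
]

results = per_group_metrics(toy_records)
for group, m in sorted(results.items()):
    print(group, m)
```

Reporting rates per group rather than a single aggregate score is a design choice: an overall accuracy figure can hide exactly the disparities this audit is meant to surface.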
What I hope to do (cont.)
- Write about categories in hate speech: extreme speech, dangerous speech, fear speech etc.
- (Misc.) A commentary on caste in computing (particularly casteist speech), how it manifests on social media: linguistic markers etc.
- Some more focus on WhatsApp and its part in spreading inflammatory, hateful content and instigating communal violence in India
Some takeaways
- A conversation about hate speech is a conversation about society.
- It is also a conversation about the concentration of power - who gets to express their sentiments, who gets to air their hateful opinions
- Subjectivity: a lot of nuances to hate speech, which we can barely begin to capture