Trang Le, Daniel Himmelstein, Ariel Hippen Anderson, Matthew Gazzara, Casey Greene*
University of Pennsylvania
Virtual ISMB 2020
Recorded by 2020-06-30
~ 34,000 corresponding authors
412 society honorees
first name → genderize.io → probability of male/female
first_name    probability_male
david         0.99
sarah         0.02
avery         0.67
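A minimal sketch of how a genderize.io-style response could be turned into the probability_male column. It assumes each response carries a `gender` field ("male", "female", or null) and a `probability` field for that predicted gender. The helper name `probability_male` and the hard-coded example responses are illustrative only, not the authors' code, and no network request is made.

```python
from typing import Optional


def probability_male(response: dict) -> Optional[float]:
    """Convert a genderize.io-style response into P(male).

    Assumes `probability` refers to the predicted `gender`, so a
    female prediction with probability p maps to P(male) = 1 - p.
    Returns None when no prediction is available for the name.
    """
    gender = response.get("gender")
    prob = response.get("probability")
    if gender is None or prob is None:
        return None
    if gender == "male":
        return float(prob)
    return 1.0 - float(prob)


# Hard-coded responses chosen to mirror the example values on the slide.
examples = [
    {"name": "david", "gender": "male", "probability": 0.99},
    {"name": "sarah", "gender": "female", "probability": 0.98},
    {"name": "avery", "gender": "male", "probability": 0.67},
]

for r in examples:
    print(r["name"], round(probability_male(r), 2))
```

Keeping a single P(male) value per name, rather than a hard label, lets ambiguous names such as "avery" contribute fractionally to downstream totals.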
English Wikipedia 2019 category of Living People
~ 700,000 name-country pairs
LSTM 3-grams
probability of name originating from each of 10 NamePrism country groups
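One way to use such per-name probabilities, sketched under assumptions: average the probability vectors over a set of names to estimate the group composition of that set. The function name `mean_group_probabilities`, the shortened group labels, and the toy numbers are hypothetical. The real classifier outputs one probability for each of the 10 NamePrism groups.

```python
def mean_group_probabilities(predictions):
    """Average per-name probability vectors over a collection of names.

    `predictions` is a list of dicts mapping group label -> probability.
    Returns a dict mapping group label -> mean probability.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    totals = {}
    for probs in predictions:
        for group, p in probs.items():
            totals[group] = totals.get(group, 0.0) + p
    n = len(predictions)
    return {group: total / n for group, total in totals.items()}


# Toy input: two names, three groups only (illustrative values, not real data).
toy_predictions = [
    {"EastAsian": 0.8, "CelticEnglish": 0.1, "SouthAsian": 0.1},
    {"EastAsian": 0.2, "CelticEnglish": 0.7, "SouthAsian": 0.1},
]
print(mean_group_probabilities(toy_predictions))
```

Averaging probabilities instead of assigning each name to its most likely group keeps the classifier's uncertainty in the aggregate estimate.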
NamePrism arXiv:1708.07903
East Asian names are NOT often mistaken as Greek names.
The classifier is more prone to mistaking South Asian names as Celtic/English.
pubmedpy + speaker bio.
{wru} to estimate race and ethnicity of US-affiliated scientists
enrichment analysis to assess affiliation over/under-representation
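The slides do not say which statistic the enrichment analysis uses. As a hedged illustration, one common choice is the log2 ratio of the observed proportion (for example, among honorees) to the expected proportion (for example, among authors in the field). The function name `log2_enrichment` and the input values below are assumptions for this sketch.

```python
import math


def log2_enrichment(observed: float, expected: float) -> float:
    """Return log2(observed / expected) for two proportions.

    Positive values indicate over-representation, negative values
    indicate under-representation, and 0 means no difference.
    """
    if observed <= 0 or expected <= 0:
        raise ValueError("proportions must be positive")
    return math.log2(observed / expected)


# Illustrative proportions only (not results from the study).
print(log2_enrichment(0.50, 0.25))  # observed is twice the expected share
print(log2_enrichment(0.25, 0.50))  # observed is half the expected share
```

A log scale makes over- and under-representation symmetric: a twofold excess and a twofold deficit sit at equal distance from zero.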
~ 30,000 papers, 412 honorees, 700,000 name-nationality pairs
2020
prob(having an East Asian name) ~ 33%
Societies can design policies to honor scientists in ways that counter these biases.
Team
Casey Greene @greenescientist
Daniel Himmelstein @dhimmel
Ariel Hippen Anderson @AHippenAnderson
Matthew Gazzara @MR_Gazzara
Feedback
Needhi Bhalla @NeedhiBhalla
Iddo Friedberg @iddux
Anonymous reviewers
Funding
Gordon and Betty Moore Foundation
Slides: slides.com/trang1618/iscb-diversity/
Manuscript: greenelab.github.io/iscb-diversity-manuscript/
Code:
@trang1618
By Trang Le