Quantitative Community Management
Asheesh Laroia
Executive Director, OpenHatch
About me
- 2000: DeCSS
- 2001: Read GNU Manifesto
- 2001: Seth David Schoen
- 2006: Met him
- 2007: Concluded the community is too small
- 2009: Founded OpenHatch
Topic: Who are we,
as a community?
FLOSS survey, 2001
Rishab Aiyer Ghosh
Rüdiger Glott
Bernhard Krieger
Gregorio Robles
International Institute of Infonomics, Maastricht
Gender stats
1.1% women
in FLOSS survey
1.6% women
in separate FLOSS-US survey
Survey methodology
“Rather than selecting out a small, well-controlled sample...
we allowed respondents to decide for themselves whether they should be considered “developers”..."
"Our goal has been to analyze the entire... community."
Topic: What are our projects like, on the whole?
"Who Writes Linux?" report
- Yearly from the Linux Foundation,
these numbers re: 2.6.30
- Changes per hour: 6
- # of lines: 11 million
- # of developers: 1,150
- # of companies: 240
All SourceForge Projects (n=145,850)
“Mature” and “Production” SourceForge Projects (n=29,821)
SF.net Projects Downloaded >=99 times (90th %ile)
Scratch projects 1+ year after publication (n=249,428)
Google Code Projects (n=195,834)
Active Google Code Projects (n=74,398)
Github public projects (developers are “watchers”) (n=265,088)
Radical flamebait questions
"Does Ghosh's survey find fewer women because it mostly surveyed people who start projects?"
"Are the men in FLOSS and the women generally using separate hosting services?"
"Are women under-represented because, as a group, they were less likely to fill out the survey?"
Reflections: What are we measuring, and why?
- Academic factoids
- Not actionable
- Being measured by people who don't have an interest in the results.
Radical flamebait conclusion
"Opt-in surveys are hopelessly broken,
unless you know, very clearly,
who has responded and who did not."
- Benjamin Mako Hill
Radical counter-flamebait:
± 50% is good
enough for activists
But do we know it's +/- 50%?
How do we measure progress?
Going forward,
let's try to
be useful.
2008 Wikipedia survey
- For 1 week, a link on top of every page
(I don't remember seeing it...)
- Goals of survey: Answer...
Why do people start+stop editing?
Do people know WMF is a non-profit?
What are Wikipedia editors' demographics?
- Collaboration between WMF and UNU-MERIT
Basic demographics
Age (overall)
- 25% younger than 18
- 50% younger than 22
Gender
- Readers: 31% female, 69% male
- Editors: 13% female, 87% male
Wikipedia Editor Survey, 2011
- "The first ever semi-annual survey of
Wikipedia editors"
- "conducted on Wikipedia and presented
to logged-in users"
- Results: 8.5% female.
- Is it getting worse?
- Will we ever know?
comScore vs.
UNU-MERIT
UNU-MERIT: 26% Russian
comScore: 2.5% Russian
Pew Survey, 2010
Goal: understand Internet use
and adoption in the United States
Method: Call random USians over 18
Results: % of US (not % of WP)
Afterward: Publish everything
Pew's Wikipedia demographics
Age
- 18-29: 62%
- 30-49: 52%
- 50-64: 49%
- 65+: 33%
Pew vs. UNU-MERIT
Gender (UNU-MERIT)
- Readers: 31% female, 69% male
Gender (Pew)
- Readers: 47% female, 53% male
Other discrepancies
- Age, marital status, education level, ...
Data recovery
- Adjust response data to match Pew demographics, using logistic "propensity score" to model non-random selection.
- Female editors: 12.7% => 16.1%
- US female editors: 17.8% => 22.7%
- Credit: Benj. Mako Hill and Aaron Shaw
(Search: [hill shaw gender wikipedia pew])
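The adjustment above can be sketched roughly as follows. This is a minimal illustration, not Hill and Shaw's actual model: it assumes a single covariate (gender), a hand-rolled gradient-descent logistic fit, and inverse-odds weights. All counts are toy data invented for the example, and the helper names (`fit_logistic`, `propensity`, `inverse_odds_weight`) are hypothetical. The real analysis used many more covariates, so the output here is not expected to match the figures on the slide.

```python
import math

def fit_logistic(rows, labels, steps=2000, lr=0.5):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(steps):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - y
            grad[0] += err
            for j, xi in enumerate(x):
                grad[j + 1] += err * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

def propensity(weights, x):
    """Modelled probability that a person with covariates x is in the opt-in sample."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

# Toy data (assumption): one covariate, 1 = female, 0 = male.
reference_readers = [[1]] * 47 + [[0]] * 53   # readers from a random-sample survey
optin_readers = [[1]] * 31 + [[0]] * 69       # readers from the opt-in survey
optin_editors = [[1]] * 13 + [[0]] * 87       # editors from the opt-in survey

# Model "who ends up in the opt-in sample" using readers from both surveys.
rows = reference_readers + optin_readers
labels = [0] * len(reference_readers) + [1] * len(optin_readers)
model = fit_logistic(rows, labels)

def inverse_odds_weight(x):
    """Up-weight respondents who look under-represented in the opt-in sample."""
    p = propensity(model, x)
    return (1.0 - p) / p

raw_share = sum(x[0] for x in optin_editors) / len(optin_editors)
weights = [inverse_odds_weight(x) for x in optin_editors]
adjusted_share = sum(w * x[0] for w, x in zip(weights, optin_editors)) / sum(weights)
print(f"raw female share: {raw_share:.3f}, adjusted: {adjusted_share:.3f}")
```

The point of the sketch is the direction of the correction: because women are rarer among opt-in readers than among random-sample readers, female editors get a weight above 1 and the adjusted share rises.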
What they say
vs.
What they do
- Wikipedia editor survey 2011:
- 70% say receiving a Barnstar
makes them more likely to edit.
- Shaw & Hill, 2012 (Shaw dissertation)
- Measure edit rate changes over
5 weeks pre and post
- Net -1.72 edits per week change
- "Movers": +3
- "Non-movers": -5
- Search: [shaw shaw interactional
account dissertation]
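A "what they do" measurement of this kind can be sketched as a before/after comparison of edit rates. The slide does not say exactly how the windows were computed, so this is only one plausible reading: count edits in the 5 weeks before and the 5 weeks after an event, and report the change in edits per week. The function name `weekly_rate_change` and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta

def weekly_rate_change(edit_times, event_time, weeks=5):
    """Change in edits per week: (edits in window after) - (edits in window before), per week."""
    window = timedelta(weeks=weeks)
    before = sum(1 for t in edit_times if event_time - window <= t < event_time)
    after = sum(1 for t in edit_times if event_time < t <= event_time + window)
    return (after - before) / weeks

# Hypothetical editor: six edits before the award, two after.
event = datetime(2011, 6, 1)
edits = [event - timedelta(days=d) for d in (1, 3, 8, 15, 20, 30)]
edits += [event + timedelta(days=d) for d in (2, 9)]

change = weekly_rate_change(edits, event)
print(f"change in edits per week: {change:+.2f}")
```

Averaging this number over everyone who received a Barnstar gives a behavioral figure to set against the self-reported 70%.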
Topic: wikiHow demographics and more
- Inspired (and shocked) by Wikipedia
Editor Survey results
- Wondered if they had the same
lack of gender diversity
- Ran a survey!
Survey methodology
- Over three weeks, find active users
- Send them a talk page message
- ~50% response rate; N=126
- Sent by the wikiHow community manager
wikiHow demographics
- 56% of respondents were female.
- 52% are 15 or younger.
24% are 16-25.
- The older the contributor, the
more likely to be male.
How to increase
data quality
- Ask readers to fill out the same survey.
- Adjust editor response rate by
readers' response/non-response
proportions.
Questions about
wikiHow data
- 50% of survey respondents under 15?
- Or 50% of age respondents under 15?
- Was gender mandatory to fill in?
- Which editing levels were more/less
likely to respond?
Questions about
wikiHow data
- 19/123 did not fill out age
- Gender was required
- (did people refuse to answer
because of that?)
- Which editing levels were more/less
likely to respond?
Topic: Why do Thunderbird contributors give back?
Topic: Behavioral studies
GNOME Women's Outreach Project
(or, "The first great FLOSS behavioral study")
GNOME Women's Outreach Project
GSOC 2006: 181 applicants
Women's Summer Outreach Program,
Started by Hanna Wallach and Chris Ball:
100 applicants
Structure: Separate funding,
same model as GSoC:
mentored coding internship
Conclusion: Targeted outreach
changes the behavior we see!
GNOME Women's Outreach Project
Open questions:
- Do Women's Outreach Project participants stick around in GNOME similarly to other summer interns?
Maybe more, maybe less?
- Answer may lie in Kevin Carillo's Ph.D. thesis
- but opt-in nature makes that hard
A hypothetical
behavioral study
- Select 200 random users
- Find out their demographic info
- Watch their activity levels
- (this is hypothetical for now)
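The first step, drawing the random users, might look like the sketch below. The user list, the seed, and the function name `draw_study_sample` are assumptions for illustration. The part that matters is that the researcher picks the users at random, rather than the users picking themselves.

```python
import random

def draw_study_sample(user_ids, k=200, seed=2013):
    """Draw k distinct users uniformly at random; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    # Sort first so the result does not depend on the input order.
    return rng.sample(sorted(user_ids), k)

# Hypothetical user base.
all_users = [f"user{n}" for n in range(5000)]
sample = draw_study_sample(all_users)
print(len(sample), sample[:3])
```

Recording the seed lets someone else re-draw the same 200 users and check the study later.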
2010:
Open Source
Comes to Campus
- ~30% of applicants were women
- No gender-specific outreach
- Great 2-day event...
- ...but did we leave an impact?
Tracking
Open Source
Comes to Campus
- Compare Github activity against
other CS students who did not
attend event
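One way to make that comparison is a permutation test on the difference in mean activity. The sketch below assumes activity is summarized as one number per student (say, commits in the months after the event). The numbers are invented and `permutation_p_value` is a hypothetical helper. Note that attendees chose to attend, so even a clear difference would not by itself show that the event caused it.

```python
import random

def permutation_p_value(attendees, others, rounds=10000, seed=0):
    """Two-sided permutation test on the difference in group means."""
    rng = random.Random(seed)
    n = len(attendees)
    observed = sum(attendees) / n - sum(others) / len(others)
    pooled = list(attendees) + list(others)
    hits = 0
    for _ in range(rounds):
        # Shuffle group labels and recompute the difference under "no effect".
        rng.shuffle(pooled)
        diff = sum(pooled[:n]) / n - sum(pooled[n:]) / (len(pooled) - n)
        if abs(diff) >= abs(observed):
            hits += 1
    return observed, hits / rounds

# Invented activity counts per student.
attendee_commits = [12, 9, 15, 7, 11, 14, 10, 13]
other_commits = [8, 6, 10, 5, 9, 7, 4, 8]

observed, p_value = permutation_p_value(attendee_commits, other_commits)
print(f"difference in means: {observed:.2f}, permutation p-value: {p_value:.4f}")
```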
It worked in Boston
Clones popping up:
- PyStar Philly
- RailsBridge Boston
- Chicago Python Workshop
- Columbus Python Workshop
- Beginners & Friends Python Programming Workshop
in Auckland, NZ (hi Tim McNamara!)
Tying them together as
OpenHatch Affiliated Events
Limitations of
$CITY Python Workshop for women + friends
- Major urban areas, only?
- Only applies if you can hijack an existing user group
Changes to Open Source Comes to Campus
- Work with existing CS club
(ACM, Women in CS, etc.)
- Use exit survey to improve
event
- Plans to check back in
with attendees
Open Source Comes to Campus survey notes
- Gender as a text field
has 100% response rate
- Undergrads really don't
know git (:
Topic: Project-driven contributor metric tracking
MeeGo community health
- 2011: Dave Neary and Dawn Foster
- Goal: Illuminate community activity:
Bugzilla, mailing list submissions, wiki edits
- http://wiki.meego.com/Metrics/Dashboard
- A thrilling ball of Tomcat, Pentaho, and MySQL
Wikipedia bot messages
(or, "Does niceness matter?")
Huggle!
N approx. 10,000
Wikipedia bot messages
"Changing the tone and language of the generic vandalism warning..."
- increasing the personalization (active voice rather than passive, explicitly stating that the sender of the warning is also a volunteer editor, including an explicit invitation to contact them with questions)
- decreasing the number of directives and links
- and decreasing the length of the message;
...led to more users editing articles in the short term
Wikipedia bot messages
Being too "nice" can backfire:
9.6% of editors who received the new version edited in the file namespace at all afterwards.
For the default, 18.6% went on to make edits to files.
Nice != Vague
MediaWiki community health
- "What are the areas with more activity?"
- Are we expanding or shrinking?
MediaWiki community health
Measure everything
Debian mentorship, 2009:
"Four days"
- Can we review new contributors'
packages within four days?
If so, they know what to expect.
- Package review increased sharply at the start...
- and then flatlined to its old amount.
- Follow through is hard.
Ubuntu Developer Advisory Team
"This team in terms of UbuntuDevelopment, tries to fulfill the following tasks in the Ubuntu world:
- Reach out to new contributors, thank them for their work and get feedback.
- Reach out to people who might be ready to apply for upload rights and help them.
- Reach out to contributors that went inactive and get feedback from them and offer help."
(Source: their homepage, last edited 2012-04-02)
New Contributor Report
- DAT asked open-ended questions; 63% response rate
- 9 love Launchpad; 9 dislike it
- Reviews are "surprisingly painless"
- Docs are troublesome: “overwhelmed at all the information” and by "contradictory information" that is "difficult to follow in a logical manner"
- Contributing is "a surprisingly painless process"
Ubuntu Developer Advisory Team
The real magic is in Trello cards
Data from Ultimate Debian Database
General approach: Make people happy
rather than tell them what to do
Trello "demo"
(whiteboard)
FLOSS is metrics-poor
- Mirrors make it hard to count Debian users.
- Web app authors are privacy-sensitive.
- Follow-through is hard for volunteers.
- Four days, in Debian
- Do you read your web analytics?
OpenHatch "greenhouse":
Ubuntu DAT clone
- First: Port to Debian
- Then: Create a control group
- Finally: Make generic
- GSoC student:
David Lu
Six months of meta-organizing
GSoC meta mentorship
(pipe dream)
- Question: What makes GSoC better?
- Sub-question: what does a good GSoC mean?
- More failed students!
- Are students still active 3-6 months later?
- Happy mentors.
GSoC meta mentorship
(pipe dream)
- Theory:
- mentors would benefit from being
in touch with each other
- mentors would benefit from being
asked to report on status
- Test: Create opt-in meta-mentorship
- ENOSPC
Thanks
- Benjamin Mako Hill, for graphs (and FLOSSmole for the source data)
- Ubuntu DAT for giving me access
- Sarah Mei for slide piracy
Other resources
Stay in touch
- asheesh@openhatch.org
- http://lists.openhatch.org/events
- http://www.rvl.io/paulproteus/lca/
- Sponsor us
Do something