Gensim

Unsupervised embeddings

Vector + Similarity Is All You Need

Why use it?

Online

Fast

Robust

Production-ready

 

When to use it?

Text classification

Sentiment analysis

Search engines

NER

Topic modeling

...
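
A minimal sketch of the "vector + similarity" idea behind use cases like search: once every document is a vector, ranking reduces to comparing vectors. This is an illustration in plain Python, not gensim's implementation; the vectors, the document names, and the `cosine` and `rank` helpers are all hypothetical, and in practice the vectors would come from a trained embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical hand-made 3-dimensional "embeddings" (assumption for illustration only).
doc_vectors = {
    "doc_cats": [0.9, 0.1, 0.0],
    "doc_dogs": [0.8, 0.3, 0.1],
    "doc_stocks": [0.0, 0.2, 0.9],
}


def rank(query_vector, vectors):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    scored = [(name, cosine(query_vector, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


query = [1.0, 0.0, 0.0]
ranking = rank(query, doc_vectors)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The same pattern plausibly covers the other use cases too: a classifier or a NER model consumes the vectors as features instead of ranking them.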

Open-source

Maintainers

  • until XX 2016: Radim Řehůřek
  • XX 2016 - May 2017: Lev Konstantinovskiy
  • May 2017 - Feb 2019: Ivan Menshikh
  • since Feb 2019: Michael Penkov

Why am I here?

  1. Used gensim at work
  2. Sent feature requests and bug fixes to gensim
  3. "Lev, I want to change jobs"
  4. ....
  5. PROFIT, you are fired... hired!

Core developers

Students

NMF

FastText (cython)

gensim-data

FastText (python)

TM Viz

Corpusfile

ATM

Doc

Random contributors

@544895340
@AMR
@AadityaJ
@Alexjmsherman
@AustenLamacraft
@CLearERR
@Cheukting
@DennisChen0307
@ELind77
@ElSaico
@Fil

@HodorTheCoder
@IrinaGoloshchapova
@Jayantj
@JonathanHourany
@Karamax
@KenjiOhtsuka
@KiddoZhu
@KokuKUSIAKU
@Kreiswolke
@LShostenko
@Laubeee

@MridulS
@MritunjayMohitesh
@PeteBleackley
@PeterHamilton
@RishabGoel
@RunHorst
@SamriddhiJain
@Shiki
@Stamenov
@Stigjb
@TheFlash10

@Utkarsh
@VorontsovIE
@Witiko
@Xinyi2016
@Zohaggie
@abhinavchawla
@accraze
@ajkl
@akarazeev
@akutuzov
@alantian

@alexgarel
@allenyllee
@andrewjlm
@aneesh
@anmol01gulat
@anmolgulati
@anujkhare
@aquatiko
@arlenk
@arttii
@bahbbc

and many others

Project structure

Community

  • Github
    • Feature requests
    • Bug reports
    • Holy wars
  • Twitter
    • Announcements
    • Short discussions
  • Gitter
    • Chat
    • Awful, really
    • Infinite context-switch

 

Maintainer?

Expectation

Reality

Goal: improve project

  • Support
  • Code-review & Merge
  • Releases
  • Roadmap
  • Anything that nobody wants to do
    • Setting up the environment, CI, checkers, etc.
    • Guides, documentation
    • Coordinating contributors
  • Sometimes (never) adding new features

 

What's most important?

  • Keep the project in good shape (backward compatibility, no useless stuff)
  • Documentation (always a problem)
  • Communicate, no, you don't get it, COMMUNICATE
  • Attract new contributors
  • Love open-source

What's next with gensim?

  • Project in "slow maintenance mode" until ¯\_(ツ)_/¯​
    • Bugfixes / documentation improvements
    • Code cleanup
    • No roadmap
    • No GSoC 2019
    • No student incubator
    • No awesome features planned

When will you implement model X?

ULMFiT, BERT, LASER, etc.

Most likely never

How can I help?

  • Fix any bug from the issue tracker
  • Improve documentation
  • Ask @mpenkov for advice

Thanks!

Take a free sticker here  

gensim-oss-mlekb

By Ivan Menshikh
