Wikipedia and Wikidata access with Python

Toni Hermoso Pulido

@toniher

Amical Wikimedia

Wiki

Wikipedia

Wikimedia

MediaWiki

Wiki CMS

Built mostly in:

  • PHP
  • JavaScript

Working with Docker

Docker images used

We build the one above following the README instructions, modifying the Bash scripts and Dockerfile if desired

We can build this one as well. Alternatively, we can reuse it from:

https://hub.docker.com/r/toniher/debian-python-mediawiki/

However, you need to mount the example scripts provided in the repo above

Bots

The hidden ones

MediaWiki API

Accessing MediaWiki programmatically
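As a rough illustration of programmatic access, the sketch below builds an `action=query` request using only the Python standard library. English Wikipedia as the target wiki, the function names and the User-Agent string are assumptions made for this example, not part of the original slides.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia is used as the example wiki.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_query_url(titles, api_url=API_URL):
    """Build an action=query URL asking for basic page info."""
    params = {
        "action": "query",
        "titles": "|".join(titles),  # several titles are joined with "|"
        "prop": "info",
        "format": "json",
    }
    return api_url + "?" + urllib.parse.urlencode(params)


def fetch_json(url, user_agent="example-script/0.1 (hypothetical contact)"):
    """Send the GET request and decode the JSON body (needs network access)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_json(build_query_url(["Wikidata", "MediaWiki"]))` would return the decoded API response as a dictionary.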

MediaWiki API clients / libraries

  • Python (Mwclient, Pywikibot)

  • Perl (MediaWiki::API, MediaWiki::Bot)

  • JavaScript (nodemw)

  • etc.

Mwclient

A MediaWiki API Python client library
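A minimal sketch of reading a page with Mwclient, assuming the `mwclient` package is installed and the network is reachable. The helper names, the default host and the User-Agent string are hypothetical choices for this example.

```python
import re


def fetch_wikitext(title, host="en.wikipedia.org",
                   user_agent="example-script/0.1 (hypothetical contact)"):
    """Return the current wikitext of a page, or None if it does not exist."""
    # Imported here so the pure helper below also works without mwclient.
    import mwclient

    site = mwclient.Site(host, clients_useragent=user_agent)
    page = site.pages[title]
    return page.text() if page.exists else None


def link_targets(wikitext):
    """Crude illustration: pull the targets of [[internal links]] from wikitext."""
    return [t.strip() for t in re.findall(r"\[\[([^\]|#]+)", wikitext)]
```

For example, `link_targets(fetch_wikitext("Wikidata"))` would list the pages linked from that article's wikitext. The regular expression is only a quick approximation and does not handle every wikitext construct.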

RESTBase

A sophisticated Wikipedia REST API
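As a hedged example of the REST API, the sketch below targets the page summary endpoint on English Wikipedia. The choice of wiki, the function names and the User-Agent string are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: English Wikipedia's REST API root.
REST_BASE = "https://en.wikipedia.org/api/rest_v1"


def summary_url(title, base=REST_BASE):
    """Build the URL of the page summary endpoint for a title."""
    # Spaces become underscores; any other unsafe character is percent-encoded.
    path_title = urllib.parse.quote(title.replace(" ", "_"), safe="")
    return f"{base}/page/summary/{path_title}"


def fetch_summary(title, user_agent="example-script/0.1 (hypothetical contact)"):
    """Fetch the summary as a dictionary (needs network access)."""
    request = urllib.request.Request(
        summary_url(title), headers={"User-Agent": user_agent}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

The returned dictionary is expected to include fields such as `title` and `extract`.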

Wikidata

Wikidata: Namespaces

  • «Wiki ones»: User, Wikidata, Category, etc.
  • Wikibase ones: Item, Property (both entities)

Wikidata: Concepts

  • Labels
  • Descriptions
  • Aliases
  • Sitelinks
  • Statements
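The five concepts above map directly onto keys of an entity's JSON representation, where statements appear under `claims`. The sketch below reads them from a hand-written, heavily trimmed sample shaped like that JSON. The sample values, in particular the description, are illustrative and not a live copy of the item.

```python
# Hand-written, trimmed sample in the shape of Wikidata's entity JSON.
SAMPLE_ENTITY = {
    "id": "Q42",
    "labels": {"en": {"language": "en", "value": "Douglas Adams"}},
    "descriptions": {"en": {"language": "en", "value": "English writer"}},
    "aliases": {"en": [{"language": "en", "value": "Douglas Noel Adams"}]},
    "sitelinks": {"enwiki": {"site": "enwiki", "title": "Douglas Adams"}},
    "claims": {
        "P31": [{
            "mainsnak": {
                "snaktype": "value",
                "property": "P31",
                "datavalue": {
                    "type": "wikibase-entityid",
                    "value": {"entity-type": "item", "id": "Q5"},
                },
            },
        }],
    },
}


def summarize(entity, lang="en", site="enwiki"):
    """Collect label, description, aliases, sitelink and statement values."""
    statements = {}
    for prop, claims in entity.get("claims", {}).items():
        values = []
        for claim in claims:
            snak = claim["mainsnak"]
            if snak.get("snaktype") != "value":
                continue  # "somevalue" / "novalue" snaks carry no datavalue
            value = snak["datavalue"]["value"]
            # Item-valued statements are reduced to the target's Q-id.
            if isinstance(value, dict) and "id" in value:
                value = value["id"]
            values.append(value)
        statements[prop] = values
    return {
        "label": entity.get("labels", {}).get(lang, {}).get("value"),
        "description": entity.get("descriptions", {}).get(lang, {}).get("value"),
        "aliases": [a["value"] for a in entity.get("aliases", {}).get(lang, [])],
        "sitelink": entity.get("sitelinks", {}).get(site, {}).get("title"),
        "statements": statements,
    }
```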

Wikidata: Statements

RDF

(Resource Description Framework)
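In RDF, a statement is a subject-predicate-object triple. The sketch below writes "Q42 is an instance of Q5" as such a triple, using the `wd:` and `wdt:` prefixes from Wikidata's RDF export, and serialises it as an N-Triples line. The helper names are hypothetical.

```python
# Prefixes used in Wikidata's RDF export.
PREFIXES = {
    "wd": "http://www.wikidata.org/entity/",
    "wdt": "http://www.wikidata.org/prop/direct/",
}

# (subject, predicate, object): Douglas Adams, instance of, human.
TRIPLE = ("wd:Q42", "wdt:P31", "wd:Q5")


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'wd:Q42' into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    return prefixes[prefix] + local


def to_ntriples(triple):
    """Serialise a triple of prefixed names as one N-Triples line."""
    return " ".join(f"<{expand(term)}>" for term in triple) + " ."
```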

Wikidata API access

 

MediaWiki API access: Like any other MediaWiki.

https://wikidata.org/w/api.php
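On top of the standard MediaWiki modules, Wikidata's API offers Wikibase-specific actions such as `wbgetentities`. A minimal sketch of building such a request follows. The function name and default parameters are assumptions for this example, and the canonical `www.wikidata.org` host is used.

```python
import urllib.parse

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def entities_url(ids, props=("labels", "descriptions"), languages=("en",)):
    """Build a wbgetentities URL for the given entity ids."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),            # e.g. "Q42|P31"
        "props": "|".join(props),        # which parts of each entity to return
        "languages": "|".join(languages),
        "format": "json",
    }
    return WIKIDATA_API + "?" + urllib.parse.urlencode(params)
```

The resulting URL can be fetched with any HTTP client. The JSON response groups the requested data under an `entities` key, indexed by entity id.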

 

Access via JSON (or other) exports

SPARQL endpoint access

Wikidata API clients

Linked (Open) Data

SPARQL: Endpoints and interfaces

Wikidata SPARQL interface

Interface to multiple endpoints

Yasgui

SPARQL: API Access

Example for Wikidata: https://query.wikidata.org/sparql
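The endpoint accepts a query over plain HTTP. The sketch below prepares a GET request with the query in the `query` parameter and asks for JSON results through the `Accept` header. The sample query (five items that are instances of house cat, Q146), the function names and the User-Agent string are assumptions for illustration.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Example query: five items that are instances of (P31) house cat (Q146).
QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT,
                  user_agent="example-script/0.1 (hypothetical contact)"):
    """Prepare a GET request that asks the endpoint for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": user_agent,
    }
    return urllib.request.Request(url, headers=headers)


def run(query):
    """Send the request and decode the JSON results (needs network access)."""
    with urllib.request.urlopen(build_request(query)) as response:
        return json.loads(response.read().decode("utf-8"))
```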

SPARQL client in Python
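One possible client is the third-party SPARQLWrapper package. The slides do not name a specific library, so treat this choice as an assumption. The sketch runs a query and flattens the standard SPARQL JSON results into plain dictionaries. The function names and the agent string are hypothetical.

```python
def run_query(query, endpoint="https://query.wikidata.org/sparql",
              agent="example-script/0.1 (hypothetical contact)"):
    """Run a SPARQL query and return the decoded JSON results."""
    # Imported here so flatten() below also works without SPARQLWrapper.
    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper(endpoint, agent=agent)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    return sparql.query().convert()


def flatten(results):
    """Turn SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in row.items()}
        for row in results["results"]["bindings"]
    ]
```

With network access, `flatten(run_query(query))` yields one dictionary per result row, keyed by the variable names used in the `SELECT` clause.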

Wikimedia Hackathon 2018

#wmhack

18-20 May