Data quality in Orbis
Javier Garcia-Bernardo
CORPNET Symposium on Economic Power, Jan 8th, 2018
@javiergb_com / @UvACORPNET
What's in Orbis
Data on companies:
~250 million companies.
~100 million companies with financial information.
Data connecting companies:
- Contacts
* Positions (223, 118 imp)
* Information on people (121 mi)
- Ownership (~100 million)
What's in Orbis: comparison with Thomson One
- Ownership database
- Listed companies
- Orbis has more information on individuals owning shares.
Missing data in Orbis
- Missing fields (figure labels: Employment Turnover Sector ID)
- Missing nodes
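A minimal sketch of how the share of missing fields could be measured. The records and field names below are hypothetical and are not the Orbis schema; `None` stands for a field that is missing. Missing nodes (companies absent from the database altogether) cannot be detected this way and need an external source.

```python
# Hypothetical company records; None marks a missing field.
companies = [
    {"id": "A1", "employment": 12, "turnover": 1.5, "sector": "C"},
    {"id": "A2", "employment": None, "turnover": None, "sector": "G"},
    {"id": "A3", "employment": 250, "turnover": None, "sector": None},
    {"id": "A4", "employment": None, "turnover": 0.3, "sector": "G"},
]


def missing_share(records, fields):
    """Return, for each field, the fraction of records where it is None."""
    total = len(records)
    return {
        field: sum(record.get(field) is None for record in records) / total
        for field in fields
    }


shares = missing_share(companies, ["employment", "turnover", "sector"])
for field, share in shares.items():
    print(f"{field}: {share:.0%} missing")
```

Computing the same shares per country or per company size class would show where the gaps concentrate.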
[Figure: ORBIS data (200 million companies), observed average revenue]
How to know where your data stands?
Exploration: Comparison with external sources
Explanation: Model the data
[Figure: company size classes 0-9, 10-19, 20-49, 50-249, GE250 (250 or more)]
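One way the comparison with external sources could look in code is a coverage ratio per size class. This is a sketch only: it assumes the classes are employee-count bands, and all counts are invented for illustration (they are not Orbis or official figures).

```python
SIZE_CLASSES = ["0-9", "10-19", "20-49", "50-249", "GE250"]

# Hypothetical number of companies per size class in one country.
official_counts = {
    "0-9": 900_000, "10-19": 50_000, "20-49": 30_000,
    "50-249": 15_000, "GE250": 5_000,
}
database_counts = {
    "0-9": 90_000, "10-19": 20_000, "20-49": 18_000,
    "50-249": 12_000, "GE250": 4_900,
}


def coverage(database, official):
    """Share of officially counted companies that appear in the database."""
    return {c: database[c] / official[c] for c in SIZE_CLASSES}


cov = coverage(database_counts, official_counts)
for size_class in SIZE_CLASSES:
    print(f"{size_class:>7}: {cov[size_class]:.0%} covered")
```

With these invented numbers, coverage rises with company size, which is the pattern described as a bias toward big companies.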
Exploration
Interactive visualizations
Code: https://github.com/uvacorpnet/interactive_visualizations
Exploration
Our data is biased toward big companies
- Higher GDP/capita ➙ Larger average companies
- Higher GDP/capita ➙ Higher quality
- Higher quality ➙ Smaller observed average (since the small companies are also in the database)
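The last point can be checked with simple arithmetic. The sketch below uses a hypothetical economy of two company types; the counts, revenues, and coverage rates are assumptions chosen only to show the direction of the effect.

```python
def observed_average(n_small, n_large, rev_small, rev_large,
                     coverage_small, coverage_large=1.0):
    """Average revenue among the companies that made it into the database."""
    seen_small = n_small * coverage_small
    seen_large = n_large * coverage_large
    total_revenue = seen_small * rev_small + seen_large * rev_large
    return total_revenue / (seen_small + seen_large)


# Hypothetical economy: 990 small firms (revenue 1) and 10 large firms (revenue 100).
true_avg = observed_average(990, 10, 1.0, 100.0, coverage_small=1.0)

# Low-quality data: every large firm is present, but only 10% of the small ones.
low_quality_avg = observed_average(990, 10, 1.0, 100.0, coverage_small=0.1)

print(f"full coverage:       {true_avg:.2f}")
print(f"10% of small firms:  {low_quality_avg:.2f}")
```

The lower the coverage of small companies, the higher the observed average, even though the underlying economy is unchanged.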
Exploration
[Figure: average in the database vs. estimated average]
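The slides do not say how the estimated average was obtained. One possible approach, shown here purely as an illustration and not as the author's method, is to reweight each size class by its count in official statistics. It assumes the average revenue observed within a class is representative of that class; all numbers are hypothetical.

```python
# Hypothetical strata:
# (companies in database, mean revenue in database, companies in official statistics)
strata = {
    "small": (100, 1.0, 900),
    "large": (90, 100.0, 100),
}


def database_average(strata):
    """Plain average over the companies present in the database."""
    total = sum(n_db * mean for n_db, mean, _ in strata.values())
    count = sum(n_db for n_db, _, _ in strata.values())
    return total / count


def reweighted_average(strata):
    """Average with each class weighted by its official company count."""
    total = sum(n_off * mean for _, mean, n_off in strata.values())
    count = sum(n_off for _, _, n_off in strata.values())
    return total / count


db_avg = database_average(strata)
est_avg = reweighted_average(strata)
print(f"average in the database: {db_avg:.2f}")
print(f"estimated average:       {est_avg:.2f}")
```

Because small companies are under-covered in this example, the reweighted estimate is well below the raw database average.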
Explanation
Company revenue
Drawbacks
- Information on low and middle income countries is terrible.
- Information on the US is not great (compared with other Western countries).
- In general there are always problems of comparability between countries.
* Different requirements or standards (Positions / Sectors / Financials)
* Capability to collect information
Advantages
- The most complete database.
- Richer information on small and middle size companies.
- Okay(ish) information on real owners.
- Unique ID over many databases and efforts to deduplicate.
- Regarding comparability:
* The information on large companies (>10 M revenue) is good for all countries.
* For OECD countries you can compare it with aggregated statistics.