Data quality in Orbis
Javier Garcia-Bernardo
CORPNET Symposium on Economic Power, Jan 8th, 2018
@javiergb_com / @UvACORPNET
What's in Orbis
Data on companies:
~250 million companies.
~100 million companies with financial information.

Data connecting companies:
- Contacts
* Positions (223, 118 imp)
* Information on people (121 million)

- Ownership (~100 million)

What's in Orbis
- Ownership database
- Listed companies
- Orbis has more information on individuals owning shares.
Comparison with Thomson One

Missing data in Orbis
Missing fields
Missing nodes
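A minimal sketch of how the two kinds of missing data could be measured. The records, field names, and ownership edges below are made up for illustration and are not Orbis data. "Missing nodes" is read here as IDs that appear in ownership links but have no company record, which is an assumption; companies that are absent from the database altogether cannot be detected this way.

```python
# Toy company records; None marks a missing field. All values are made up.
companies = {
    "A": {"employment": 120, "turnover": 5.0e6, "sector": "C"},
    "B": {"employment": None, "turnover": 2.0e5, "sector": "G"},
    "C": {"employment": None, "turnover": None, "sector": "K"},
    "D": {"employment": 15, "turnover": None, "sector": None},
}

# Toy ownership edges (owner, owned). "X" and "Y" have no company record.
ownership = [("A", "B"), ("X", "C"), ("C", "D"), ("Y", "A")]


def missing_field_share(records):
    """Share of records in which each field is missing (None)."""
    fields = sorted({f for rec in records.values() for f in rec})
    n = len(records)
    return {
        f: sum(rec.get(f) is None for rec in records.values()) / n
        for f in fields
    }


def missing_nodes(records, edges):
    """IDs referenced in ownership edges that have no company record."""
    referenced = {node for edge in edges for node in edge}
    return sorted(referenced - set(records))


field_share = missing_field_share(companies)
absent = missing_nodes(companies, ownership)
print(field_share)
print(absent)
```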

Missing fields
(Figure panels: Employment, Turnover, Sector, ID)

Missing nodes

ORBIS data (200 million companies)
Observed average revenue
How to know where your data stands?
Exploration: Comparison with external sources
Explanation: Model the data
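A comparison with an external source could look like the sketch below: count the companies in the database per size class and divide by the count reported by an official source. The size classes are assumed to be defined by number of employees, and both the database sample and the official counts are invented for illustration.

```python
from collections import Counter

SIZE_CLASSES = ["0-9", "10-19", "20-49", "50-249", "GE250"]


def size_class(employees):
    """Map an employee count to a size class (assumed to be employee-based)."""
    if employees < 10:
        return "0-9"
    if employees < 20:
        return "10-19"
    if employees < 50:
        return "20-49"
    if employees < 250:
        return "50-249"
    return "GE250"


def coverage_by_class(db_employees, official_counts):
    """Companies in the database divided by the official count, per class."""
    db_counts = Counter(size_class(e) for e in db_employees)
    return {c: db_counts.get(c, 0) / official_counts[c] for c in SIZE_CLASSES}


# Hypothetical inputs: employee counts in the database and official totals.
db_employees = [3, 7, 12, 25, 40, 60, 120, 300, 800]
official_counts = {"0-9": 20, "10-19": 4, "20-49": 4, "50-249": 2, "GE250": 2}

coverage = coverage_by_class(db_employees, official_counts)
for cls in SIZE_CLASSES:
    print(cls, coverage[cls])
```

In this toy input, coverage rises with company size, which is the pattern one would expect from a database biased toward big companies.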

(Figure axis, size classes: 0-9, 10-19, 20-49, 50-249, GE250)
Exploration
Interactive visualizations

Code: https://github.com/uvacorpnet/interactive_visualizations
Exploration
Our data is biased toward big companies
- Higher GDP/capita ➙ Larger companies on average
- Higher GDP/capita ➙ Higher data quality
- Higher data quality ➙ Smaller observed average (since the small companies are included)
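The last point can be illustrated with a small simulation. It is a sketch under assumptions that are not in the slides: revenues follow a lognormal distribution, companies above a threshold are always recorded, and smaller ones are recorded with probability `p_small`, which stands in for data quality.

```python
import random


def observed_average(revenues, threshold, p_small, seed=0):
    """Average revenue among the companies that end up in the database.

    Companies at or above `threshold` are always included; smaller ones
    are included with probability `p_small` (a stand-in for data quality).
    """
    rng = random.Random(seed)
    kept = [r for r in revenues if r >= threshold or rng.random() < p_small]
    return sum(kept) / len(kept)


# Hypothetical population of company revenues (arbitrary units).
rng = random.Random(42)
revenues = [rng.lognormvariate(0.0, 2.0) for _ in range(20000)]

true_avg = sum(revenues) / len(revenues)
low_quality = observed_average(revenues, threshold=10.0, p_small=0.05)
high_quality = observed_average(revenues, threshold=10.0, p_small=0.9)
full_coverage = observed_average(revenues, threshold=10.0, p_small=1.0)

print("true average:          ", true_avg)
print("observed, low quality: ", low_quality)
print("observed, high quality:", high_quality)
```

The fewer small companies are recorded, the further the observed average sits above the true one; with full coverage the two coincide.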
Exploration


(Figure legend: average in the database vs. estimated average)
Explanation

Company revenue
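One simple way to move from the average in the database to an estimated average is to reweight each size class by its official company count. This is only a sketch, not necessarily the model used in this talk. It assumes that the mean revenue observed within each class is unbiased, and every number below is invented.

```python
# Hypothetical per-class statistics: companies in the database, companies
# according to an official source, and mean revenue observed in the database.
class_stats = {
    "0-9": {"n_db": 100, "n_official": 1000, "mean_revenue": 0.5},
    "10-19": {"n_db": 50, "n_official": 100, "mean_revenue": 2.0},
    "20-49": {"n_db": 40, "n_official": 50, "mean_revenue": 5.0},
    "50-249": {"n_db": 20, "n_official": 20, "mean_revenue": 20.0},
    "GE250": {"n_db": 5, "n_official": 5, "mean_revenue": 200.0},
}


def weighted_average(stats, weight_key):
    """Mean revenue across classes, weighting each class by `weight_key`."""
    total = sum(s[weight_key] * s["mean_revenue"] for s in stats.values())
    count = sum(s[weight_key] for s in stats.values())
    return total / count


# Average as it appears in the database (weights = database counts).
database_avg = weighted_average(class_stats, "n_db")
# Estimated average after reweighting to the official class counts.
estimated_avg = weighted_average(class_stats, "n_official")

print("average in the database:", database_avg)
print("estimated average:      ", estimated_avg)
```

Because small companies are under-represented in the toy database, the reweighted estimate is well below the raw database average.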
Drawbacks
- Information on low- and middle-income countries is terrible.
- Information on the US is not great (compared with other Western countries).
- In general there are always problems of comparability between countries:
	- Different requirements or standards
		- Positions / Sectors / Financials
	- Capability to collect information
Advantages
- The most complete database.
- Richer information on small and medium-sized companies.
- Okay(ish) information on real owners.
- Unique ID over many databases and efforts to deduplicate.
- Regarding comparability:
	- The information on large companies (>10 M revenue) is good for all countries.
	- For OECD countries you can compare the data with aggregate statistics.
 