Javier Garcia-Bernardo
SUNBELT, April 10th, 2016
@javiergb_com / @UvACORPNET
2.1. Fields missing
Employment, Turnover, Sector ID
2.2. Nodes missing
Observed average revenue
Company data quality
Many small companies are missing
3.1. Exploration
Company size classes: 0-9, 10-19, 20-49, 50-249, GE250 (250 or more)
Interactive visualizations
Code: https://github.com/uvacorpnet/interactive_visualizations
Our data is biased toward big companies
This results in a lack of correlation.
3.2.1. Data follows a lognormal distribution (location and scale parameters).
3.2.2. The lognormal distributions have constant scale.
3.2.3. Macro-economics to estimate the location parameter.
3.2.4. Assess completeness.
3.2. Explanation
3.2.1. Data follows lognormal distribution.
- A slope-1 relationship between VAR[X] and E[X] implies a constant scale
- Constant scale implies a linear relationship between E[X] and the location parameter
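
The two bullets above can be checked against the standard lognormal moments. This is a sketch under an assumption the slides do not state explicitly: that "location" and "scale" are the mean \mu and standard deviation \sigma of log X.

```latex
\begin{align}
\ln X &\sim \mathcal{N}(\mu, \sigma^2) \\
\mathrm{E}[X] &= \exp\!\left(\mu + \tfrac{\sigma^2}{2}\right) \\
\mathrm{Var}[X] &= \left(e^{\sigma^2} - 1\right)\exp\!\left(2\mu + \sigma^2\right)
               = \left(e^{\sigma^2} - 1\right)\mathrm{E}[X]^2 \\
\sqrt{\mathrm{Var}[X]} &= \sqrt{e^{\sigma^2} - 1}\;\mathrm{E}[X] \\
\ln \mathrm{E}[X] &= \mu + \tfrac{\sigma^2}{2}
\end{align}
```

Under this parameterisation, a constant \sigma makes the standard deviation proportional to E[X] (slope 1 on log-log axes; the variance itself would have slope 2), and makes ln E[X] a linear function of \mu with slope 1.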
3.2.2. The lognormal distributions have constant scale.
- Use macro-economic indicators to find average and location parameter
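
A minimal sketch of the step above, assuming the location parameter is the mean of log-revenue (mu) and the constant scale is its standard deviation (sigma). The function names and all numbers are hypothetical illustrations, not values from the talk; the average is taken here as total sector turnover divided by the number of companies, which is one possible macro-economic indicator.

```python
import math


def location_from_average(average, sigma):
    """Return the lognormal location (mu) implied by a mean and a scale.

    Uses E[X] = exp(mu + sigma**2 / 2), solved for mu.
    """
    if average <= 0:
        raise ValueError("average must be positive")
    return math.log(average) - sigma ** 2 / 2


def average_from_location(mu, sigma):
    """Return the lognormal mean E[X] for a given location and scale."""
    return math.exp(mu + sigma ** 2 / 2)


# Hypothetical macro-economic inputs (assumptions for illustration only).
total_turnover = 5.0e9   # total revenue of a sector in a country
n_companies = 10_000     # number of companies in that sector
sigma = 1.5              # constant scale, assumed already estimated

estimated_average = total_turnover / n_companies
mu = location_from_average(estimated_average, sigma)
print(f"estimated average = {estimated_average:.0f}, location = {mu:.3f}")
```

The round trip `average_from_location(mu, sigma)` recovers the estimated average, which is a quick sanity check on the parameterisation.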
3.2.3. Macro-economics to estimate location parameter.
AVERAGE IN THE DATABASE
ESTIMATED AVERAGE
3.2.4. Assess completeness
- We have (1) the observed average and (2) the estimated average.
- Under reasonable assumptions, the relationship between the two is proportional to completeness.
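
The idea above can be illustrated with a toy simulation. The slides do not specify the exact relationship or its assumptions, so everything here is hypothetical: the function name, the parameter values, and in particular the missingness rule (companies below the median revenue are kept with a fixed probability, larger ones are always kept). It only shows that when small companies go missing, the observed average drifts above the estimated average as completeness drops.

```python
import math
import random
import statistics


def simulate_completeness(mu, sigma, n, keep_probability, seed=0):
    """Compare the estimated and observed averages of a lognormal sample.

    Hypothetical missingness rule: companies with revenue below the median
    are kept with probability keep_probability; larger ones are always kept.
    """
    rng = random.Random(seed)
    population = [rng.lognormvariate(mu, sigma) for _ in range(n)]
    median = math.exp(mu)
    observed = [
        x for x in population
        if x >= median or rng.random() < keep_probability
    ]
    estimated_average = math.exp(mu + sigma ** 2 / 2)  # from the model
    observed_average = statistics.fmean(observed)      # from the "database"
    return {
        "completeness": len(observed) / n,
        "average_ratio": estimated_average / observed_average,
    }


for p in (1.0, 0.5, 0.1):
    result = simulate_completeness(
        mu=10.0, sigma=1.0, n=50_000, keep_probability=p
    )
    print(p, round(result["completeness"], 2), round(result["average_ratio"], 2))
```

In this sketch the ratio of estimated to observed average is close to 1 for a complete sample and falls as more small companies are dropped; a real assessment would need the missingness assumptions actually used in the analysis.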
Company revenue
- We know which types of companies are missing.
- We know the directors associated with the types of companies that are missing.
- We can recreate the missing companies and their directors and measure the impact on network measures (in progress).
We're a multidisciplinary team bringing together political science, computer science, network science, and sociology, based at the Amsterdam Institute for Social Science Research.
We're hiring for two PhD positions on corporate control and network analysis (see corpnet.uva.nl or @UvACORPNET for details).
Follow us on Twitter:
@javiergb_com
@UvACORPNET
Check our website: http://corpnet.uva.nl/
Javier Garcia-Bernardo
garcia@uva.nl