Big Data/Web-scraping Project at ECLAC, UN
The code base is available in my private GitHub repository.
Please share your GitHub account with me.
Database
Multi-threaded Crawler
Content Parser
Basic Data Cleaning
Normalization
Data File Output
Visualization
Web-scraper
Error Handling & Retry
De-dupe
Aggregation
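
The list above names the components but does not show how they fit together. As an illustration only, and not the project's actual code, the following minimal Python sketch combines three of them: the multi-threaded crawler, error handling with retry, and de-duplication. All names here (`fetch_with_retry`, `dedupe`, `crawl`, and the injected `fetch` callable) are hypothetical, and the retry count and linear backoff are assumptions chosen for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(fetch, url, retries=3, backoff=0.0):
    """Call fetch(url); on an exception, try again, up to `retries` attempts in total."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of attempts: let the caller decide what to do
            time.sleep(backoff * attempt)  # simple linear backoff between attempts


def dedupe(urls):
    """Drop repeated URLs while keeping first-seen order."""
    return list(dict.fromkeys(urls))


def crawl(urls, fetch, workers=4, retries=3):
    """Fetch every unique URL on a thread pool; return {url: content or None}."""
    unique = dedupe(urls)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            url: pool.submit(fetch_with_retry, fetch, url, retries)
            for url in unique
        }
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception:
                results[url] = None  # record the failure instead of aborting the crawl
    return results
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP client; in practice it could be a small wrapper around `urllib.request.urlopen` or a similar call, and the returned dictionary would then feed the parser and cleaning steps.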