Cyber Threat Intelligence Investigation & Cloud based Web Scraping

Hippolyte Quéré (Hippie)

1. OSINT
2. Online storage website
3. Web scraping
1. OSINT
OSINT?

Open Source INTelligence
2. Online storage website
Online storage website

Me

My friend

My needs
- Send huge files
- No registration
- I care about my personal data
- Free
anonfiles.com
- Free
- Anonymous
- 20GB file upload
- Unlimited bandwidth!


Analyse the website
3. Scraping
Web scraping


❤️ podalirius.net
Legal disclaimer
- Legal as long as the information is public and contains no personal data
- No resale possible, because it would infringe the copyright on the original data
- In reality it is possible, but complicated
- Many special cases
Ethical web scraping
- APIs are often the best solution
- Respect the robots.txt file
- Read the Terms and Conditions
- Identify yourself with a user-agent
- Respect the data
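A minimal sketch of two of the points above: checking robots.txt before fetching and identifying yourself with a user-agent. The robots.txt content, the example.org URLs and the user-agent string are assumptions made up for illustration; the rules are inlined so the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example needs no network.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

# Hypothetical user-agent: says who is scraping and how to reach them.
USER_AGENT = "research-scraper/0.1 (contact: you@example.org)"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt rules given as a string."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """Return True if the rules allow our user-agent to fetch this URL."""
    return parser.can_fetch(USER_AGENT, url)


parser = build_parser(ROBOTS_TXT)
print(may_fetch(parser, "https://example.org/public/page"))
print(may_fetch(parser, "https://example.org/private/page"))
print(parser.crawl_delay(USER_AGENT))  # seconds to wait between requests
```

In a real scraper the same USER_AGENT string would also be sent in the User-Agent header of every request, and the crawl delay would be honoured between requests.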
Let's scrape!

10 characters long containing only upper and lower case letters and numbers
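The format described above can be sketched as follows, assuming the identifier uses exactly the 62 ASCII letters and digits and is always 10 characters long. The names ALPHABET, is_valid_id and iter_ids are hypothetical helpers, not part of the original scraper.

```python
import itertools
import re
import string

# Assumption: 26 lower-case + 26 upper-case letters + 10 digits = 62 symbols.
ALPHABET = string.ascii_letters + string.digits
ID_LENGTH = 10
ID_PATTERN = re.compile(r"[A-Za-z0-9]{10}")


def is_valid_id(candidate: str) -> bool:
    """Return True if the string matches the 10-character format."""
    return ID_PATTERN.fullmatch(candidate) is not None


def iter_ids(length: int = ID_LENGTH):
    """Lazily enumerate every possible identifier, in alphabet order."""
    for symbols in itertools.product(ALPHABET, repeat=length):
        yield "".join(symbols)


# Only the first few candidates are taken; the full space is far too large.
first_three = list(itertools.islice(iter_ids(), 3))
print(first_three)
print(is_valid_id("aB3dE5gH7j"), is_valid_id("too-short"))
```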

How long will it take?


Quick math time
= 62^10 (size)
= 839,299,365,868,340,224 (8.39 × 10^17)
100,000,000 -> 22.10 seconds
= 184,645,860,491.03484928 seconds
= 2,137,104 days 20 h 48 min 11 s
≈ 5,855 years
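The arithmetic above can be re-checked with a few lines of Python. This is a sketch under one assumption: it uses 22 seconds per 100,000,000 identifiers, the rate that reproduces the total number of seconds shown on the slide. The constant and function names are illustrative.

```python
ALPHABET_SIZE = 62          # upper case + lower case + digits
ID_LENGTH = 10
BATCH_SIZE = 100_000_000    # identifiers per timed batch
SECONDS_PER_BATCH = 22      # assumption: rate matching the slide's total


def keyspace(alphabet_size: int, length: int) -> int:
    """Number of distinct identifiers of the given length."""
    return alphabet_size ** length


def split_duration(total_seconds: int) -> tuple[int, int, int, int]:
    """Break a duration in seconds into (days, hours, minutes, seconds)."""
    days, rest = divmod(total_seconds, 86_400)
    hours, rest = divmod(rest, 3_600)
    minutes, seconds = divmod(rest, 60)
    return days, hours, minutes, seconds


total_ids = keyspace(ALPHABET_SIZE, ID_LENGTH)
# Integer arithmetic keeps the result exact; fractions of a second are dropped.
total_seconds = total_ids * SECONDS_PER_BATCH // BATCH_SIZE
days, hours, minutes, seconds = split_duration(total_seconds)
years = days / 365  # 365-day years, ignoring leap years

print(f"{total_ids:,} identifiers")
print(f"{total_seconds:,} seconds")
print(f"{days:,} days {hours} h {minutes} min {seconds} s")
print(f"about {years:,.0f} years")
```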
How does my scraper work?



Summary of what you have to deal with
- Request rate limits based on UA, language, country, keywords, ...
- IP banning
- CAPTCHAs
- Error handling
- Changes in the detection system
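One common way to cope with rate limits and errors is to retry with exponential backoff and jitter, slowing down whenever the target pushes back. The sketch below is an assumption about how that could look, not the scraper presented in the talk; Blocked, fetch_with_backoff and flaky_fetch are hypothetical names, and the fake fetch function stands in for a real HTTP call.

```python
import random
import time
from typing import Callable


class Blocked(Exception):
    """Raised by the (hypothetical) fetch function on a ban, CAPTCHA or rate limit."""


def fetch_with_backoff(
    fetch: Callable[[str], str],
    url: str,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url), waiting longer after each blocked attempt."""
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except Blocked:
            if attempt == max_attempts - 1:
                raise  # give up and let the caller log the failure
            # Exponential delay (1x, 2x, 4x, ...) plus random jitter.
            delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")


# Demo with a fake fetch that is blocked twice and then succeeds.
calls = {"count": 0}


def flaky_fetch(url: str) -> str:
    calls["count"] += 1
    if calls["count"] <= 2:
        raise Blocked(url)
    return "ok"


delays: list[float] = []
# delays.append replaces time.sleep so the demo finishes instantly.
result = fetch_with_backoff(flaky_fetch, "https://example.org/", sleep=delays.append)
print(result, len(delays))
```

Passing the sleep function as a parameter keeps the retry logic testable without real waiting.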



Data results

Is Google lying?
I was stuck on one result: lumendatabase.org/XXXX

Future improvements
- Creating a job list with multiple pre-generated dorks
- Handle multiple slaves
- Data visualization of my results
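The first two improvements could be sketched as below: pre-generate a list of dorks, then spread the jobs round-robin across several machines. The sites, keywords and function names are placeholders chosen for illustration, and the dork shape (a site: filter plus a quoted keyword) is only one possible form.

```python
import itertools

# Hypothetical inputs; a real job list would use the investigation's own terms.
SITES = ["example.org", "example.com"]
KEYWORDS = ["invoice", "backup", "report"]


def build_dorks(sites: list[str], keywords: list[str]) -> list[str]:
    """Pre-generate one search query per (site, keyword) pair."""
    return [
        f'site:{site} "{keyword}"'
        for site, keyword in itertools.product(sites, keywords)
    ]


def assign_jobs(jobs: list[str], worker_count: int) -> list[list[str]]:
    """Distribute jobs round-robin so every worker gets a similar share."""
    queues: list[list[str]] = [[] for _ in range(worker_count)]
    for index, job in enumerate(jobs):
        queues[index % worker_count].append(job)
    return queues


jobs = build_dorks(SITES, KEYWORDS)
queues = assign_jobs(jobs, worker_count=2)
for worker_id, queue in enumerate(queues):
    print(worker_id, queue)
```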
Conclusion

The end

Rhackgondins ❤
