Levelling Up Your Web Scraping Game

NomadPHP US June 2021

We're Hiring!

What we'll cover

Principle 1: It's all about the data

That data gets to a normal user over HTTPS (at least in most cases)

Principle 2: Choose runtime tradeoffs wisely

PHP

Headless Browser

Principle 3: Try to hit edge cases during dev

Tool #1: Firefox

Demo #1: LocalCallingGuide

Demo #2: Hippo

...but you probably won't use an interactive CLI

Save/restore cookies between requests
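
A minimal sketch of saving and restoring cookies between scraper runs, assuming a Netscape-format cookies.txt file on disk. It is written in Python with the standard library purely for illustration; in PHP, cURL's CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE options cover the same ground. The cookie name, value, domain, and helper function names below are hypothetical.

```python
import os
import time
import urllib.request
from http.cookiejar import Cookie, MozillaCookieJar


def make_session_cookie(name: str, value: str, domain: str) -> Cookie:
    """Build a cookie by hand (hypothetical values).

    In a real scraper the server sets cookies via Set-Cookie headers;
    this helper only exists so the example runs without network access.
    """
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True,
        domain_initial_dot=domain.startswith("."),
        path="/", path_specified=True,
        secure=True,
        expires=int(time.time()) + 3600,  # valid for one hour
        discard=False,
        comment=None, comment_url=None, rest={},
    )


def save_cookies(jar: MozillaCookieJar, path: str) -> None:
    """Write every cookie, including session cookies, to disk."""
    jar.save(path, ignore_discard=True, ignore_expires=True)


def load_cookies(path: str) -> MozillaCookieJar:
    """Return a jar restored from disk, or an empty jar on first run."""
    jar = MozillaCookieJar()
    if os.path.exists(path):
        jar.load(path, ignore_discard=True, ignore_expires=True)
    return jar


def make_opener(jar: MozillaCookieJar) -> urllib.request.OpenerDirector:
    """Create an opener that sends and records cookies through the jar."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


if __name__ == "__main__":
    import tempfile

    cookie_path = os.path.join(tempfile.mkdtemp(), "cookies.txt")

    # First run: pretend a login response set a session cookie, then persist it.
    jar = load_cookies(cookie_path)
    jar.set_cookie(make_session_cookie("session_id", "abc123", ".example.com"))
    save_cookies(jar, cookie_path)

    # Later run: restore the jar and build an opener that reuses the session.
    restored = load_cookies(cookie_path)
    opener = make_opener(restored)
    print([cookie.name for cookie in restored])
```

Passing ignore_discard=True matters here: many login sessions use cookies without an expiry date, and those are skipped on save by default.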

You may need to look at what gets stored in localStorage/sessionStorage.

If you need to use a real browser...

Sometimes sites don't want you to scrape

Sometimes your data isn't in HTML/JS/XML

CFP closes Wednesday night

Thanks! Questions?

By Ian Littman
