Parsing HTML with Rust

The story: we need a simple workshop

Options

  • A game
  • A website
  • A parser

Possibility #1

1.a.: Figure out how to perform a GET with Rust

Hyper

https://github.com/hyperium/hyper

1.b.: Print a response status

1.c.: Print the page body content

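use std::io::Read;

use hyper::Client;
use hyper::header::Connection;

fn main() {
    // Synchronous client from the older hyper 0.x API.
    let client = Client::new();
    let mut response = client
        .get("url")
        .header(Connection::close())
        .send()
        .unwrap();
    let mut body = String::new();
    response.read_to_string(&mut body).unwrap();
    println!("Body:   {}", body);
}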

1.d.: Save body into a local HTML file

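use std::fs::File;
use std::io::{self, Write};
use std::path::Path;
/// Creates (or truncates) the file at `path` and writes `body` into it.
pub fn write(path: &Path, body: &str) -> io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(body.as_bytes())
}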
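
A possible way to try the helper, as a minimal sketch: the body string and the file name workshop_page.html are made up for illustration, and the file goes into the system temp directory. In the workshop the body would come from the GET response instead.

```rust
use std::env;
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

// The helper from the slide, written with `?` for error propagation.
pub fn write(path: &Path, body: &str) -> io::Result<()> {
    let mut f = File::create(path)?;
    f.write_all(body.as_bytes())
}

fn main() -> io::Result<()> {
    // Hypothetical body; in the workshop this would be the downloaded page.
    let body = "<html><body><h1>Hello</h1></body></html>";
    // Hypothetical file name inside the system temp directory.
    let path = env::temp_dir().join("workshop_page.html");
    write(&path, body)?;
    // Read the file back to confirm the round trip.
    let saved = fs::read_to_string(&path)?;
    assert_eq!(saved, body);
    println!("Saved {} bytes to {}", saved.len(), path.display());
    Ok(())
}
```

Returning io::Result from main lets the `?` operator be used there too, so the example needs no unwrap calls.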

1.e.: Use select.rs

https://github.com/utkarshkukreti/select.rs

What we can learn

  • Iterator
  • Struct | impl
  • get the right modules
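
The first two bullets could be illustrated with a small sketch like the one below. The Page struct and its lines_containing method are hypothetical names invented for this example; they are not part of hyper or select.rs. Only the standard library is used.

```rust
// A struct groups the data we care about for one downloaded page.
struct Page {
    url: String,
    body: String,
}

// An impl block attaches functions and methods to the struct.
impl Page {
    fn new(url: &str, body: &str) -> Page {
        Page {
            url: url.to_string(),
            body: body.to_string(),
        }
    }

    // Returns an iterator over the lines of the body that contain `needle`.
    // Nothing is evaluated until the caller consumes the iterator.
    fn lines_containing<'a>(&'a self, needle: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.body.lines().filter(move |line| line.contains(needle))
    }
}

fn main() {
    let page = Page::new(
        "http://example.com",
        "<h1>Title</h1>\n<a href=\"/a\">A</a>\n<a href=\"/b\">B</a>",
    );
    // `collect` drives the iterator and gathers the matches into a Vec.
    let links: Vec<&str> = page.lines_containing("<a ").collect();
    println!("{} has {} link lines", page.url, links.len());
}
```

Matching on lines is only a stand-in for real parsing; it shows how iterator adapters such as filter and collect chain together.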

Possibility #2

Same as #1, but with scraper in place of select.rs

Scraper

https://github.com/programble/scraper

Possibility #3 - From Scratch
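
To give a feel for what "from scratch" might involve, here is a minimal tokenizer sketch using only the standard library. The Token enum and the tokenize function are hypothetical names for this illustration. It only splits input into tags and text; attributes, comments, entities and script content are deliberately ignored, so this is nowhere near a real HTML parser.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    // Everything between '<' and '>', e.g. "p" or "/p".
    Tag(String),
    // Everything between two tags.
    Text(String),
}

fn tokenize(html: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut rest = html;
    while !rest.is_empty() {
        if rest.starts_with('<') {
            match rest.find('>') {
                Some(end) => {
                    // Keep the tag content without the angle brackets.
                    tokens.push(Token::Tag(rest[1..end].to_string()));
                    rest = &rest[end + 1..];
                }
                None => {
                    // Unterminated tag: keep the remainder as plain text.
                    tokens.push(Token::Text(rest.to_string()));
                    break;
                }
            }
        } else {
            // Text runs until the next '<' or the end of the input.
            let end = rest.find('<').unwrap_or(rest.len());
            tokens.push(Token::Text(rest[..end].to_string()));
            rest = &rest[end..];
        }
    }
    tokens
}

fn main() {
    for token in tokenize("<h1>Parsing HTML</h1><p>with Rust</p>") {
        println!("{:?}", token);
    }
}
```

A natural next step in a workshop would be to turn the flat token list into a tree of nodes, which is where most of the real work starts.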

 

Which one is your favourite? :)

Thanks!