How to Get IP Banned Quick

AKA Web Scraping: The Workshop

Julie Cover

What we will do

  • Quick recap of HTML and how that works
  • Web Scraping with Python, Requests, and BS4
    • Because Python's the best
  • Build a tool that scrapes the CodeLabs Project Site: https://labs.codeday.org/gallery

HTML

  • Tags
  • Attributes
  • Nested Content
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>

    <h1 id="header">My First Heading</h1>
    <p class="par">
        My first paragraph.
        <img src="johnpeter.jpg">
    </p>

</body>
</html>
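
To make tags, attributes, and nesting concrete, here is a small sketch that walks the sample page and prints each tag indented by how deeply it is nested. It uses Python's built-in html.parser rather than BS4, so it runs without installing anything. The names SAMPLE, VOID_TAGS, and OutlineParser are made up for this illustration.

```python
from html.parser import HTMLParser

# The sample page from the slide, as a string.
SAMPLE = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
</head>
<body>
    <h1 id="header">My First Heading</h1>
    <p class="par">
        My first paragraph.
        <img src="johnpeter.jpg">
    </p>
</body>
</html>"""

# Void elements never get a closing tag, so they must not increase the depth.
VOID_TAGS = {"meta", "img", "br", "hr", "input", "link"}


class OutlineParser(HTMLParser):
    """Collects one line per opening tag, indented by nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs, e.g. [("id", "header")].
        attr_text = " ".join(f'{name}="{value}"' for name, value in attrs)
        self.lines.append("  " * self.depth + f"{tag} {attr_text}".rstrip())
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS:
            self.depth -= 1


parser = OutlineParser()
parser.feed(SAMPLE)
outline = parser.lines
print("\n".join(outline))
```

In the printed outline, img sits one level under p, which is what "nested content" means: the image lives inside the paragraph.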

Slides Done! Time for Code

Please open Google Colab and create a new notebook:

https://colab.research.google.com/
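
As a possible starting point for the notebook, here is a minimal sketch of the Requests + BS4 workflow: download a page, parse it, and pull out every link. It assumes requests and beautifulsoup4 are installed (Colab typically ships with both). The helper names fetch and extract_links are made up for this example, and it does not assume anything about how the gallery page is actually structured, so it only collects generic links.

```python
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

GALLERY_URL = "https://labs.codeday.org/gallery"


def fetch(url):
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def extract_links(html, base_url=GALLERY_URL):
    """Return (link text, absolute URL) pairs for every <a> tag with an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        (a.get_text(strip=True), urljoin(base_url, a["href"]))
        for a in soup.find_all("a", href=True)
    ]


# Usage in a notebook cell (this makes one real request to the site):
# for text, href in extract_links(fetch(GALLERY_URL)):
#     print(text, href)
```

Keeping the download and the parsing in separate functions means you can fetch the page once, store the HTML in a variable, and rerun the parsing as often as you like without hitting the server again.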

What to do next

  • The Sky's the Limit!
  • Learn about XPath, which can select elements by their position in the page when class names are auto-generated
  • Please don't scrape Wikipedia
    • Scraping strains their servers, and they provide everything for free
    • They have an API, so use that instead
  • Getting IP Banned
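
For the Wikipedia point above, here is a rough sketch of asking the MediaWiki API for an article's intro text instead of scraping the page. It uses the standard-library urllib so it runs anywhere; requests.get(API_URL, params=...) would work the same way. The function names and the User-Agent string are placeholders invented for this example, so swap in your own contact details.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_summary_url(title):
    """Build a query URL asking for the plain-text intro of one article."""
    params = {
        "action": "query",
        "prop": "extracts",
        "exintro": 1,       # only the section before the first heading
        "explaintext": 1,   # plain text instead of HTML
        "titles": title,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def fetch_summary(title, user_agent="workshop-example/0.1 (placeholder contact)"):
    """Fetch the intro text for a title; returns "" if the page has no extract."""
    request = urllib.request.Request(
        build_summary_url(title),
        headers={"User-Agent": user_agent},  # identify yourself to the server
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        data = json.load(response)
    # Results are keyed by page ID, so take the first (and only) page.
    pages = data["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")


# Usage (this makes one real request):
# print(fetch_summary("Web scraping"))
```

One request returns structured JSON, so there is no HTML to parse and far less load on the server than crawling pages.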
