PyEugene

Python 101 - Web Scraping

WifiPassword: LocKEtlYiNIG

Thanks to IDX Broker for hosting us.

https://www.idxbroker.com/

Seth Dudenhofer

sdudenhofer@gmail.com

t/f/i/l sdudenhofer

A word of warning.

Needed Packages

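import bs4            # Beautiful Soup (installed as beautifulsoup4)
import requests       # HTTP client used to download the page
import pandas as pd   # used later to put the links in a table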

Import your libraries

url = "http://planetpython.org"
source = requests.get(url)  # download the page
print(source)               # prints the response object, e.g. <Response [200]>

Use requests to grab the HTML file.

We are going to use Planet Python
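A request can fail (bad URL, timeout, error status), so it may help to wrap the download in a small helper. The sketch below is one possible approach, not part of the original slides: the function name fetch_html and the 10-second timeout are assumptions chosen for illustration.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, or None if the request fails.

    fetch_html and the default timeout are hypothetical choices
    for this sketch, not something defined in the slides.
    """
    try:
        response = requests.get(url, timeout=timeout)
        response.raise_for_status()  # raise an exception on 4xx/5xx statuses
    except requests.exceptions.RequestException as error:
        # Covers malformed URLs, connection errors, timeouts and bad statuses
        print(f"Could not fetch {url}: {error}")
        return None
    return response.text
```

Calling fetch_html("http://planetpython.org") would then return the page's HTML as a string when the request succeeds.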

We just want to display the links.

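# Parse the downloaded HTML (the 'lxml' parser needs the lxml package installed)
soup = bs4.BeautifulSoup(source.content, 'lxml')

soup_array = []

# Collect the href attribute of every <a> tag on the page
for link in soup.find_all('a'):
    data = link.get('href')  # None if the tag has no href
    soup_array.append(data)
print(soup_array)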
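The raw list will usually contain some noise: None for anchors without an href, relative paths, in-page "#" links and duplicates. A minimal sketch of one way to tidy it up follows. The sample list and the helper name clean_links are made up for illustration; only the standard library is used.

```python
from urllib.parse import urljoin

base_url = "http://planetpython.org"

# Hypothetical sample standing in for the scraped soup_array
soup_array = ["/rss20.xml", None, "https://www.python.org/", "#top", "/rss20.xml"]


def clean_links(links, base):
    """Drop empty and in-page links, make the rest absolute, remove duplicates."""
    seen = set()
    cleaned = []
    for href in links:
        if not href or href.startswith("#"):
            continue  # skip missing hrefs and same-page anchors
        absolute = urljoin(base, href)  # resolve relative paths against the site
        if absolute not in seen:
            seen.add(absolute)
            cleaned.append(absolute)
    return cleaned


links = clean_links(soup_array, base_url)
print(links)
```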

How can we view this in a better way?

df = pd.DataFrame(soup_array)  # one row per link
print(df)
df.to_csv('test.csv')          # save the table as a CSV file
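The DataFrame can be made a little easier to read by naming the column and dropping empty and repeated rows. This is a sketch under assumptions: the column name "link" and the sample list are invented for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for the scraped soup_array
soup_array = [
    "https://www.python.org/",
    None,
    "/titles-only.html",
    "https://www.python.org/",
]

# Name the column, then remove missing values and duplicate links
df = pd.DataFrame({"link": soup_array})
df = df.dropna().drop_duplicates().reset_index(drop=True)
print(df)

# With no path given, to_csv returns the CSV text instead of writing a file;
# index=False leaves out the row-number column
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a filename, for example df.to_csv('test.csv', index=False), writes the same content to disk.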

Other ways to format this?
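One possible answer, sketched with the standard library only: count how many links point to each domain, and dump the list as JSON. The sample links are made up for illustration and are not actual output from Planet Python.

```python
import json
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample of absolute links
links = [
    "https://www.python.org/",
    "https://www.python.org/psf/",
    "https://pypi.org/",
]

# Count links per domain (the netloc part of each URL)
domain_counts = Counter(urlparse(link).netloc for link in links)
for domain, count in domain_counts.most_common():
    print(f"{domain}: {count}")

# Serialize the list as indented JSON text
links_json = json.dumps(links, indent=2)
print(links_json)
```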

By sdudenhofer

An introduction to web scraping in Python