Python 101 - Web Scraping
Thanks to IDX Broker for hosting us.
https://www.idxbroker.com/
Seth Dudenhofer
sdudenhofer@gmail.com
t/f/i/l sdudenhofer
Needed Packages
import bs4
import requests
import pandas as pd
Import your libraries
url = "http://planetpython.org"
source = requests.get(url)
print(source)  # shows the response status, e.g. <Response [200]>
Use requests to grab the HTML page.
We are going to use Planet Python.
We just want to display the links.
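The request step can fail quietly: an error page still comes back as a response object. Below is a minimal sketch of a more defensive fetch, assuming a hypothetical helper named fetch_html and a 10 second timeout (neither is from the slides). It only uses requests.get, raise_for_status, and the text attribute.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page body as text, raising on HTTP errors.

    fetch_html and the default timeout are illustrative choices,
    not part of the original talk.
    """
    response = requests.get(url, timeout=timeout)
    # Turn 4xx/5xx status codes into an exception instead of parsing an error page.
    response.raise_for_status()
    return response.text


# Usage (needs network access):
# html = fetch_html("http://planetpython.org")
# print(html[:200])
```

Any failure (bad URL, timeout, HTTP error) surfaces as a requests.exceptions.RequestException, so a single except clause is enough for a small script.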
soup = bs4.BeautifulSoup(source.content, 'lxml')  # the 'lxml' parser needs the lxml package installed
soup_array = []
for link in soup.find_all('a'):
    data = link.get('href')  # None when an <a> tag has no href
    soup_array.append(data)
print(soup_array)
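The list collected this way can contain None (anchors without an href), page-internal fragments such as "#top", relative paths, and duplicates. Here is a small sketch of one way to tidy it, using only the standard library. The helper name clean_links and the sample list are made up for illustration.

```python
from urllib.parse import urljoin


def clean_links(hrefs, base_url):
    """Drop empty and fragment-only hrefs, resolve relative ones, remove duplicates.

    clean_links is a hypothetical helper, not part of the original slides.
    """
    seen = set()
    cleaned = []
    for href in hrefs:
        # Skip None, empty strings, and same-page anchors like "#top".
        if not href or href.startswith("#"):
            continue
        # Resolve relative paths against the page they came from.
        absolute = urljoin(base_url, href)
        if absolute not in seen:
            seen.add(absolute)
            cleaned.append(absolute)
    return cleaned


# Illustrative input shaped like soup_array; not real scraped data.
sample = [None, "/about", "https://www.python.org/", "#top", "/about"]
print(clean_links(sample, "http://planetpython.org"))
# prints ['http://planetpython.org/about', 'https://www.python.org/']
```

Calling clean_links(soup_array, url) before building the DataFrame would give a table of absolute, unique URLs.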
How can we view this in a better way?
df = pd.DataFrame(soup_array)
print(df)
df.to_csv('test.csv')
Other ways to format this?
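One possible answer, sketched with only the standard library: summarize the links by domain and dump the raw list as JSON. The links list and the count_domains helper are illustrative assumptions. In practice the list would come from the scraping step.

```python
import json
from collections import Counter
from urllib.parse import urlparse

# Illustrative sample; in the talk this would be the scraped list of links.
links = [
    "http://planetpython.org/",
    "https://www.python.org/",
    "https://www.python.org/psf/",
]


def count_domains(urls):
    """Count how many links point at each domain (hypothetical helper)."""
    return Counter(urlparse(u).netloc for u in urls)


# Print a small frequency table, most common domain first.
for domain, n in count_domains(links).most_common():
    print(f"{n:3d}  {domain}")

# JSON is another easy format to hand to other tools.
print(json.dumps(links, indent=2))
```

The same summary is available in pandas through a named column and value_counts. The standard-library version is shown here only to keep the sketch dependency-free.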
By sdudenhofer
An introduction to web scraping in Python