Python 103
Web Programming Course
SUT • Fall 2018
Outline
- Files
- Modules and Packages
- Reading Web Pages
  - Regular Expressions
  - Python and Regexes
- Exception Handling
- Python Web Server
Files
Opening Files
- We can open a file for reading/writing using the open function
file = open('test.txt', 'w')
file.write('Hi!\n')
file.write('This is a test.\n')
file.close()
with open('test.txt', 'w') as file:
    file.write('Hi!\n')
    file.write('This is a test.\n')
class controlled_execution:
    def __enter__(self):
        # set things up
        return self  # the returned object is bound to the name after "as"
    def __exit__(self, type, value, traceback):
        # tear things down
        pass
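As a rough sketch of how the with statement drives these two methods, the class below (ManagedResource is a hypothetical name, not part of the course material) records each step so the order of calls becomes visible, including when an exception is raised inside the block.

```python
class ManagedResource:
    """Hypothetical resource that records its own lifecycle."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("set up")
        return self  # this value is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs whether or not the block raised an exception
        self.events.append("torn down")
        return False  # False means: do not suppress the exception


resource = ManagedResource()
try:
    with resource as r:
        r.events.append("working")
        raise RuntimeError("something went wrong")
except RuntimeError:
    resource.events.append("error propagated")

print(resource.events)
```

The tear-down step runs before the exception reaches the outer except block, which is the same guarantee that with open(...) gives for closing files.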
Reading Files
- File handles are iterable
- We can use handles to read files line by line
file = open('test.txt', 'r')
for line in file:
    print(line.rstrip())
file.close()
Reading Files at Once
- We can read the entire content of a file:
  - as a single string with read, or
  - as a list of strings with readlines
In [1]: open('test.txt').read()
Out[1]: 'Hi!\nThis is a test.\n'
In [2]: open('test.txt').readlines()
Out[2]: ['Hi!\n', 'This is a test.\n']
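The three ways of reading can be compared side by side. The sketch below writes the same two lines to a temporary file (the temporary directory is an assumption made only to keep the example self-contained) and then reads them back by iteration, with read, and with readlines.

```python
import os
import tempfile

# Assumption: a throwaway directory, so the example does not touch real files
path = os.path.join(tempfile.mkdtemp(), "test.txt")

with open(path, "w") as f:
    f.write("Hi!\n")
    f.write("This is a test.\n")

# 1) Iterate over the handle: one line at a time, newline stripped by us
with open(path) as f:
    stripped = [line.rstrip() for line in f]

# 2) read(): the whole file as one string
with open(path) as f:
    whole = f.read()

# 3) readlines(): a list of lines, newlines included
with open(path) as f:
    lines = f.readlines()

print(stripped)
print(repr(whole))
print(lines)
```

Iterating is usually the better choice for large files, since read and readlines load the whole content into memory.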
Modules and Packages
Modules
# mymodule.py
def foo():
    pass
bar = 10
# test.py
import mymodule
mymodule.foo()
- A module is a file containing Python definitions and statements to be used in other Python programs
Importing Modules
- There are three different ways to import a module
import math
math.pi
from math import pi, cos
cos(pi)
import math as m
m.pi
Packages
- We can organize modules inside packages, and access them via dot notation
- A package is simply a directory containing an (empty) __init__.py file
App/
    __init__.py
    test.py
    Tools/
        __init__.py
        utils.py
        mytools.py
from App.Tools import utils
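To see the layout above in action, the following sketch builds a small App/Tools package in a temporary directory and imports from it. The double function inside utils.py is a made-up placeholder, and adding the temporary directory to sys.path is only there to make the example self-contained.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
tools_dir = os.path.join(root, "App", "Tools")
os.makedirs(tools_dir)

# Empty __init__.py files mark App and App/Tools as packages
for package_dir in (os.path.join(root, "App"), tools_dir):
    open(os.path.join(package_dir, "__init__.py"), "w").close()

# Hypothetical module content, for illustration only
with open(os.path.join(tools_dir, "utils.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# Make the directory that contains App/ importable
sys.path.insert(0, root)
importlib.invalidate_caches()

from App.Tools import utils

print(utils.double(21))  # 42
```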
Reading Web Pages
Retrieve a Page
- We can use the urlretrieve function to download any kind of content from the Internet
- The function is located in the request module of the urllib package
from urllib.request import urlretrieve
url = 'http://ce.sharif.edu/courses'
file_name = 'courses.html'
urlretrieve(url, file_name)
Opening a Socket
- We can alternatively open a socket to fetch the remote file
- The socket object behaves much like a file handle
from urllib.request import urlopen
url = 'http://ce.sharif.edu/courses'
socket = urlopen(url)
# read() returns bytes; decode them (assuming the page is UTF-8 encoded)
text = socket.read().decode('utf-8')
socket.close()
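A slightly more defensive variant is sketched below. The helper name fetch_text and its default_encoding parameter are assumptions for illustration; the idea is to let with close the connection and to take the encoding from the response headers when the server declares one. A data: URL is used so the example runs without network access.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding="utf-8"):
    """Return the body at `url` decoded to str (hypothetical helper)."""
    with urlopen(url) as response:  # the connection is closed automatically
        raw = response.read()  # bytes
        # Use the charset declared in Content-Type, if there is one
        charset = response.headers.get_content_charset() or default_encoding
    return raw.decode(charset)


# A data: URL keeps the example offline; an http(s) URL is handled the same way
print(fetch_text("data:text/plain;charset=utf-8,Hello%20World"))
```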
Requests
- Making a request with the third-party Requests library is very simple.
>>> import requests
>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
'{"type":"User"...'
>>> r.json()
{'private_gists': 419, 'total_private_repos': 77, ...}
Regular Expressions
- A regular expression (aka regex or regexp) is a sequence of characters that forms a search pattern
- Python supports regexes through the standard library re module
- Metacharacters:
  . ^ $ * + ? { } [ ] \ | ( )
import re
m = re.match('me', 'meanwhile')
if m is not None:
    print(m.group())
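Note that re.match only looks at the beginning of the string. A short sketch (the sample text is made up) contrasts it with re.search, which scans the whole string:

```python
import re

text = "meanwhile, call me"

at_start = re.match("me", text)        # "me" is at position 0, so this matches
not_at_start = re.match("call", text)  # "call" is not at the start: None
anywhere = re.search("call", text)     # search scans forward and finds it

print(at_start.span())    # (0, 2)
print(not_at_start)       # None
print(anywhere.span())    # (11, 15)
```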
Regular Expression Syntax
- Regular expressions are strings containing text and special characters (such as ? and *) that describe a pattern
- The choice operator | creates a regular expression that matches one of two things
if re.match('Ali|Hamid', user):
    pass  # user is valid
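One caveat with the check above: re.match only anchors the start, so a longer name that merely begins with Ali also passes. A minimal sketch, assuming the intent is to accept exactly these two names (is_valid_user is a hypothetical helper), uses re.fullmatch instead:

```python
import re


def is_valid_user(user):
    """Accept only the exact names 'Ali' or 'Hamid' (hypothetical rule)."""
    return re.fullmatch("Ali|Hamid", user) is not None


# match() accepts any string that starts with one of the alternatives
prefix_match = re.match("Ali|Hamid", "Alireza")
print(prefix_match.group())        # Ali

print(is_valid_user("Ali"))        # True
print(is_valid_user("Hamid"))      # True
print(is_valid_user("Alireza"))    # False
```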
Character Classes
- The character class operator [] matches any single character within the class
  - [abcd] is equivalent to a|b|c|d
- We can use a range of characters within a class
  - [a-f] is equivalent to [abcdef]
- We can also negate a class using the ^ operator
  - [^0-9] matches any non-digit character
- There are a few predefined character classes
Character Class | Meaning |
---|---|
\d | any digit [0-9] |
\w | any word character [0-9a-zA-Z_] (plus Unicode letters and digits by default) |
\s | any whitespace [ \t\n\r\f\v] |
. | any character (except \n) |
\D | any non-digit character [^0-9] |
\W | any non-word character [^\w] |
\S | any non-space character [^\s] |
- The following operators can be used to match the same expression repeatedly
- These operators are greedy: they match as much text as possible
Operator | Meaning |
---|---|
* | match 0 or more times |
+ | match 1 or more times |
? | match 0 or 1 times |
{n} | match exactly n times |
{n,} | match at least n times |
{n,m} | match at least n but not more than m times |
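Greediness is easiest to see on a small example. In the sketch below (the HTML snippet and the number string are made up), the greedy pattern swallows everything between the first < and the last >, while appending ? to the operator makes it match as little as possible.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last ">" in the string
greedy = re.findall(r"<.+>", html)

# Non-greedy: .+? stops at the first ">" it can
lazy = re.findall(r"<.+?>", html)

# {n,m} is greedy too: it takes up to three digits when available
counted = re.findall(r"\d{2,3}", "1 22 333 4444")

print(greedy)   # ['<b>bold</b> and <i>italic</i>']
print(lazy)     # ['<b>', '</b>', '<i>', '</i>']
print(counted)  # ['22', '333', '444']
```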
Special Characters
- There are some important special characters
- You can use ^ and $ to make sure your strings don't contain garbage
- This is good practice for validating user input
Special | Meaning |
---|---|
^ | match the beginning of the string |
$ | match the end of the string (or before the newline) |
if re.match(r'^\w*$', filename):
    pass  # this is a safe filename
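Two details of this check are worth knowing: $ also matches just before a trailing newline, and \w* accepts the empty string. A stricter sketch, assuming a safe filename means one or more word characters and nothing else (is_safe_filename is a hypothetical helper), could look like this:

```python
import re


def is_safe_filename(name):
    """Hypothetical rule: one or more word characters, nothing else."""
    return re.fullmatch(r"\w+", name) is not None


# The ^...$ version lets a trailing newline and the empty string through
loose_newline = re.match(r"^\w*$", "notes\n") is not None
loose_empty = re.match(r"^\w*$", "") is not None

print(loose_newline, loose_empty)       # True True
print(is_safe_filename("notes"))        # True
print(is_safe_filename("notes\n"))      # False
print(is_safe_filename(""))             # False
print(is_safe_filename("../etc"))       # False
```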
- Useful functions in the re module
Function | Meaning |
---|---|
match() | match pattern to string from the beginning |
search() | search for first occurrence of pattern in string |
compile() | compile a pattern for faster match |
findall() | find all (non-overlapping) occurrences of pattern |
finditer() | like findall but returns an iterator instead of list |
split() | split string according to pattern delimiter |
sub() | replace all occurrences of pattern by a string |
>>> re.findall(r'\w+', 'ali-ha 12!')
['ali', 'ha', '12']
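A few of the functions from the table, tried on small made-up inputs:

```python
import re

# compile() gives a reusable pattern object with the same methods
word = re.compile(r"\w+")
words = word.findall("ali-ha 12!")

# finditer() yields match objects, so positions are available too
spans = [m.span() for m in word.finditer("ali-ha 12!")]

# split() cuts the string wherever the pattern matches
parts = re.split(r"[,;]\s*", "a, b;c")

# sub() replaces every occurrence of the pattern
masked = re.sub(r"\d+", "#", "room 12, floor 3")

print(words)   # ['ali', 'ha', '12']
print(spans)   # [(0, 3), (4, 6), (7, 9)]
print(parts)   # ['a', 'b', 'c']
print(masked)  # room #, floor #
```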
- Modifiers (flags), passed as an optional argument after the pattern and the string, control aspects of the RE matching process
Modifier | Meaning |
---|---|
re.I | performs case-insensitive matching |
re.M | treats string as a multiline string |
re.S | makes . match any character including newline |
re.X | ignores whitespace in the pattern (for readability) |
re.A | Makes several escapes like \w, \b, \s and \d match only on ASCII characters with the respective property. |
>>> re.findall(r'^a\w+', 'ali\nA12!', re.M | re.I)
['ali', 'A12']
- The output of match() and search() functions, if successful, is a match object
- Match objects have three primary methods: group(), groups() and groupdict()
>>> re.match(r'(\w+)-(\w+)', 'ali-ha').group()
'ali-ha'
>>> re.match(r'(\w+)-(\w+)', 'ali-ha').groups()
('ali', 'ha')
>>> re.match(r'(?P<k>\w+)', 'ali-ha').groupdict()
{'k': 'ali'}
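Named groups are handy when a pattern has several parts. As a small sketch (the address format and group names are chosen only for illustration), the same match object can be queried by position or by name:

```python
import re

m = re.match(r"(?P<user>\w+)@(?P<host>[\w.]+)", "ali@ce.sharif.edu")

print(m.group())        # whole match: ali@ce.sharif.edu
print(m.group(1))       # first group by position: ali
print(m.group("host"))  # group by name: ce.sharif.edu
print(m.groups())       # ('ali', 'ce.sharif.edu')
print(m.groupdict())    # {'user': 'ali', 'host': 'ce.sharif.edu'}
```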
Exception Handling
- Syntax Errors vs. Exceptions
In [1]: while True print('Hello world')
File "<ipython-input-1-2b688bc740d7>", line 1
while True print('Hello world')
^
SyntaxError: invalid syntax
In [2]: 10 * (1/0)
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
<ipython-input-2-0b280f36835c> in <module>
----> 1 10 * (1/0)
ZeroDivisionError: division by zero
Exception Handling
- try–except block
In [1]: try:
   ...:     x = int(input("Please enter a number"))
   ...: except ValueError:
   ...:     print("Input some number!")
Please enter a number 12
Exception Handling
- Multiple excepts
In [1]: import sys
   ...:
   ...: try:
   ...:     f = open('myfile.txt')
   ...:     s = f.readline()
   ...:     i = int(s.strip())
   ...: except IOError as err:
   ...:     print("I/O error: {0}".format(err))
   ...: except ValueError:
   ...:     print("Could not convert data to an integer.")
   ...: except:
   ...:     print("Unexpected error:", sys.exc_info()[0])
   ...:     raise
I/O error: [Errno 2] No such file or directory: 'myfile.txt'
Exception Handling
- try–except–else
  - Better than adding additional code to the try block: it avoids accidentally catching exceptions raised by code that was not meant to be protected
In [1]: try:
   ...:     f = open(arg, 'r')
   ...: except IOError:
   ...:     print('cannot open', arg)
   ...: else:
   ...:     print(arg, 'has', len(f.readlines()), 'lines')
   ...:     f.close()
Exception Handling
- try–except–else–finally–😱😱😱😱
In [1]: def divide(x, y):
   ...:     try:
   ...:         result = x / y
   ...:     except ZeroDivisionError:
   ...:         print("division by zero!")
   ...:     else:
   ...:         print("result is", result)
   ...:     finally:
   ...:         print("executing finally clause")
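To make the order of the clauses explicit, here is a variant of divide that records which clauses ran instead of printing (the steps list and the returned tuple are additions for illustration only):

```python
def divide(x, y):
    steps = []  # names of the clauses that executed, in order
    try:
        result = x / y
    except ZeroDivisionError:
        steps.append("except")
        result = None
    else:
        # Runs only if the try block raised nothing
        steps.append("else")
    finally:
        # Runs in every case, after except or else
        steps.append("finally")
    return result, steps


print(divide(2, 1))  # (2.0, ['else', 'finally'])
print(divide(2, 0))  # (None, ['except', 'finally'])
```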
Raising Exceptions
- raise keyword
- Is it useful?
In [1]: raise NameError('HiThere')
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-1-72c183edb298> in <module>
----> 1 raise NameError('HiThere')
NameError: HiThere
In [1]: if not UserAccount.objects.filter(id=10).exists():
   ...:     raise Http4
Raising Exceptions
- raise keyword
  - A bare raise inside an except block re-raises the exception being handled
In [1]: try:
   ...:     raise NameError('HiThere')
   ...: except NameError:
   ...:     print('An exception flew by!')
   ...:     raise
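One common use of raise is to report invalid input with an application-specific exception. The sketch below is only an illustration: InvalidAgeError and parse_age are hypothetical names, and raise ... from keeps the original error attached as the cause.

```python
class InvalidAgeError(ValueError):
    """Hypothetical application-specific error."""


def parse_age(text):
    try:
        age = int(text)
    except ValueError as err:
        # Translate the low-level error, keeping it as the cause
        raise InvalidAgeError("not a number: {!r}".format(text)) from err
    if age < 0:
        raise InvalidAgeError("age cannot be negative")
    return age


print(parse_age("12"))  # 12

try:
    parse_age("abc")
except InvalidAgeError as err:
    caught = err
    print("rejected:", err)
```

Because InvalidAgeError subclasses ValueError, callers that already handle ValueError keep working.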
Python Web Server
python3 -m "http.server" 8001
from http.server import HTTPServer, BaseHTTPRequestHandler
def run(server_class=HTTPServer, handler_class=BaseHTTPRequestHandler):
    server_address = ('', 8000)
    httpd = server_class(server_address, handler_class)
    httpd.serve_forever()
Python Web Server
- Server class
  - handles the necessary communication between client & server
  - HTTPServer (subclass of TCPServer)
- Handler class
  - processes the request and prepares the response
Handler Class
You must give HTTPServer a RequestHandlerClass
class http.server.BaseHTTPRequestHandler(request, client_address, server)
class http.server.SimpleHTTPRequestHandler(request, client_address, server)
class http.server.CGIHTTPRequestHandler(request, client_address, server)
BaseHTTPRequestHandler
- Handles HTTP requests
  - do_GET(), do_POST(), etc.
- urllib.parse
  - urllib.parse.urlparse is your friend
  - urllib.parse.parse_qs is your best friend
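Putting these pieces together, a handler might look roughly like the sketch below. GreetingHandler, greeting_for, and the name query parameter are all made up for illustration; only the http.server and urllib.parse calls come from the standard library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def greeting_for(path):
    """Build a greeting from a request path such as '/hello?name=Ali'."""
    query = parse_qs(urlparse(path).query)  # e.g. {'name': ['Ali']}
    name = query.get("name", ["World"])[0]
    return "Hello, {}!".format(name)


class GreetingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path holds the request target, including the query string
        body = greeting_for(self.path).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    """Serve until interrupted (not called here, since it blocks)."""
    HTTPServer(("", port), GreetingHandler).serve_forever()


print(greeting_for("/hello?name=Ali"))  # Hello, Ali!
```

Calling run() and opening http://localhost:8000/hello?name=Ali in a browser should show the greeting.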
Web Server
import socket
HOST, PORT = '', 8888
listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print('Serving HTTP on port %s ...' % PORT)
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    print(request.decode('utf-8', errors='replace'))
    # Sockets send bytes; a blank line separates the status line from the body
    http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
    client_connection.sendall(http_response)
    client_connection.close()
Web Server
$ telnet localhost 8888
Trying 127.0.0.1 …
Connected to localhost.
GET /hello HTTP/1.1
HTTP/1.1 200 OK
Hello, World!
Frameworks
WSGI
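WSGI is not a web server, a Python module, a framework, an API, or any kind of software (or a cat, though we are not sure about the last one). It is an interface specification between web servers and Python web applications.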
WSGI
def run_application(application):
    """Server code."""
    # This is where an application/framework stores
    # an HTTP status and HTTP response headers for the server
    # to transmit to the client
    headers_set = []
    # Environment dictionary with WSGI/CGI variables
    environ = {}
    def start_response(status, response_headers, exc_info=None):
        headers_set[:] = [status, response_headers]
    # Server invokes the 'application' callable and gets back the
    # response body
    result = application(environ, start_response)
    # Server builds an HTTP response and transmits it to the client
    ...
def app(environ, start_response):
    """A barebones WSGI app."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!']  # the body must be bytes in Python 3
run_application(app)
WSGI Interface
- environ: CGI-style variables (often including everything in os.environ) + data about the HTTP request
  - All HTTP headers are available as HTTP_{headername}
  - Some variables related to WSGI, named wsgi.*
- start_response: a callable which the application calls to set the status code & headers for the response (before the body)
- The return value is an iterable object (yielding bytes in Python 3).
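The interface can be exercised without any server at all. In the sketch below, the environ dictionary and the start_response stand-in are hand-built assumptions that play the server's role; a real server would fill in many more keys.

```python
def app(environ, start_response):
    """A small WSGI app that echoes parts of the request."""
    path = environ.get("PATH_INFO", "/")
    # The User-Agent request header arrives as HTTP_USER_AGENT
    agent = environ.get("HTTP_USER_AGENT", "unknown")
    body = "path={} agent={}".format(path, agent).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]  # an iterable of bytes


# Stand-in for the server side: capture what the app reports
captured = {}


def start_response(status, response_headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = response_headers


# Hand-built, minimal environ (an assumption for this demo)
environ = {
    "REQUEST_METHOD": "GET",
    "PATH_INFO": "/hello",
    "HTTP_USER_AGENT": "demo-client",
}

body = b"".join(app(environ, start_response))
print(captured["status"])  # 200 OK
print(body)                # b'path=/hello agent=demo-client'
```

To serve the same app over HTTP, the standard library's wsgiref.simple_server.make_server('', 8000, app).serve_forever() is one option.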
By Behnam Hatami