Python 103

Web Programming Course

SUT • Fall 2018

Outline

  • Files

  • Modules and Packages

  • Reading Web Pages

  • Regular Expressions
  • Python and Regexes
  • Exception Handling

  • Python Web Server

Files

Opening Files

  • We can open a file for reading/writing using the open
    function

file = open('test.txt', 'w')
file.write('Hi!\n')
file.write('This is a test.\n')
file.close()

with open('test.txt', 'w') as file:
    file.write('Hi!\n')
    file.write('This is a test.\n')


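class controlled_execution:
    def __enter__(self):
        # set things up
        return thing  # "thing" stands for whatever the with statement binds after "as"
    def __exit__(self, type, value, traceback):
        # tear things down (runs even if the with-body raises)
        pass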
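
The skeleton above can be made concrete. Below is a minimal sketch, assuming a hypothetical Resource class (not part of the course material) that only records when it is entered and exited; it illustrates that __exit__ runs even when the body of the with statement raises.

```python
class Resource:
    """Hypothetical resource that logs its own lifecycle."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')  # set things up
        return self                  # this object is bound to the name after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')   # tear things down, even after an error
        return False                 # False: do not swallow the exception


resource = Resource()
try:
    with resource as r:
        r.events.append('body')
        raise ValueError('boom')
except ValueError:
    pass

print(resource.events)  # ['enter', 'body', 'exit']
```

File objects implement the same two methods, which is why "with open(...)" closes the file automatically.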

Reading Files

  • File handles are iterable
  • We can iterate over a handle to read a file line by line

file = open('test.txt', 'r')
for line in file:
    print(line.rstrip())

file.close()

Reading Files at Once

  • We can read the entire content of a file:

    • as a single string by read, or

    • as a list of strings by readlines

in[0]: open('test.txt').read()
out[0]: 'Hi!\nThis is a test.\n'

in[1]: open('test.txt').readlines()
out[1]: ['Hi!\n', 'This is a test.\n']
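
The one-liners above leave the file open until the handle is garbage collected. As a hedged sketch, the same reads can be wrapped in with blocks; the temporary directory and the variable names are assumptions made so the example runs anywhere.

```python
import os
import tempfile

# Write the sample file into a throwaway directory (assumed location).
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
with open(path, 'w') as f:
    f.write('Hi!\n')
    f.write('This is a test.\n')

with open(path) as f:
    whole = f.read()          # one string, newlines included

with open(path) as f:
    lines = f.readlines()     # list of strings, each keeping its '\n'

with open(path) as f:
    stripped = [line.rstrip('\n') for line in f]  # iterate and drop newlines

print(repr(whole))
print(lines)
print(stripped)
```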

Modules and Packages

Modules

# mymodule.py
def foo():
    pass
bar = 10

# test.py
import mymodule
mymodule.foo()
  • A module is a file containing Python definitions
    and statements to be used in other Python
    programs

Importing Modules

  • There are three different ways to import a
    module

import math
math.pi

from math import pi, cos
cos(pi)

import math as m
m.pi

Packages

  • We can organize modules inside packages, and access them via dot notation
  • A package is simply a directory containing an
    (empty) __init__.py file
App/
    __init__.py
    test.py
    Tools/
        __init__.py
        utils.py
        mytools.py

from App.Tools import utils

Reading Web Pages

Retrieve a Page

  • We can use the urlretrieve function to download any
    kind of content from the Internet
  • The function is located in the request module of the urllib
    package
from urllib.request import urlretrieve

url = 'http://ce.sharif.edu/courses'
file_name = 'courses.html'

urlretrieve(url, file_name)

Opening a Socket

  • We can alternatively open a socket-like connection to fetch the
    remote file
  • The returned object behaves much like a file handle
from urllib.request import urlopen

url = 'http://ce.sharif.edu/courses'
socket = urlopen(url)
text = socket.read().decode('utf-8')  # read bytes from the response (not from url); assumes a UTF-8 page

socket.close()
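
Because the object returned by urlopen is file-like, it also works in a with block. The sketch below wraps that in a hypothetical helper, fetch_text, which is not part of urllib; the UTF-8 fallback is an assumption. A data: URL is used only so the example runs without network access, and an http:// URL can be passed the same way.

```python
from urllib.request import urlopen


def fetch_text(url, default_encoding='utf-8'):
    """Return the body found at url as a str (hypothetical helper)."""
    with urlopen(url) as response:      # closed automatically on exit
        raw = response.read()           # bytes
        # Use the charset declared in the Content-Type header, if any.
        charset = response.headers.get_content_charset() or default_encoding
    return raw.decode(charset)


print(fetch_text('data:text/plain;charset=utf-8,Hello%20web'))  # Hello web
```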

Request

  • Making a request with Requests is very simple.

>>> import requests

>>> r = requests.get('https://api.github.com/user', auth=('user', 'pass'))
>>> r.status_code
200
>>> r.headers['content-type']
'application/json; charset=utf8'
>>> r.encoding
'utf-8'
>>> r.text
'{"type":"User"...'
>>> r.json()
{'private_gists': 419, 'total_private_repos': 77, ...}

Regular Expressions

  • A regular expression (aka regex or regexp) is a
    sequence of characters that forms a search
    pattern
  • Python supports regexes through the standard
    library re module
  • meta characters
    • . ^ $ * + ? { } [ ] \ | ( )
import re

m = re.match('me', 'meanwhile')
if m is not None:
    print(m.group())

Regular Expression Syntax

  • Regular expressions are strings containing text
    and special characters (such as ? and *) that
    describe a pattern
  • The choice | operator creates a regular
    expression that matches one of two things
if re.match('Ali|Hamid', user):
    pass  # user is valid
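
One caveat worth illustrating: re.match only anchors at the beginning of the string, so the pattern above also accepts names that merely start with Ali or Hamid. The sketch below compares it with re.fullmatch; the helper names are made up for the example.

```python
import re


def is_valid_prefix(user):
    # match() succeeds if the string *starts with* Ali or Hamid
    return re.match(r'Ali|Hamid', user) is not None


def is_valid_exact(user):
    # fullmatch() succeeds only if the *whole* string is Ali or Hamid
    return re.fullmatch(r'Ali|Hamid', user) is not None


for name in ['Ali', 'Hamid', 'Alice', 'Reza']:
    print(name, is_valid_prefix(name), is_valid_exact(name))
# Ali True True
# Hamid True True
# Alice True False
# Reza False False
```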

Character Classes

  • The character class operator [] matches
    any single character within the class
    • [abcd] is equivalent to a|b|c|d
  • We can use a range of characters within a class
    • [a-f] is equivalent to [abcdef]
  • We can also negate a class using the ^ operator
    • [^0-9] matches any non-digit character
  • There are a few predefined character classes

    Character Class   Meaning
    \d                any digit [0-9]
    \w                any word character [0-9a-zA-Z_]
    \s                any whitespace [ \t\n\r\f\v]
    .                 any character (except \n)
    \D                any non-digit character [^0-9]
    \W                any non-word character [^\w]
    \S                any non-space character [^\s]

  • The following operators can be used to match
    the same expression repeatedly

    Operator   Meaning
    *          match 0 or more times
    +          match 1 or more times
    ?          match 0 or 1 times
    {n}        match exactly n times
    {n,}       match at least n times
    {n,m}      match at least n but not more than m times

  • These operators are greedy: they match as
    much text as possible
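
Greediness is easiest to see side by side with its opposite. In Python's re module, appending ? to a repetition operator makes it non-greedy (lazy). The sample strings below are made up for illustration.

```python
import re

html = '<b>bold</b> and <i>italic</i>'

# Greedy: .+ runs to the last '>' in the string.
greedy = re.findall(r'<.+>', html)

# Lazy: .+? stops at the first '>' it can.
lazy = re.findall(r'<.+?>', html)

# Bounded repetition: runs of 2 to 3 digits.
digits = re.findall(r'\d{2,3}', '1 22 333 4444')

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
print(digits)  # ['22', '333', '444']
```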

Special Characters

  • There are some important special characters

    Special   Meaning
    ^         match the beginning of the string
    $         match the end of the string (or just before a trailing newline)

  • You can use ^ and $ to make sure your strings
    don't contain garbage
    • This is good practice for validating user input

if re.match(r'^\w*$', filename):
    pass  # this is a safe filename
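
Two details of the check above are easy to miss: $ also matches just before a trailing newline, and \w* accepts the empty string. The sketch below shows a stricter variant using re.fullmatch and \w+; the function name is an assumption, not part of the slides.

```python
import re


def is_safe_filename(name):
    # fullmatch() must consume the entire string, trailing newline included,
    # and \w+ requires at least one character.
    return re.fullmatch(r'\w+', name) is not None


print(bool(re.match(r'^\w*$', 'notes\n')))  # True: $ matches before the final \n
print(is_safe_filename('notes\n'))          # False
print(is_safe_filename('notes'))            # True
print(is_safe_filename(''))                 # False
print(is_safe_filename('../etc/passwd'))    # False
```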
  • Useful functions in the re module

    Function     Meaning
    match()      match pattern to string from the beginning
    search()     search for first occurrence of pattern in string
    compile()    compile a pattern for faster repeated matching
    findall()    find all (non-overlapping) occurrences of pattern
    finditer()   like findall but returns an iterator of match objects
    split()      split string according to pattern delimiter
    sub()        replace all occurrences of pattern by a string

>>> re.findall(r'\w+', 'ali-ha 12!')
['ali', 'ha', '12']
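
As a quick illustration of the other functions in the table, here is a short sketch on the same sample string. The variable names are arbitrary.

```python
import re

text = 'ali-ha 12!'
word = re.compile(r'\w+')   # compile once, reuse the pattern object

first = word.search(text).group()                # first occurrence
spans = [m.span() for m in word.finditer(text)]  # positions of every match
parts = re.split(r'[- ]', text)                  # split on '-' or space
masked = re.sub(r'\d', '#', text)                # replace each digit

print(first)   # ali
print(spans)   # [(0, 3), (4, 6), (7, 9)]
print(parts)   # ['ali', 'ha', '12!']
print(masked)  # ali-ha ##!
```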
  • Modifiers (flags), passed as the optional flags argument,
    control aspects of the RE matching process

    Modifier   Meaning
    re.I       performs case-insensitive matching
    re.M       multiline mode: ^ and $ also match at the start and end of each line
    re.S       makes . match any character including newline
    re.X       ignores whitespace in the pattern (for readability)
    re.A       makes escapes like \w, \b, \s and \d match only ASCII characters

>>> re.findall(r'^a\w+', 'ali\nA12!', re.M | re.I)
['ali', 'A12']
  • The output of match() and search() functions, if successful, is a match object
  • Match objects have three primary methods,
    group(), groups() and groupdict()
>>> re.match(r'(\w+)-(\w+)', 'ali-ha').group()
'ali-ha'
>>> re.match(r'(\w+)-(\w+)', 'ali-ha').groups()
('ali', 'ha')
>>> re.match(r'(?P<k>\w+)', 'ali-ha').groupdict()
{'k': 'ali'}
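
Since match() and search() return None on failure, it is safer to test the result before calling group(). A small sketch with named groups follows; the sample sentence and group names are made up.

```python
import re

pattern = r'(?P<year>\d{4})-(?P<month>\d{2})'
sentence = 'Fall term starts 2018-09'

# match() anchors at the start of the string, so it fails here.
print(re.match(pattern, sentence))  # None

m = re.search(pattern, sentence)
if m is not None:
    print(m.group())         # 2018-09  (the whole match)
    print(m.group('year'))   # 2018     (by name)
    print(m.group(2))        # 09       (by position)
    print(m.groups())        # ('2018', '09')
    print(m.groupdict())     # {'year': '2018', 'month': '09'}
```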

Exception Handling

  • Syntax Errors vs. Exceptions

In [1]: while True print('Hello world')                                                                                                                                                                     
  File "<ipython-input-1-2b688bc740d7>", line 1
    while True print('Hello world')
                   ^
SyntaxError: invalid syntax


In [2]: 10 * (1/0)                                                                                                                                                                                          
---------------------------------------------------------------------------
ZeroDivisionError                         Traceback (most recent call last)
<ipython-input-2-0b280f36835c> in <module>
----> 1 10 * (1/0)

ZeroDivisionError: division by zero

Exception Handling

  • try–except block
In [1]: try: 
   ...:     x = int(input("Please enter a number")) 
   ...: except ValueError: 
   ...:     print("Input some number!")                                                                                                                                                                     
Please enter a number 12 

Exception Handling

  • Multiple excepts
In [1]: import sys 
   ...:  
   ...: try: 
   ...:     f = open('myfile.txt') 
   ...:     s = f.readline() 
   ...:     i = int(s.strip()) 
   ...: except IOError as err: 
   ...:     print("I/O error: {0}".format(err)) 
   ...: except ValueError: 
   ...:     print("Could not convert data to an integer.") 
   ...: except: 
   ...:     print("Unexpected error:", sys.exc_info()[0]) 
   ...:     raise                                                                                                                                                                                           
I/O error: [Errno 2] No such file or directory: 'myfile.txt'

Exception Handling

  • try–except–else
    • Better than adding additional code to try block

In [1]: try: 
   ...:     f = open(arg, 'r') 
   ...: except IOError: 
   ...:     print('cannot open', arg) 
   ...: else: 
   ...:     print(arg, 'has', len(f.readlines()), 'lines') 
   ...:     f.close()        

Exception Handling

  • try–except–else–finally–😱😱😱😱
In [1]: def divide(x, y): 
   ...:     try: 
   ...:         result = x / y 
   ...:     except ZeroDivisionError: 
   ...:         print("division by zero!") 
   ...:     else: 
   ...:         print("result is", result) 
   ...:     finally: 
   ...:         print("executing finally clause")      
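
To see which clauses run in which case, here is a variant of divide that appends to a list instead of printing. The log parameter is an addition for this sketch, not part of the original function.

```python
def divide(x, y, log):
    try:
        result = x / y
    except ZeroDivisionError:
        log.append('except')            # only when the division fails
    else:
        log.append('else: %s' % result) # only when no exception occurred
    finally:
        log.append('finally')           # always, whatever happened


ok, bad, unhandled = [], [], []
divide(2, 1, ok)
divide(2, 0, bad)
try:
    divide('2', '1', unhandled)         # TypeError is not caught inside divide
except TypeError:
    unhandled.append('TypeError propagated')

print(ok)         # ['else: 2.0', 'finally']
print(bad)        # ['except', 'finally']
print(unhandled)  # ['finally', 'TypeError propagated']
```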

Raising Exceptions

  • raise keyword

In [1]: raise NameError('HiThere')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-1-72c183edb298> in <module>
----> 1 raise NameError('HiThere')

NameError: HiThere

  • Is it useful?

In [1]: if not UserAccount.objects.filter(id=10).exists():
   ...:     raise Http4

Raising Exceptions

  • raise keyword

In [1]: try: 
   ...:     raise NameError('HiThere') 
   ...: except NameError: 
   ...:     print('An exception flew by!') 
   ...:     raise   
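
One common use of raise is to turn a low-level error into an application-specific one. The sketch below assumes a hypothetical InvalidAgeError class and parse_age function, neither of which comes from the slides.

```python
class InvalidAgeError(ValueError):
    """Hypothetical application-specific error."""


def parse_age(text):
    try:
        age = int(text)
    except ValueError as err:
        # Re-raise as our own type, keeping the original as the cause.
        raise InvalidAgeError('not a number: %r' % text) from err
    if age < 0:
        raise InvalidAgeError('age cannot be negative: %d' % age)
    return age


for value in ['21', 'abc', '-3']:
    try:
        print(parse_age(value))
    except InvalidAgeError as err:
        print('rejected:', err)
# 21
# rejected: not a number: 'abc'
# rejected: age cannot be negative: -3
```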

Python Web Server

  • http.server library

    • HTTP server in one line (🤗🤗🤗)

 

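python3 -m "http.server" 8001

from http.server import HTTPServer, BaseHTTPRequestHandler

def run(server_class=HTTPServer, handler_class=BaseHTTPRequestHandler):
    server_address = ('', 8000)  # '' listens on all interfaces, port 8000
    httpd = server_class(server_address, handler_class)
    httpd.serve_forever()  # blocks and handles requests until interrupted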

Python Web Server

  • Server class
    • handles the communication between client & server
    • HTTPServer (subclass of TCPServer)
  • Handler class
    • processes the request and prepares the response

 

Handler Class

You must give HTTPServer a RequestHandlerClass

class http.server.BaseHTTPRequestHandler(request, client_address, server)
class http.server.SimpleHTTPRequestHandler(request, client_address, server)
class http.server.CGIHTTPRequestHandler(request, client_address, server)

BaseHTTPRequestHandler

  • Handles HTTP requests
  • Subclass it and define do_GET(), do_POST(), etc.
  • urllib.parse

    • urllib.parse.urlparse is your friend

    • urllib.parse.parse_qs is your best friend.
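
Putting these pieces together, a handler might look like the sketch below. GreetingHandler, greeting_for and run are names invented for this example; the port and the "name" query parameter are assumptions too.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


def greeting_for(raw_path):
    """Build a greeting from a request path such as '/hello?name=Ali'."""
    parts = urlparse(raw_path)
    query = parse_qs(parts.query)            # e.g. {'name': ['Ali']}
    name = query.get('name', ['World'])[0]   # first value, or a default
    return 'Hello, %s! You asked for %s\n' % (name, parts.path)


class GreetingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path holds the raw request target, query string included
        body = greeting_for(self.path).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; charset=utf-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port=8000):
    # Blocks forever; call run() manually to try the server.
    HTTPServer(('', port), GreetingHandler).serve_forever()


print(greeting_for('/hello?name=Ali'))  # Hello, Ali! You asked for /hello
```

After calling run(), opening http://localhost:8000/hello?name=Ali in a browser should show the greeting.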

Web Server

import socket

HOST, PORT = '', 8888

listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listen_socket.bind((HOST, PORT))
listen_socket.listen(1)
print('Serving HTTP on port %s ...' % PORT)
while True:
    client_connection, client_address = listen_socket.accept()
    request = client_connection.recv(1024)
    print(request.decode('utf-8'))

    # sendall() needs bytes in Python 3, hence the b prefix
    http_response = b"""\
HTTP/1.1 200 OK

Hello, World!
"""
    client_connection.sendall(http_response)
    client_connection.close()

Web Server

$ telnet localhost 8888
Trying 127.0.0.1 …
Connected to localhost.
GET /hello HTTP/1.1

HTTP/1.1 200 OK
Hello, World!

Frameworks

WSGI

WSGI is not a web server, a Python module, a framework, an API, or even a piece of software (or a cat, though we are not sure about that last one).

WSGI

  • WSGI is an interface specification by which server & application communicate.
  • Defined in PEP 333 and updated for Python 3 in PEP 3333
  • Every application written to the WSGI spec works with any server written to that spec.
def run_application(application):
    """Server code."""
    # This is where an application/framework stores
    # an HTTP status and HTTP response headers for the server
    # to transmit to the client
    headers_set = []
    # Environment dictionary with WSGI/CGI variables
    environ = {}

    def start_response(status, response_headers, exc_info=None):
        headers_set[:] = [status, response_headers]

    # Server invokes the 'application' callable and gets back the
    # response body
    result = application(environ, start_response)
    # Server builds an HTTP response and transmits it to the client
    ...

def app(environ, start_response):
    """A barebones WSGI app."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello world!']  # PEP 3333: the body must be bytes in Python 3

run_application(app)

WSGI Interface

  • environ: everything in os.environ + data about the HTTP request
    • HTTP headers are available as
      HTTP_{headername}, e.g. HTTP_USER_AGENT
    • Some variables related to WSGI, named wsgi.*
  • start_response: a callable which the application calls to set the status code & headers for the response (before the body)
  • return value: an iterable object yielding the response body as bytes.
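
The three parts of the interface can be exercised without a real server. The sketch below uses setup_testing_defaults from the standard wsgiref package to fill in a plausible environ; call_app and the sample header value are assumptions made for the example.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """WSGI app that echoes a few environ entries."""
    method = environ['REQUEST_METHOD']
    path = environ['PATH_INFO']
    agent = environ.get('HTTP_USER_AGENT', 'unknown')  # the User-Agent header
    body = ('%s %s from %s\n' % (method, path, agent)).encode('utf-8')
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8'),
                              ('Content-Length', str(len(body)))])
    return [body]  # an iterable of bytes


def call_app(application, environ):
    """Hypothetical stand-in for the server side of the interface."""
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = response_headers

    body = b''.join(application(environ, start_response))
    return captured['status'], captured['headers'], body


environ = {'PATH_INFO': '/hello', 'HTTP_USER_AGENT': 'demo-client'}
setup_testing_defaults(environ)  # adds REQUEST_METHOD, wsgi.* keys, etc.
status, headers, body = call_app(app, environ)
print(status)  # 200 OK
print(body)    # b'GET /hello from demo-client\n'
```

To serve the same app over HTTP, wsgiref.simple_server.make_server('', 8000, app).serve_forever() is one option from the standard library.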


By Behnam Hatami
