COMP1701-004

fall 2023

lec-19

Almost Nomovember

Marc Schroeder is present in class today and is conducting observations as part of an approved research study. No audio or video recording is being conducted.

 

His fieldnotes may record general observations about the overall nature of classroom activities and discussions. However, these will not include personal or identifying details about individuals or their participation, except for those students who are participating in other aspects of the study.

 

If you have any questions or concerns, then please let me know, and I’ll direct you to the appropriate resource.

A4 Correction

Get the updated files into your Codespace.

Diff tools are your friend.

let's talk about these things today:

 reading from text files

 writing to text files

The 10,000 foot view of today's topic

Sometimes stuff is in a text file and you want it badly. Yes you do.

Sometimes you want to put stuff in a text file. For posterity, or something.

Dear Future Self...

The details of working with text files in various languages can vary quite a bit.

But while the details of working with files in Python will differ from those in Java, C++, JS, PHP, etc., the general steps are going to be reassuringly similar.

reading from text files

method 1
read() the whole file as a string

A very simple thing we can do with smaller text files is just read the whole darn thing into a string, then do whatever we want with that string.

TASK 1

Write a program that reads in somebody's A3 assignment file and then writes out a "sushified" version of that assignment - to another file - with all o's replaced with 🍥.

MAY I HAVE A VOLUNTEER FROM THE AUDIENCE?

Narutomaki - a type of kamaboko

emoji from emojiterra

Every time we read from a file, we will need code that does 3 things:

  1. opens the file in the desired mode
  2. processes the file in some way
  3. closes the file

Let's dig in.

file_reader = open("assign3.py", "r")

1. open the file in the desired mode

open is a function you can just use in Python - no import needed

the first argument to open is a string - the path to the file you want to open

the second arg to open is the mode - here, we want to open for (r)eading

open returns a useful minion - he's "smart", like strings and lists are, so we can "talk" to him with various methods

JP: remember to demo relative file paths by putting the data file in a folder, triple-nested folder (because they have to figure out the double by themselves)

file_reader = open("assign3.py", "r")

assignment_text = file_reader.read()

we start talking to our file_reader minion, by calling a method...

...the read method. It returns EVERYTHING in the text file as one big string...

...which we store in a hopefully well-named variable.

2. process the file

JP: barf out the text to console so they don't think you're a lyin' varmint.

🙋🏻‍♂️❓🙋🏻‍♀️What character is used to represent a new line in a string? How could we check? 
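One way to check, sketched here with a made-up two-line string standing in for the real assignment text: print hides the newline character, while repr shows it.

```python
# A made-up stand-in for text read from a file with two lines.
sample_text = "first line\nsecond line\n"

# print shows the line breaks as actual line breaks...
print(sample_text)

# ...while repr shows the newline character itself, written as \n.
print(repr(sample_text))

# Counting the newline characters is another quick check.
newline_count = sample_text.count("\n")
print(newline_count)
```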

file_reader = open("assign3.py", "r")

assignment_text = file_reader.read()

sushified_text = assignment_text.replace("o", "🍥")

we want the string to replace all o's with 🍥, so we "talk" to the string using...

...the replace method. It returns a NEW string, with all X replaced by Y...

...which we store in a well-named variable.

2. process the file (cont'd)

JP: barf out the replaced text. Oh, and go over that replace/immutable thing, too.
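A small sketch of the replace/immutable thing, using a made-up string instead of the assignment text. Strings can't be changed in place, so replace hands back a new string and leaves the original alone.

```python
original = "hello world"

# replace returns a NEW string; it does not change original.
replaced = original.replace("o", "🍥")

print(original)  # hello world
print(replaced)  # hell🍥 w🍥rld

# Calling replace without storing the result throws the new string away.
original.replace("o", "🍥")
print(original)  # still hello world
```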

X
Y

🙋🏻‍♂️❓🙋🏻‍♀️What was our goal again here? What can we do with our string to make our goal happen?

file_reader = open("assign3.py", "r")

assignment_text = file_reader.read()

sushified_text = assignment_text.replace("o", "🍥")

file_reader.close()

since we're done using the file, we tell our minion to release its grip on the file...

...by calling the close method. Forgetting to do this may be bad news: the file stays open, and on Windows that can stop you from renaming or deleting it.

3. close the file

JP: try renaming the file before closing. - this might work in a Codespace, but will cause you grief in Windows!
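Not the approach used in these slides, but worth knowing about: Python's with statement closes the file for you when the block ends, so there is no close call to forget. A minimal sketch, using a throwaway file (demo.txt is a made-up name) so it runs anywhere:

```python
import os
import tempfile

# Set up a throwaway file so the sketch can run anywhere.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "demo.txt")

with open(file_path, "w", encoding="utf-8") as file_writer:
    file_writer.write("Dear Future Self...\n")

# Leaving the with block closed the file for us.
print(file_writer.closed)

with open(file_path, "r", encoding="utf-8") as file_reader:
    text = file_reader.read()

print(file_reader.closed)
print(text)
```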

...

You're reading this too late, aren't you.

file_reader = open("assign3.py", "r")

assignment_text = file_reader.read()

sushified_text = assignment_text.replace("o", "🍥")

file_reader.close()

Here's what we've got so far...

...but that's only half the job!

We want to get this delicious text into a file now.

JP: any questions?

writing to text files

When writing files, we still need to do these 3 things:

  1. open the file in the desired mode
  2. process the file in some way
  3. close the file
file_writer = open("assign3-sushified.py", "w")

1. open the file in the desired mode

here's the function open again, called with different arguments this time

the first argument is - again - the path to the file you want to open

the second arg to open is the mode - here, we want to open for (w)riting

here we have another minion - he's the same type of thing as before, but since we've created him with "w", we can "talk" to him using some different methods

JP: open up the file

🙋🏻‍♂️❓🙋🏻‍♀️Where will the file be saved? How can we save it in a subfolder? 
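A sketch of one possible answer. A relative path is resolved against the current working directory, and open will not create missing folders for you. The subfolder name "output" and the temporary base folder are assumptions made only so the example runs without leaving a mess.

```python
import os
import tempfile

# Work inside a throwaway folder so the sketch leaves nothing behind
# in your project.
base_folder = tempfile.mkdtemp()

# "output" is a hypothetical subfolder name. open does not create
# missing folders, so we make the folder first.
subfolder = os.path.join(base_folder, "output")
os.makedirs(subfolder, exist_ok=True)

file_path = os.path.join(subfolder, "assign3-sushified.py")
file_writer = open(file_path, "w", encoding="utf-8")
file_writer.write("print('hell🍥')\n")
file_writer.close()

print(os.path.exists(file_path))

# A relative path such as "output/assign3-sushified.py" would be
# resolved against this folder:
print(os.getcwd())
```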

file_writer = open("assign3-sushified.py",
                   "w",
                   encoding="utf-8")

file_writer.write(sushified_text)

2. process the file

we ask our minion to write our string to the file...

...using the write method...

JP: feign surprise at the file contents

...passing the string to be written to the file as an argument

file_writer = open("assign3-sushified.py",
                   "w",
                   encoding="utf-8")

file_writer.write(sushified_text)

file_writer.close()

3. close the file

we ask our minion to close its connection to the file...

...using the close method...
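Why bother with encoding="utf-8"? Without it, open falls back to a platform-dependent default encoding, and some defaults cannot represent 🍥. Here's a rough round-trip sketch, with a made-up file name (sushi.txt) and a made-up string, that writes emoji text and reads it back:

```python
import os
import tempfile

# Throwaway location so the sketch runs anywhere.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "sushi.txt")

sushified_text = "hell🍥 w🍥rld\n"

file_writer = open(file_path, "w", encoding="utf-8")
# write returns the number of characters it wrote.
chars_written = file_writer.write(sushified_text)
file_writer.close()

# Read it back using the same encoding we wrote with.
file_reader = open(file_path, "r", encoding="utf-8")
text_read_back = file_reader.read()
file_reader.close()

print(chars_written)
print(text_read_back == sushified_text)
```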

the whole enchilada

file_reader = open("assign3.py", "r")

assignment_text = file_reader.read()

sushified_text = assignment_text.replace("o", "🍥")

file_reader.close()

file_writer = open("assign3-sushified.py", "w", encoding="utf-8")

file_writer.write(sushified_text)

file_writer.close()

chunking things up into functions

It's nice to keep things separated into functions, each one doing one job.

with wee functions

def file_contents(file_path: str) -> str:
    file_reader = open(file_path, "r")
    text = file_reader.read()
    file_reader.close()

    return text


def write_contents(file_path: str, contents: str) -> None:
    file_writer = open(file_path, "w", encoding="utf-8")

    file_writer.write(contents)

    file_writer.close()


def sushified(text: str) -> str:
    return text.replace("o", "🍥")


def main() -> None:
    assignment_text = file_contents("assign3.py")
    sushified_text = sushified(assignment_text)
    write_contents("assign3-sushified.py", sushified_text)


main()

Notice that our functions are short, expressive, and easy to understand...

...and that our main tells a clear, simple story.
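One payoff of wee functions, sketched with made-up inputs: a function like sushified never touches a file, so we can try it out on plain strings.

```python
def sushified(text: str) -> str:
    return text.replace("o", "🍥")


# No files needed to try this one out.
print(sushified("foo"))

# Text with no lowercase o comes back unchanged.
print(sushified("xyz"))

# replace is case-sensitive, so an uppercase O is left alone.
print(sushified("OK"))
```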

reading from text files

method 2
use a for loop to walk through each line in a file

If the file contains records, we usually want to process it line by line, record by record.

Three common scenarios:

  1. Process all the records and calculate some value as you go.
  2. Turn all/some of the records into a list that you can use for other purposes.
  3. Process individual records, immediately doing something with each result.

Let's try the first thing.

TASK 2

Find the number of traffic incidents in Calgary in 2023 that involved a cyclist.

 

data.calgary.ca has many cool datasets like these traffic-related ones

As a side note, Google's dataset search tool is another good place to find lots of interesting data to play around with.

Westbound 16 Avenue at Deerfoot Trail NE ,Stalled vehicle.  Partially blocking the right lane,6/21/2022 7:31,NE,-114.0266867,51.06748513
11 Avenue and 4 Street SW ,Traffic incident. Blocking multiple lanes,6/21/2022 4:02,SW,-114.0714806,51.04262449
68 Street and Memorial Drive E ,Traffic incident.,6/20/2022 23:53,NE,-113.9355533,51.05247351
...

Let's say our traffic incidents info is in a file called incidents.csv with records that look like this:

CAREFUL! This is a TEXT FILE, so any "numbers" in here are actually strings!

incidents.csv

location info

description

start date/time

city quadrant

longitude

latitude

🙋🏻‍♂️❓🙋🏻‍♀️Looking at the data, what do you suppose the delimiter is?

🙋🏻‍♂️❓🙋🏻‍♀️How do we know whether a cyclist was involved in an incident?

fields
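A sketch of pulling the fields out of one record, assuming the delimiter is a comma and the fields come in the order shown above. The record is one of the sample lines; the variable names are just for illustration. Notice that longitude and latitude start life as strings and need float to become numbers.

```python
record = "68 Street and Memorial Drive E ,Traffic incident.,6/20/2022 23:53,NE,-113.9355533,51.05247351"

# split breaks the string apart at every comma.
fields = record.split(",")

location = fields[0].strip()  # strip removes the trailing space
description = fields[1]
start = fields[2]
quadrant = fields[3]

# These are strings until we convert them.
longitude = float(fields[4])
latitude = float(fields[5])

print(fields)
print(longitude + 1)  # arithmetic works now that it's a float
```

A plain split is fine for records like these, but it would trip over a description that itself contains a comma. Python's csv module exists for those cases.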

We start off by summoning our file reading minion as before:

file_reader = open("incidents.csv", "r")

We could ask our minion to read().

🙋🏻‍♂️❓🙋🏻‍♀️What would the result of that be? Could there be any issues with that?

Instead, let's use our minion to iterate (loop through) every line in the file like so:

file_reader = open("incidents.csv", "r")

for line in file_reader:
    # do something useful with line

Inside the loop, line will refer to one LINE from the file.

Westbound 16 Avenue at Deerfoot Trail NE ,Stalled vehicle.  Partially blocking the right lane,6/21/2022 7:31,NE,-114.0266867,51.06748513
11 Avenue and 4 Street SW ,Traffic incident. Blocking multiple lanes,6/21/2022 4:02,SW,-114.0714806,51.04262449
68 Street and Memorial Drive E ,Traffic incident.,6/20/2022 23:53,NE,-113.9355533,51.05247351

In this file, the first time through the loop, line will be what?

The second time through, it will be what?

🙋🏻‍♂️❓🙋🏻‍♀️Is "line" a good name?
Can you suggest an alternative?
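A sketch you can run to see exactly what line holds each time around. It writes two of the sample records to a throwaway incidents.csv first (the temporary folder is an assumption, so the example doesn't depend on your own files). Watch for the newline character at the end of each line.

```python
import os
import tempfile

# Build a small throwaway incidents.csv from two of the sample records.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "incidents.csv")

file_writer = open(file_path, "w")
file_writer.write("11 Avenue and 4 Street SW ,Traffic incident. Blocking multiple lanes,6/21/2022 4:02,SW,-114.0714806,51.04262449\n")
file_writer.write("68 Street and Memorial Drive E ,Traffic incident.,6/20/2022 23:53,NE,-113.9355533,51.05247351\n")
file_writer.close()

file_reader = open(file_path, "r")

lines_seen = []
for line in file_reader:
    # repr makes the trailing \n visible.
    print(repr(line))
    lines_seen.append(line)

file_reader.close()

# strip removes that trailing newline when we don't want it.
print(repr(lines_seen[0].strip()))
```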

If we're trying to count something, we'll need a variable to track it, right?

file_reader = open("incidents.csv", "r")

num_cyclist_incidents = 0

for line in file_reader:
  # 1) see if line involves a cyclist and...
  # 2) ...bump up our variable if it does

You totally know how to do 1) and 2) mentioned in the comments above!

So get coding.

🙋🏻‍♂️❓🙋🏻‍♀️What's this pattern called?

🙋🏻‍♂️❓🙋🏻‍♀️What was our goal again? (Keep your eyes on the prize.)
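One possible way to fill in those comments, offered as a sketch and not the official solution. It assumes a cyclist incident mentions the word "cyclist" in its description, and that the year can be pulled from the start date/time field. The sample rows below are MADE UP purely so the code can run; they are not real Calgary data.

```python
import os
import tempfile

# MADE-UP sample rows in the same field order as incidents.csv.
sample_rows = [
    "Example Road and 1 Street SW ,Traffic incident involving a cyclist.,3/4/2023 8:15,SW,-114.07,51.04",
    "Example Trail and 2 Avenue NE ,Stalled vehicle.,3/5/2023 9:30,NE,-114.02,51.06",
    "Example Drive and 3 Street NW ,Traffic incident involving a cyclist.,7/9/2022 17:45,NW,-114.10,51.08",
]

folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "incidents.csv")

file_writer = open(file_path, "w")
for row in sample_rows:
    file_writer.write(row + "\n")
file_writer.close()

file_reader = open(file_path, "r")

num_cyclist_incidents = 0

for line in file_reader:
    fields = line.strip().split(",")
    description = fields[1]
    start = fields[2]  # looks like "3/4/2023 8:15"
    year = start.split()[0].split("/")[2]

    # Assumption: "cyclist" in the description means a cyclist was involved.
    if "cyclist" in description.lower() and year == "2023":
        num_cyclist_incidents += 1

file_reader.close()

print(num_cyclist_incidents)
```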

An exercise on your own dime

Turn the code from the previous slide into a well-named function.

What parameters should it have?

Return type?

  1. Process all the records and calculate some value as you go.
  2. Turn all/some of the records into a list that you can use for other purposes.
  3. Process individual records, immediately doing something with each result.

TASK 3

Create a list of all the incidents in the file that took place in the SW. Assume we're only interested in the latitude and longitude of those incidents, so our list should just have those things.

We'll start off the same way, but this time what do we want to 

file_reader = open("some-file.txt", "r")

result = ???

for line in file_reader:
  # 1) if the record is of interest...
  # 2) ...put it in the list we're building

You totally know how to do 1) and 2) mentioned in the comments above!

So get coding.

🙋🏻‍♂️❓🙋🏻‍♀️What's this pattern called?
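A sketch of one way Task 3 could go, not the official solution. It assumes result starts as an empty list and that each entry is a (latitude, longitude) tuple of floats. The three sample records shown earlier are written to a throwaway file so the code can run on its own.

```python
import os
import tempfile

# The three sample records from earlier, written to a throwaway file.
sample_rows = [
    "Westbound 16 Avenue at Deerfoot Trail NE ,Stalled vehicle.  Partially blocking the right lane,6/21/2022 7:31,NE,-114.0266867,51.06748513",
    "11 Avenue and 4 Street SW ,Traffic incident. Blocking multiple lanes,6/21/2022 4:02,SW,-114.0714806,51.04262449",
    "68 Street and Memorial Drive E ,Traffic incident.,6/20/2022 23:53,NE,-113.9355533,51.05247351",
]

folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "incidents.csv")

file_writer = open(file_path, "w")
for row in sample_rows:
    file_writer.write(row + "\n")
file_writer.close()

file_reader = open(file_path, "r")

sw_coordinates = []  # the list we're building

for line in file_reader:
    fields = line.strip().split(",")
    quadrant = fields[3]

    # 1) if the record is of interest...
    if quadrant == "SW":
        longitude = float(fields[4])
        latitude = float(fields[5])
        # 2) ...put it in the list we're building
        sw_coordinates.append((latitude, longitude))

file_reader.close()

print(sw_coordinates)
```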


lec-19

By Jordan Pratt

reading from files | writing to files
