Networking: Layer 5 Session
Session Layer Covers
- Long-Term Connections
- Data line is 1 megabyte/sec (MB/s)
- How can we speed things?
- Compress 8 MegaBytes into 1 MegaByte
- Now we can transmit 8 MB in 1 sec
- Trade network time for the CPU time needed to compress
- Run Length Encoding
- Huffman coding
- Also need a format to store data (deflate)
Run Length Encoding
This is about the simplest compression algorithm.
Imagine you have an image with three-byte colors (white and green) that are:
You could represent that in a shorter fashion with:
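A minimal Python sketch of run-length encoding. The row of white (W) and green (G) pixels is made up for illustration and is not the slide's original image data; the function names are hypothetical.

```python
from itertools import groupby


def rle_encode(pixels):
    """Collapse each run of equal values into a (count, value) pair."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


def rle_decode(pairs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in pairs:
        out.extend([value] * count)
    return out


# Hypothetical row of pixels: 5 white, 3 green, 4 white.
row = ["W"] * 5 + ["G"] * 3 + ["W"] * 4
encoded = rle_encode(row)
print(encoded)  # [(5, 'W'), (3, 'G'), (4, 'W')]
```

Twelve pixels shrink to three pairs. Data without long runs gains nothing, and can even grow, because every value then carries its own count.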
David Huffman invented Huffman coding in 1951 as a final project while taking a class at MIT. Huffman coding builds a binary tree from how often each symbol appears in the data, so that the most frequently used symbols get the shortest bit codes.
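One way to sketch the tree-building step in Python is shown here. It is an illustrative version that works on the characters of a string; the function name and sample text are assumptions, and real compressors store the codes far more compactly.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each symbol in data to a string of 0s and 1s."""
    counts = Counter(data)
    if len(counts) == 1:
        # Only one distinct symbol: give it a one-bit code.
        return {next(iter(counts)): "0"}
    # Heap entries are (frequency, tie_breaker, tree). A tree is either a
    # symbol or a (left, right) tuple. The unique tie_breaker keeps Python
    # from ever comparing two trees directly.
    heap = [(freq, i, sym) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent trees into one.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (left, right)))
        next_id += 1

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


text = "aaaaaaaabbbc"  # made-up sample: 'a' is far more common than 'c'
codes = huffman_codes(text)
total_bits = sum(len(codes[ch]) for ch in text)
print(codes, total_bits)
```

The common symbol 'a' ends up with a one-bit code, while the rarer 'b' and 'c' get two bits each.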
LZ77 / LZSS / Lempel-Ziv
LZ77 was an algorithm invented by Abraham Lempel and Jacob Ziv and published in 1977, which uses a dictionary coder. A dictionary coder has a dictionary of symbols to represent frequently used patterns in the data. LZSS is a later revision of the original LZ77 algorithm.
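The idea of pointing back at earlier data can be sketched in Python as follows. This is a simplified, LZSS-style illustration rather than the published algorithm: the token layout, the 255-byte window, and the minimum match length of 3 are all assumptions chosen for readability.

```python
def lz_compress(data, window=255, min_match=3):
    """Return tokens: ('lit', byte) or ('ref', distance, length)."""
    i, tokens = 0, []
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the recent window for the longest match with data[i:].
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append(("ref", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens


def lz_decompress(tokens):
    """Rebuild the original bytes from the token list."""
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, dist, length = token
            # Copy one byte at a time so a match may overlap itself.
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)


sample = b"abcabcabcabcxyz"  # made-up input with an obvious repeat
tokens = lz_compress(sample)
print(tokens)
```

Here the repeated "abc" pattern collapses into a single back-reference, so the 15 input bytes become 7 tokens.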
Deflate is a data format that divides data into blocks. A block can be stored uncompressed, or compressed with LZ77-style matching followed by Huffman coding.
(Not all data can be compressed, particularly data that is already compressed.)
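A quick way to see this is with Python's built-in zlib module. The sample text is made up, and random bytes stand in here for data that has no patterns left to exploit.

```python
import os
import zlib

# Repetitive text compresses very well.
text = b"the quick brown fox jumps over the lazy dog " * 200
packed_text = zlib.compress(text)

# Random bytes have no patterns, so deflate cannot shrink them.
random_bytes = os.urandom(len(text))
packed_random = zlib.compress(random_bytes)

# Compressing already-compressed output usually gains nothing either.
packed_twice = zlib.compress(packed_text)

print("text:      ", len(text), "->", len(packed_text))
print("random:    ", len(random_bytes), "->", len(packed_random))
print("compressed:", len(packed_text), "->", len(packed_twice))
```

The exact sizes vary from run to run because of the random input, so none are listed here.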
Deflate / PKZIP
- Deflate data streams (usually files) were originally created with an MS-DOS tool called PKZIP, made in 1989.
- The original algorithm was patented.
- A patent-free version of the format was published as RFC 1951 and is used by gzip.
- This is why Windows usually ends files with .zip and other platforms use .gz.
- Deflate is a very common algorithm/format now; many languages have built-in support.
- index.html = 31.6 KB (32,411 bytes)
- index.zip = 8.33 KB (8,537 bytes)
- snowperson.png = 20.5 KB (21,000 bytes)
- snowperson.bmp = 3.26 MB (3,420,438 bytes)
- Deflate: the most commonly used compressed data format today. It uses Huffman coding and the LZ77 algorithm to produce compressed data. Deflate is described in detail in RFC 1951. Confusingly, deflate is also sometimes referred to as an algorithm.
- Zlib: a popular, open-source library of code that produces and reads deflate formatted files. This is not code an end-user can run; it is a library of code other programs can call.
- Gzip: a popular open-source command-line program that compresses and decompresses files using deflate.
- .gz: a common file extension for a single file that has been compressed, usually with gzip.
- For example, myfile.txt would compress to myfile.txt.gz.
- .tar: a file extension for an uncompressed file format that allows multiple files and directories to be stored as one file. It stores not only the data in each file, but also the file name, file date, permissions, and owner.
- For example, all the files in my_documents can be bundled into the file my_documents.tar.
- .tar.gz and .tgz: multiple files first placed into one file with “tar” and then zipped via deflate.
- For example, my_documents.tar.gz.
- .zip: An alternative to the .tar and .gz combination for compressing multiple files together, used by PKZIP.
- While zip still uses deflate, it combines multiple files in a different file format than .tar.
- That is, .tgz and .zip both apply the deflate algorithm, but use different formats to keep separate the individual files and store each file’s permissions, who owns each file, and so on.
- .png: an image file format that uses deflate to compress the image data.
- Some types of compression will lose data to make files even smaller.
- JPEG does this for images, MPEG for movies.
735 KB (753,125 bytes)
187 KB (191,756 bytes)
61.3 KB (62,849 bytes)
Keeping Data Safe
- Encryption on the web
- Authentication and authorization
- Single Sign On
- Keep data safe while going across the network.
- Keep data safe while stored in a file or database.
- Famous data breaches:
- One of the original encryption methods was the Caesar cipher
- Move the letters up or down the alphabet. A becomes B. B becomes C. "Dog" becomes "Eph".
- Used by Caesar to keep messages to his commanders secret.
- This is a symmetric cipher. Reversing the algorithm reveals the original message.
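The shift described above fits in a few lines of Python. This is a sketch only; the function name is made up, and a Caesar cipher offers no real security.

```python
import string


def caesar(text, shift):
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)


secret = caesar("Dog", 1)    # "Eph"
plain = caesar(secret, -1)   # "Dog": the same key, run in reverse
print(secret, plain)
```

Because the same shift value both encrypts and decrypts, the shift is the shared (symmetric) key.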
- Encryption Algorithms have evolved over time.
- Once-common standards like DES are no longer considered secure. Make sure you look for the most current standard.
- The current standard is the Advanced Encryption Standard, AES.
- Symmetric Encryption uses keys
- Keys are sized, usually in powers of two
- An 8-bit key would have 2^8 = 256 possible values
- Easy to guess
- A 32-bit key would have 2^32 = about 4.3 billion values
- A 256-bit key would have 2^256 possible values
- The universe will end before we can guess all of those
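The key-size arithmetic can be checked directly in Python. The guessing rate of one trillion keys per second is an assumption picked purely for illustration.

```python
GUESSES_PER_SECOND = 10**12          # assumed attacker speed, for illustration
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def keyspace(bits):
    """Number of possible keys for a key of the given size."""
    return 2**bits


def years_to_try_all(bits):
    """Years needed to try every key at the assumed guessing rate."""
    return keyspace(bits) / GUESSES_PER_SECOND / SECONDS_PER_YEAR


for bits in (8, 32, 256):
    print(f"{bits}-bit key: {keyspace(bits)} keys, "
          f"{years_to_try_all(bits):.3g} years to try them all")
```

Each extra bit doubles the number of keys, which is why the jump from 32 to 256 bits moves the search from a fraction of a second to a span far longer than the age of the universe.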
- Writing out all the bytes of a key with numbers takes too long:
230 1 32 78 (etc)
- Instead, we write in hexadecimal:
- Each number/letter is 4 bits. 0-9A-F = 16 possible values
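Python can convert between the two notations directly. This sketch reuses the four example byte values from the slide.

```python
# The four example byte values, written as decimal numbers.
key_bytes = bytes([230, 1, 32, 78])

# Two hex digits per byte, since each hex digit covers 4 bits.
hex_text = key_bytes.hex()
print(hex_text)  # e601204e

# Converting back gives the same bytes.
round_trip = bytes.fromhex(hex_text)
print(list(round_trip))  # [230, 1, 32, 78]
```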
- Block Ciphers encode everything in blocks
- AES uses 128 bit blocks (16 bytes).
- Encode something 17 bytes long?
- 2 blocks, the last with 15 bytes of padding
- Cryptographically, it is better to pad with random data
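The 17-byte case can be sketched in Python. Note the assumption: this uses PKCS#7-style padding, where every pad byte holds the pad length, because it is a widely used scheme that is easy to undo. It is not the random padding mentioned above, and the function names are hypothetical.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data, block_size=BLOCK_SIZE):
    """Append n bytes, each with value n, to reach a whole number of blocks."""
    n = block_size - (len(data) % block_size)
    return data + bytes([n]) * n


def unpad(padded):
    """Remove the padding added by pad()."""
    n = padded[-1]
    if n < 1 or n > len(padded) or padded[-n:] != bytes([n]) * n:
        raise ValueError("bad padding")
    return padded[:-n]


message = b"A" * 17
padded = pad(message)
blocks = [padded[i:i + BLOCK_SIZE] for i in range(0, len(padded), BLOCK_SIZE)]
print(len(blocks), "blocks,", len(padded) - len(message), "bytes of padding")
```

A message that is already an exact multiple of 16 bytes still gets a full extra block of padding, so the receiver can always tell padding from data.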
- Issue with symmetric keys:
- Both sides have to have the key.
- So how do you pass the key?
- Asymmetric encryption helps:
- Passing symmetric keys
- Authenticating things like websites. Is this really https://store.example?
- Asymmetric keys have TWO keys.
- Public key
- Private key
- A message encrypted with the public key can only be decrypted by the private key.
- A message encrypted with the private key can only be decrypted by the public key.
- Anyone can use my public key to encrypt a message. Only I can decrypt it.
- This allows any client to create a symmetric key and send it to me securely! Now we have an encrypted link.
- Bonus: I can encrypt a message with the private key, anyone can decrypt it with the public key (so the message isn't secret), but they know it came from me, not an imposter.
- Asymmetric encryption is slower than symmetric encryption.
- Usually we just use it to pass a symmetric key, then switch to use that.
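A toy RSA example in Python shows the public/private relationship. The primes here are tiny so the numbers stay readable, which makes this completely insecure; real keys are thousands of bits long and use padding schemes this sketch omits. All the values are chosen for illustration, and pow() with a negative exponent needs Python 3.8 or later.

```python
# Toy key generation with tiny primes (illustration only, NOT secure).
p, q = 61, 53
n = p * q                   # part of both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent: public key is (e, n)
d = pow(e, -1, phi)         # private exponent: private key is (d, n)

# A client encrypts a (pretend) symmetric key with the public key.
symmetric_key = 65
ciphertext = pow(symmetric_key, e, n)

# Only the private key holder can recover it.
recovered = pow(ciphertext, d, n)

# The reverse direction: "encrypt" with the private key to sign,
# and anyone can check the result with the public key.
message = 123
signature = pow(message, d, n)
verified = pow(signature, e, n) == message

print(ciphertext, recovered, verified)
```

In practice the recovered value would then be used as the key for a faster symmetric cipher such as AES.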
- There are public "trusted" repositories of public keys.
- We use those repositories to make sure that when we connect to https://mybank.example it is really that bank and not a different computer pretending to be it.
- You can see who we trust to say who to trust (ha) by following these directions:
- You can create your own public/private key combo on the command-line with ssh-keygen which we'll do for a project.
- Terminal tools (putty, mobaxterm) often have an option for creating a key pair.
- You can get a free certificate for a website's public/private key pair, so its traffic can be encrypted, at https://letsencrypt.org
- There are multiple asymmetric key algorithms
- Rivest-Shamir-Adleman (RSA) is one of the first, and most widely used, systems for creating and using asymmetric encryption.
- AES - Symmetric
- RSA - Asymmetric
Common Encryption Attacks
- Man-in-the-middle attack
- A system sits in the middle, reads or changes the traffic, and passes it on to the intended destination.
- Replay Attack
- I intercept an encrypted message
- I see the encrypted message turns on a motor
- When I want to turn on the motor, I just replay the encrypted message.
- No need to decrypt
- Can protect against this with a sequence number or time stamp.
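The sequence-number defense can be sketched in Python. Assumptions: the shared secret, command name, and class are made up, and for brevity the message is authenticated with an HMAC rather than encrypted. The same idea applies to encrypted messages.

```python
import hashlib
import hmac

SECRET = b"shared-secret-key"  # hypothetical key known to both sides


def sign(seq, command):
    """Tag covering both the sequence number and the command."""
    msg = seq.to_bytes(8, "big") + command
    return hmac.new(SECRET, msg, hashlib.sha256).digest()


class Receiver:
    """Accepts a message only if it is genuine and newer than the last one."""

    def __init__(self):
        self.last_seq = -1

    def accept(self, seq, command, tag):
        if not hmac.compare_digest(tag, sign(seq, command)):
            return False        # forged or altered message
        if seq <= self.last_seq:
            return False        # already seen: a replay
        self.last_seq = seq
        return True


receiver = Receiver()
tag = sign(1, b"MOTOR_ON")
first = receiver.accept(1, b"MOTOR_ON", tag)    # genuine, accepted
replay = receiver.accept(1, b"MOTOR_ON", tag)   # same bytes again, rejected
print(first, replay)
```

The attacker cannot simply bump the sequence number, because the tag covers it and they do not know the secret.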
- Who is this user?
- Authenticate with a factor:
- Something the user knows (password)
- Something about who the user is (fingerprint)
- Something the user has (key)
- Improve security by having two of these
- (This does NOT mean two passwords, must be different factors.)
- Defeating passwords:
- Common password list
- Use an already compromised list
- Trick the user with a fake site
- Defeating who the user is
- Face recognition
- Sleeping person
- Similar looking person
- Get fingerprint from something touched
- Defeating what the user has
- Copy via a photo or press
- Pick locks
- Read the RFID and copy it
- Phone - SMS or e-mail for password reset
- Ask phone company to transfer number
- Does the user have permission to do this?
- Common web vulnerability:
- System shows IDs of my accounts: 101, 105, 108
- I modify the web page and ask for account 102
- Does the system recheck that I have permission for account 102?
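A server-side recheck might look like the following Python sketch. The account table, user names, and function are hypothetical stand-ins for a real database and login system.

```python
# Hypothetical ownership table; a real system would query a database.
ACCOUNT_OWNERS = {101: "alice", 102: "bob", 105: "alice", 108: "alice"}


def get_account(current_user, account_id):
    """Return account data only if the logged-in user owns the account."""
    owner = ACCOUNT_OWNERS.get(account_id)
    if owner != current_user:
        # Never trust the ID just because the page was built for this user.
        raise PermissionError(f"{current_user} may not view account {account_id}")
    return {"id": account_id, "owner": owner}


print(get_account("alice", 101))
try:
    get_account("alice", 102)   # the ID was edited in the browser
except PermissionError as err:
    print("Denied:", err)
```

The key point is that the check runs on every request, using the identity from the session rather than anything the browser sends.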
Storing Sensitive Data
- If you store an SSN or credit card number, make sure it is stored encrypted.
- I once saw a DB text where they had people store SSN in plain text. It was awful.
- Passwords must not be stored as:
- Clear text
- Passwords must be stored with a cryptographic hash
- A cryptographic hash is a one-way function, unlike encryption, which can be reversed
- Can't "unhash"
- Take a password like 'hi' and hash it to 'a04fb3'
- When logging in, hash their 'hi' entry and see if it matches the stored hash.
- Hashes are vulnerable to a dictionary attack
- We hash EVERYTHING, then reverse look-up the hash to get the password.
- We can avoid this by using a salt
- Salt is randomly generated. Like 'fre'
- Add this to the password 'hi' + 'fre' = 'hifre' -> hash is '445cd'
- Store in database the salt in one column, and the hash in another.
- Can't use a precomputed dictionary attack because the salt is different for each person.
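Salted hashing can be sketched with Python's standard library. The choice of PBKDF2 with SHA-256 and 100,000 iterations is an assumption made for this example, not something the slides specify, and the function names are hypothetical.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed work factor, for illustration


def hash_password(password, salt=None):
    """Return (salt, hash). A fresh random salt is made for new passwords."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def check_password(password, salt, stored_hash):
    """Re-hash the login attempt with the stored salt and compare."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_hash)


# Store both values, e.g. in two database columns.
salt, stored = hash_password("hi")
print(check_password("hi", salt, stored))      # True
print(check_password("guess", salt, stored))   # False
```

Two users who pick the same password get different salts, so their stored hashes differ and one precomputed table cannot crack both.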
- Lightweight Directory Access Protocol
- You can look up user information with this
- It can tell you what groups they are in, such as 'Professor', 'Student', 'Admin'
- Using Python and the LDAP3 library, you can log in and get user info in just a few lines of code. (code is in the book)
Single Sign On
- Terminal Emulation
- Secure Shell (SSH)
- Hyper-Text Transfer Protocol (HTTP)
- Secure Hyper-Text Transfer Protocol (HTTPS)
- Plain text terminal, no encryption
- RFC 15
- Originally made for teletype machines, before computers had screens
- Still useful for quick connections.
- Was used for Internet of Things (IoT) because low-powered computers that ran power networks and valves couldn't do anything more.
- Eventually, these had to be updated as security became more important. (Talk about war-dialing.)
- How do you control what character appears where? How to move the cursor around? Do primitive graphics even? Set text color?
- Encrypted Telnet
- Can also be used for remote windows (run on one computer, window appears on the other)
- Can be used for file transfer
- HTTPS uses the OS's Certificate Authorities (CA) to prevent man-in-the-middle attacks.
- Can add additional CAs
- CA can offer:
- Domain validation - they know you own the domain
- Organization validation - validate you are a real person and organization
- Extended validation - 'verified' user with some more checks. (not really worth much)
- Used to keep a web session alive as the browser connects/disconnects
- Usually consists of a key like "SESSIONID" and a long random value like "ELSKJWRDSOXIINEWR25890902"
- Server can "set" a cookie. After set, browser will send cookie each time.
- Server stores session id, key value, key data in a database.
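The set-then-send-back cycle can be sketched with Python's standard http.cookies module. The in-memory dictionary stands in for the database, and the function names and user name are made up for this example.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # stand-in for the server's session database


def start_session(user):
    """Create a session and return the Set-Cookie header line for it."""
    session_id = secrets.token_hex(16)      # long random value
    sessions[session_id] = {"user": user}
    cookie = SimpleCookie()
    cookie["SESSIONID"] = session_id
    cookie["SESSIONID"]["httponly"] = True  # hide it from page scripts
    cookie["SESSIONID"]["secure"] = True    # only send over HTTPS
    return cookie.output()


def lookup_session(cookie_header):
    """Find the session for the Cookie header the browser sent back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("SESSIONID")
    return sessions.get(morsel.value) if morsel else None


header = start_session("paul")
print(header)

# The browser would send the same value back on each later request.
session_id = header.split("SESSIONID=")[1].split(";")[0]
print(lookup_session("SESSIONID=" + session_id))
```

The cookie itself holds only the random ID; everything the server knows about the user stays on the server side.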
3rd Party Cookies
- trackerwebsite.example wants to track users
- Sets up agreement with multiple websites
- site1.example as part of HTML says an image or script must be loaded from trackerwebsite.example
- site2.example as part of HTML says an image or script must be loaded from trackerwebsite.example
- trackerwebsite.example now knows if a user goes to both websites, because it uses the same cookie and exists on both websites
- Google offers lots of free stuff, so people do this a lot
Networking: Layer 5 Session
By Paul Craven