Cryptography with Mathy Details

Instructor: Andreas Park
 

 

UTM

 

Cryptography

Source: Cambridge Bitcoin Energy Consumption Index https://cbeci.org/

Cryptographic Hashing

  1. What is hashing?
  2. Why do we use it?

Ethereum is full and using it is expensive

Definition

Takes a message/text of arbitrary length and generates a fixed length output or "digest"  

Properties

  1. Deterministic (i.e., not random)
    • the same message always generates the same digest
  2. Fast
    • you don't need much time/many computing cycles to compute a hash
  3. "unpredictable"
    • if two messages are similar, their digests should look very different
  4. not invertible
    • there is no inverse function, i.e., you cannot analytically infer the message from the digest,  nor can an attacker find the message from the digest efficiently by searching (it's VERY hard).
  5. Collision resistant
    • an attacker cannot find two messages that have the same digest in "normal" time.

Simple Application

  • Databases should not store user passwords and usernames in plain text
    • => attacker could immediately impersonate every user
  • Store as a hash: attacker cannot easily invert the username & password
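A minimal sketch of this idea in Python, using only the standard library. The function names `hash_password` and `verify_password` are illustrative, and the random salt and the iteration count are assumptions that go beyond the slide: real systems typically use a salted, deliberately slow password hash rather than a single fast hash call.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumption: illustrative work factor, not a recommendation


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # random salt so equal passwords get different digests
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


salt, digest = hash_password("correct horse")
print(verify_password("correct horse", salt, digest))  # True
print(verify_password("wrong guess", salt, digest))    # False
```

The database only needs `salt` and `digest`; an attacker who steals them still has to guess passwords one at a time.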

What hashing functions are there?

  • Many!
  • MD5
  • SHA1 (better than MD5)
  • SHA256 (better than MD5)
    • output of 256 bits; 4 bits = 1 hexadecimal character => 64 hexadecimal characters
    • developed by the NSA
    • Code, e.g., https://www.movable-type.co.uk/scripts/sha256.html
  • SHA512
  • RIPEMD-160 (160-bit output)

Demo time!

https://andersbrownworth.com/blockchain/

Why are hashing functions used in blockchain?

  1. efficient way to represent data
    • always same-length output
    • => good convention
  2. small changes to data trigger large changes in hash
    • (recall the demonstration)
    • => easy to check consistency 
  3. they work as "pointers"
    • each block contains a hash of the past block
    • this hash is a pointer
    • pointers make searches easy
  4. Hashes of hashes are used to simplify data storage
    • the process of hashing hashes repeatedly creates the "Merkle Tree" 
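The "pointer" and "hashes of hashes" ideas can be sketched in a few lines of Python. This is a simplified illustration, not the data layout of any particular blockchain: the block fields, the all-zero hash for the first block, and the rule of duplicating the last hash on an odd level are assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> str:
    """SHA-256 digest as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_chain(payloads):
    """Each block stores the hash of the previous block (the "pointer")."""
    chain, prev_hash = [], "0" * 64  # assumption: all-zero hash before the first block
    for payload in payloads:
        block = {"prev": prev_hash, "data": payload}
        prev_hash = h((block["prev"] + block["data"]).encode())
        chain.append(block)
    return chain, prev_hash  # prev_hash is now the hash of the last block


def chain_is_consistent(chain, tip_hash):
    """Recompute every pointer; any edit to earlier data breaks the check."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        prev_hash = h((block["prev"] + block["data"]).encode())
    return prev_hash == tip_hash


def merkle_root(items):
    """Hash the items, then hash pairs of hashes until one root remains.

    Assumes a non-empty list of strings.
    """
    level = [h(item.encode()) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash if odd
        level = [h((level[i] + level[i + 1]).encode()) for i in range(0, len(level), 2)]
    return level[0]


chain, tip = build_chain(["a pays b 1", "b pays c 2", "c pays a 3"])
print(chain_is_consistent(chain, tip))   # True
chain[0]["data"] = "a pays b 100"        # tamper with the first block
print(chain_is_consistent(chain, tip))   # False
```

Changing any old block changes its hash, so the pointer stored in the next block no longer matches.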

What hashing functions are used with blockchains?

Problem: Hashes can be cracked!

cracked by "CrackStation"

Encryption

  • Problem: send a message M that no one except the intended recipient should be able to read
  • Basic idea:
    • should be easy to decrypt with the right tool
    • hard to decrypt without it

Alice wants to send Bob money without Charles seeing it

SYMMETRIC: Alice and Bob use the same key to encrypt and decrypt a message

Public key = Private key

ASYMMETRIC: Bob has a public and a private key

Public

Private

Digital Signatures

  • Problem: send message and ensure that the other side believes that you sent this particular message
    • worry about manipulation
    • other side may worry about proving what you did, etc.
    • => want to digitally sign the message
  • As with encryption:
    • should be easy to prove that you signed 
    • hard to forge your signature

Alice's private

Alice wants to send Bob a message and provide proof that it's her

Alice's public

Uses

  • Transaction authorization
  • Governance votes
  • Consensus protocol votes

Types

  • RSA (Rivest-Shamir-Adleman)
    • old school, fast to verify, long keys
    • not used in blockchain
  • Elliptic curve
    • shorter keys for same level of security as RSA
    • Ethereum, Bitcoin
  • BLS (Boneh–Lynn–Shacham)
    • aggregates multiple public keys and messages into one signature!
    • Ethereum 2.0 

Quantum-resistant signatures??? 

Summary

The main cryptographic primitives

  • Collision-resistant hash functions
  • Digital signatures

But it doesn't end here...

zk-SNARKS

  • Used for privacy and to some extent for scaling
  • Covered later in the course

 

Digital Signatures

Crypto Math

Polynomial vs exponential time

Example for "normal" or polynomial time:

  • Take deck of cards with numbers from 1 to 20.
  • Throw it in the air so that the cards land face down.
  • Now sort them from small to large
    • \(\to\) how many steps do you need?
  • Find smallest number:
    • \(\to\) at most 20 steps.
  • Find second-smallest number
    • \(\to\) at most 19 steps.
  • Total steps:
    • \(N+(N-1)+\ldots+2+1=\frac{N(N+1)}{2}\)
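The card example corresponds to a selection sort. The sketch below counts one step per card inspected; the function name `selection_sort_with_count` and that way of counting are choices made for this illustration.

```python
import random


def selection_sort_with_count(cards):
    """Sort by repeatedly finding the smallest remaining card; count cards inspected."""
    cards = list(cards)
    steps = 0
    for start in range(len(cards)):
        smallest = start
        for i in range(start, len(cards)):
            steps += 1  # one "look at a card" step
            if cards[i] < cards[smallest]:
                smallest = i
        # move the smallest remaining card to the front of the unsorted part
        cards[start], cards[smallest] = cards[smallest], cards[start]
    return cards, steps


deck = list(range(1, 21))
random.shuffle(deck)
sorted_deck, steps = selection_sort_with_count(deck)
print(sorted_deck == list(range(1, 21)), steps)  # True 210, i.e. 20 * 21 / 2
```

The step count grows like \(N^2\), which is polynomial in \(N\).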

Example for exponential time:

  • "Travelling salesman"
  • Wants to visit \(N\) towns, and each only once.
  • Which order is best?
  • \(\to\) exponential problem (also "NP-hard")
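For contrast, here is a brute-force sketch of the travelling salesman problem. It assumes towns are points in the plane, that "best" means the shortest round trip back to the starting town, and that we simply try every ordering; the function names are illustrative. Trying every ordering takes \((N-1)!\) attempts, which grows even faster than exponentially. Smarter exact methods exist, but none that is known to run in polynomial time.

```python
import itertools
import math


def shortest_round_trip(towns):
    """Try every ordering of the towns after the first; return (length, route)."""
    start, rest = towns[0], towns[1:]
    best_length, best_route = math.inf, None
    for order in itertools.permutations(rest):
        route = (start, *order, start)
        length = sum(math.dist(a, b) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_length, best_route = length, route
    return best_length, best_route


def orderings_checked(n):
    """Number of orderings the brute-force search inspects for n towns."""
    return math.factorial(n - 1)


square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(shortest_round_trip(square)[0])  # 4.0: walk around the square
print(orderings_checked(20))           # 121645100408832000 orderings for 20 towns
```

Compare this with the 210 steps needed to sort 20 cards.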

What hashing functions are used with blockchains?

  • MD5
    • cac58b5234e1f98b4c956998b8ac2e26
  • SHA1
    • 60D795AC720DEB5B29AB44F3A690A90DDF147D75
  • SHA256 
    • 9EEA6242471F9B3999F21C6FE247679CAB1EAE0B6E8431A3A1A5FAADB27051C8

Examples of "Andreas"

  • MD5
    • bcc9d898264b67515fba62598bdc58c0
  • SHA1
    • DE2737F0E88865DCDF7A33848F664FF807A3208C
  • SHA256 
    • 62D3869E008362B2DD4490D8BF9D4AFC4CED4FF34045709DD1EEAA743CF5C793

Examples of "AnDrEaS"
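Digests like these can be reproduced with Python's standard `hashlib` module. The sketch below assumes the text is encoded as UTF-8 with no trailing newline; `hashlib` prints lowercase hexadecimal, so the output may differ from the slide values in letter case. The helper name `digests` is illustrative.

```python
import hashlib


def digests(text):
    """Return the MD5, SHA1 and SHA256 hex digests of a text."""
    data = text.encode("utf-8")  # assumption: UTF-8 bytes, no trailing newline
    return {
        "MD5": hashlib.md5(data).hexdigest(),
        "SHA1": hashlib.sha1(data).hexdigest(),
        "SHA256": hashlib.sha256(data).hexdigest(),
    }


for text in ("Andreas", "AnDrEaS"):
    print(text)
    for name, value in digests(text).items():
        print("  ", name, value)
```

Changing only the capitalization produces completely different digests, while hashing the same text twice always gives the same result.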

Cryptography

  • Foundation: let's understand the secure sending of information
  • Problem: send a message M that no one except the intended recipient should be able to read
  • Basic idea (just as with hashing):
    • should be easy to decrypt with the right tools
    • hard to decrypt without them

Some formalism

  • message M 
  • public key P
  • private key S (for "safe")
  • cipher text C
  • Two functions:
    • encode message: enc(M,P)=p(M) 
    • decode cipher: dec(C,S)=s(C)

Symmetric Encryption: Example

Letter number encoding

 
Letter   n    n+3 mod 26
A        0    3
B        1    4
C        2    5
...
W        22   25
X        23   0
Y        24   1
Z        25   2

Rule: \(n~\to~n+3 \text{ mod }26\)
  • Codeword:
    • management
  • converted to numbers
    • 12 0 13 0 6 4 12 4 13 19
  • cipher
    • 15 3 16 3 9 7 15 7 16 22
  • modulo operation:
    • "the remainder"
    • example:
      • \(25/3=8 \frac{1}{3}\)
      • or \(25 \text{ mod } 3 = 1\)
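The shift cipher above fits in a few lines of Python. The function names are illustrative; the shift of 3 plays the role of the shared secret key.

```python
SHIFT = 3  # the shared secret key: both sides must know it


def to_numbers(word):
    """Map a..z to 0..25."""
    return [ord(ch) - ord("a") for ch in word.lower()]


def to_letters(numbers):
    """Map 0..25 back to a..z."""
    return "".join(chr(n + ord("a")) for n in numbers)


def encrypt(numbers, shift=SHIFT):
    return [(n + shift) % 26 for n in numbers]


def decrypt(numbers, shift=SHIFT):
    return [(n - shift) % 26 for n in numbers]


plain = to_numbers("management")
cipher = encrypt(plain)
print(plain)                        # [12, 0, 13, 0, 6, 4, 12, 4, 13, 19]
print(cipher)                       # [15, 3, 16, 3, 9, 7, 15, 7, 16, 22]
print(to_letters(cipher))           # pdqdjhphqw
print(to_letters(decrypt(cipher)))  # management
```

Anyone who knows the shift can both encrypt and decrypt, which is what makes the scheme symmetric.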

Digital Signatures

Formally

  • Components:
    • Message M
    • Signature or Tag T
    • Public & private keys P & S
  • Two functions
    • Sign(M,S)
    • Check(M,T,P)

required property

if S applied to M created T, i.e., T=Sign(M,S), then Check(M,T,P)=1

Alice wants to send Bob a message and provide proof that it's her.

Sa

Pa

formally: computes T=Sign(M,Sa)

formally: computes Check(M,T,Pa)

Concrete Application of Signatures: RSA Asymmetric Key Cipher

  • Based on Rivest-Shamir-Adleman algorithm
  • widely used, e.g. for online banking etc
  • not exactly used in Blockchains (they use "elliptic curve" algorithms),
  • but the idea is similar and RSA is easier to explain
  • Warning: this will be the mathiest and geekiest part of the class
  • I'll do it by example.

background math (specifically, number theory)

  • terminology: numbers \(a\) and \(b\) are prime relative to one another (coprime) if their greatest common divisor is 1.
  • goes back to prime factorization \(\to\) expressing a number as the product of primes.
  • example: 
    • \(2^3\cdot3^2=72,~5^2\cdot7=175\)
    • 72 and 175 have no common factors
      \(\to\) 72 is prime to 175
  • Euler's \(\phi\) function for a number \(n\) is the number of natural numbers from 1 to \(n\) that are prime relative to \(n\)

\(n\)    \(\phi(n)\)
1      1
5      4
10     4
12     4
14     6
15     8
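The values of Euler's \(\phi\) function can be checked by brute force. This sketch simply counts; it is fine for small numbers but far too slow for the key sizes used in practice.

```python
from math import gcd


def phi(n):
    """Euler's phi: how many k in 1..n have gcd(k, n) == 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


for n in (1, 5, 10, 12, 14, 15):
    print(n, phi(n))  # prints 1 1, 5 4, 10 4, 12 4, 14 6, 15 8

print(gcd(72, 175))  # 1, so 72 is prime to 175
```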

background math (specifically, number theory)

  • Some extra results.
    • For a prime number \(n\) we have \(\phi(n)=n-1\)
    • For two distinct prime numbers \(n,m\) we have
      \(\phi(n\cdot m)=(n-1)(m-1).\)
  • When \(n\) and \(m\) are prime to one another then there exist integers \(x\) and \(y\) such that
    • \(x\cdot m+y\cdot n=1\)
  • Let's do this by example (we use "long division", i.e., the Euclidean algorithm) for \(m=175\) and \(n=72\)

background math (specifically, number theory)

\begin{array}{rcll} 175&=&2\cdot 72+31&\text{(quotient 2, remainder 31)}\\ 72&=&2\cdot 31+10&\text{(quotient 2, remainder 10)}\\ 31&=&3 \cdot 10+1&\text{(quotient 3, remainder 1)} \end{array}

and now going backward:

\begin{array}{rcl} 1&=&31-3\cdot 10\\ &=& 31-3\cdot(72-2\cdot 31 )\\ &=& 31-3\cdot72+3\cdot 2\cdot 31 \\ &=& 7\cdot 31-3\cdot72\\ &=& 7\cdot (175-2\cdot72)-3\cdot72\\ &=& 7\cdot 175-7\cdot2\cdot72-3\cdot72\\ &=& 7\cdot 175-17\cdot72 \end{array}

so we have \(x=7\) and \(y=-17\) so that

\(1=x\cdot m+y\cdot n=7\cdot 175+(-17)\cdot 72\)
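The forward and backward passes above are the extended Euclidean algorithm. A compact recursive sketch (the name `extended_gcd` is illustrative):

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and x*a + y*b == g."""
    if b == 0:
        return a, 1, 0
    # divide with remainder, then solve the smaller problem for (b, a mod b)
    g, x, y = extended_gcd(b, a % b)
    # back-substitute: a mod b = a - (a // b) * b
    return g, y, x - (a // b) * y


g, x, y = extended_gcd(175, 72)
print(g, x, y)           # 1 7 -17
print(x * 175 + y * 72)  # 1
```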

background math (specifically, number theory)

Ingredients for encryption

  • pick \(n\) 
    • not just any number: we pick two prime numbers \(q\) and \(p\) and set \(n=qp\)
  • compute \(\phi(n)\)
    • with primes, \(\phi(n)=\phi(q)\cdot\phi(p)=(q-1)\cdot (p-1).\)
  • Pick e that is prime relative to \(\phi(n)\) 
  • Find \(x,y\) such that \(x\cdot e+y\cdot\phi(n)=1\)

What do we want to do?

  1. Take a word with letters \(l_1,\ldots,l_m\)
  2. Convert letters to numbers \(w_1,\ldots,w_m\)
  3. Encode the numbers \(w_i\to\tilde{w}_i=w_i^e\text{ mod } n\)
  4. Prove that they can be decoded uniquely.

What do we send?

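  1. The encoded numbers \(\tilde{w}_1,\ldots,\tilde{w}_m\)
  2. The numbers \(n, e\)
  • An outsider who wants to decrypt needs \(x\), and computing \(x\) requires \(\phi(n)\), i.e., the prime factors \(p\) and \(q\) of \(n\).
  • Why is that hard? No efficient (polynomial-time) method for factoring large numbers on a classical computer is known. Strictly speaking, factoring is believed to be hard but is not known to be NP-hard.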

How do you decrypt?

Calculate \({\tilde{w}_i}^x\text{ mod } n.\)

Now the RSA components

  1. pick two large prime numbers, \(p\) and \(q\) (small ones here, to keep the example readable)
    • \(p=53\)
    • \(q=59\)
  2. Compute \(n=pq\)
    • \(53\cdot59=3127=n\)
  3. Find Euler's Phi for \(n\)
    • \(\phi(3127)=(53-1) (59-1)=3016\)
  4. Select a small number \(e\) that is prime to \(\phi(n)\) 
    • \(e=3\) (3 is prime and 3016 mod 3 = 1, so 3 does not divide 3016)
  5. Find the \(x,y\) s.t. \(xe+y\phi(n)=1\):
    • \(2011\cdot 3-2\cdot 3016=1\), so \(x=2011\) and \(y=-2\)

Private key Sa: S=(x,n)=(2011,3127)

Public key Pa: P=(e,n)=(3,3127)
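The key generation steps can be reproduced in Python. This is a toy sketch with the slide's tiny primes; it assumes Python 3.8 or later, where the built-in `pow` accepts exponent `-1` to compute a modular inverse. Real keys use primes that are hundreds of digits long.

```python
from math import gcd

p, q = 53, 59
n = p * q                  # 3127
phi_n = (p - 1) * (q - 1)  # 3016
e = 3
assert gcd(e, phi_n) == 1  # e must be prime relative to phi(n)

x = pow(e, -1, phi_n)      # modular inverse: x * e = 1 mod phi(n)
y = (1 - x * e) // phi_n   # the matching y in x*e + y*phi(n) = 1

public_key = (e, n)
private_key = (x, n)
print(public_key)   # (3, 3127)
print(private_key)  # (2011, 3127)
print(y)            # -2
```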

Now the RSA components

for greater satisfaction, the formal argument that decrypting recovers the original \(w\)

\begin{array}{rcl} {\tilde{w}}^x\text{ mod }n&=&(w^e)^x\text{ mod }n\\ &=&w^{ex}\text{ mod }n\\ &=&w^{1-y\phi(n)}\text{ mod }n\\ &=&w\left(w^{\phi(n)}\right)^{-y}\text{ mod }n\\ &=&w1^{-y}\text{ mod }n\\ &=&w\text{ mod }n\\ &=&w. \end{array}

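(the step from \(w\left(w^{\phi(n)}\right)^{-y}\) to \(w1^{-y}\) uses Euler's theorem: \(w^{\phi(n)}\text{ mod }n=1\) when \(w\) is prime relative to \(n\); the last step assumes \(w<n\), i.e., \(n\) is large and \(w\) small)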

Application of RSA to sending an encrypted message

  • keys (from the example above)
    • private key Sa: S=(x,n)=(2011,3127)
    • public key Pa: P=(e,n)=(3,3127)
  • suppose message is M
    • M=89
  • encrypting is
    • Enc(M,P)=P(M)=\(M^e \text{ mod } n\)
    • \(89^3 \text{ mod } 3127=1394=C\)
  • decrypting is
    • Dec(C,S)=S(C)=\(C^x \text{ mod } n\)
    • \(1394^{2011} \text{ mod } 3127=89\)
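The encryption example can be checked with Python's three-argument `pow`, which computes modular powers efficiently. This is "textbook" RSA on a single small number, for illustration only; the function names are not from any library, and real implementations add padding and use far larger keys.

```python
def encrypt(message, public_key):
    """Enc(M, P) = M^e mod n."""
    e, n = public_key
    return pow(message, e, n)


def decrypt(cipher, private_key):
    """Dec(C, S) = C^x mod n."""
    x, n = private_key
    return pow(cipher, x, n)


public_key = (3, 3127)
private_key = (2011, 3127)

cipher = encrypt(89, public_key)
print(cipher)                        # 1394
print(decrypt(cipher, private_key))  # 89
```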

Application of RSA to sending digital signatures

Example parameters

  1. Choose p=53, q=59
    • n=3127
    • phi(n)=3016
  2. Choose e=3
    • pick x=2011
  3. P=(3,3127), S=(2011,3127)
  4. Message M=89
  • Signature:
    • \(\text{sig}=M^x \text{ mod } n=89^{2011} \text{ mod } 3127=545\)
  • Validation:
    • \(\text{check}=\text{sig}^e \text{ mod } n=545^{3} \text{ mod } 3127=89=M\)
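Signing swaps the roles of the two keys: the private key creates the tag and the public key checks it. A toy sketch with the same numbers follows; the function names mirror Sign and Check from the formal definition and are not a library API. In practice one signs a hash of the message and adds padding.

```python
def sign(message, private_key):
    """Sign(M, S) = M^x mod n, computed with the private key."""
    x, n = private_key
    return pow(message, x, n)


def check(message, tag, public_key):
    """Check(M, T, P): does T^e mod n give back M?"""
    e, n = public_key
    return pow(tag, e, n) == message


public_key = (3, 3127)
private_key = (2011, 3127)

tag = sign(89, private_key)
print(tag)                         # 545
print(check(89, tag, public_key))  # True
print(check(90, tag, public_key))  # False: the tag does not fit a changed message
```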

Putting it together

  • Hashing is used widely and provides convenient outputs of fixed length => concise representation of information
  • Public-private key signatures ensure that you can prove ownership and that there is security
  • Encrypting messages is a lesser concern for blockchains => info on transaction is supposed to be out in the open
  • But: all the theory used numbers, not letters. How does that go together?
  • Answer: you can represent any text as a number using ASCII code!
    • Long Text => H(Long text)
    • H(Long text) = fixed length text
    • => ASCII(H(text)) = number of fixed length.
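A minimal sketch of turning text into a fixed-size number. It assumes SHA-256 as the hash H and reads the 32 digest bytes directly as one big integer, which is a common shortcut instead of going through the ASCII codes of the hex characters. The function name `text_to_number` is illustrative.

```python
import hashlib


def text_to_number(text):
    """Hash the text, then read the 32 digest bytes as a single integer."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


short = text_to_number("Alice pays Bob 1 BTC")
long_ = text_to_number("Alice pays Bob 1 BTC. " * 1000)
print(short.bit_length() <= 256, long_.bit_length() <= 256)  # True True
```

Whatever the length of the text, the resulting number is below \(2^{256}\), so it can be fed into the signature formulas, provided the modulus \(n\) is larger than that.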

Cryptography with math

By Andreas Park

This is the slide deck that I use for a quick introduction to the Decentralized Finance class.
