Blockchain
Technology

Andreas Park

Part 2: Drilling down

Part 1: 30,000 ft overview

  • Motivation and background
  • Blockchain architecture
  • basic functionality
  • demo of workflow in a blockchain

Part 2: Drilling down

  • components of blockchain transactions
  • keys, signatures, and addresses
  • hashing
  • consensus protocols
  • mining

Part 3: Moving along

  • blockchain privacy & zero knowledge
  • scaling solutions
  • Proof of stake vs proof of work
  • alternative approaches: EOS and Cardano, Linux Hyperledger, Corda

Part 4: Smart contracts

  • ERC20 tokens as a new finance tool
  • basic solidity demo

Schedule of Topics

Objectives

  • Understanding how value transfers work today
  • Demo of a blockchain transfer
  • Blockchain as a database
  • Workflow and block/blockchain architecture
  • Demo of workflow 
  • Identifying what areas we need to drill down in

Last Lecture: 30,000 ft Overview

Post Demonstration:

What do we know and

What do we need to drill down on?

  1. Block = text
  2. Blocks are linked by hashes => What is a hash?
  3. Transactions are associated with addresses => What is an address?
  4. Transactions are signed => How does signing work?
  5. Block validation requires a "nonce" => What is that all about?

Objectives

  • components of blockchain transactions
  • keys, signatures, and addresses
  • hashing
  • consensus protocols
  • mining
  • A warning: this lecture is going to be a little mathy

Part 2:

Drilling Down

A close look at a real Ethereum transaction

A close look at a real Bitcoin transaction

A close look at an Ethereum block

A close look at a Bitcoin block

Hashing

  1. What is hashing?
  2. Why do we use hashing?

Definition

  • M: a message/text of arbitrary length
  • h(M): a fixed length output or "digest"

What is cryptographic hashing?


Properties

  1. Deterministic (i.e., not random)
    • the same message always generates the same digest
  2. Fast
    • you don't need much time/many computing cycles to compute a hash
  3. "unpredictable"
    • if two messages M and M' are similar, their digests should look very different
  4. not invertible
    • there is no inverse function, i.e., there is no functional form h^-1 that lets one infer M from h(M), nor can an attacker find M from h(M) in "normal" (polynomial) time.
  5. Collision resistant
    • an attacker cannot find two messages M and M' such that h(M)=h(M') in "normal" time.
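
The first three properties can be observed directly. The following is a minimal sketch using SHA-256 from Python's standard `hashlib`; the helper name `sha256_hex` and the two test messages are illustrative choices, not part of the lecture.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as 64 hex characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


# Deterministic: hashing the same message twice gives the same digest.
first = sha256_hex("Andreas")
second = sha256_hex("Andreas")

# "Unpredictable": a similar message gives a digest that looks unrelated.
similar = sha256_hex("AnDrEaS")

# Count the hex positions in which the two digests differ.
differing_positions = sum(a != b for a, b in zip(first, similar))

print("same message, same digest:", first == second)
print("digest length in hex characters:", len(first))
print("positions that differ for the similar message:", differing_positions)
```

Non-invertibility and collision resistance cannot be demonstrated by running code; they are statements about what no known attack achieves in reasonable time.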

What is cryptographic hashing?

Simple Application

  • Databases should not store user passwords and usernames in plain text
    • => attacker could immediately impersonate every user
  • Store as a hash: attacker cannot invert the username & password

Nerdy stuff:

  • read up on the P vs NP problem
  • Idea: If a solution to a problem is easy to check, is the problem easy to solve?
    • We don't know the answer yet.

What hashing functions are there?

  • Many!
  • MD5
  • SHA1 (better than MD5)
  • SHA256 (better than MD5)
    • output of 256 bits; 4 bits = 1 hexadecimal character => 64 characters
    • developed by the NSA
    • Code, e.g., https://www.movable-type.co.uk/scripts/sha256.html
  • SHA512
  • RIPEMD-160 ("160" for the 160-bit output)
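
To see the fixed output lengths, here is a small sketch with Python's standard `hashlib`. The helper `digest_bits` is a hypothetical name introduced for illustration. RIPEMD-160 is left out only because it is not available in every Python build.

```python
import hashlib


def digest_bits(algorithm: str, message: bytes = b"Andreas") -> int:
    """Return the digest length in bits for the named hash function."""
    return len(hashlib.new(algorithm, message).digest()) * 8


for name in ["md5", "sha1", "sha256", "sha512"]:
    bits = digest_bits(name)
    # 4 bits correspond to one hexadecimal character.
    print(f"{name}: {bits} bits = {bits // 4} hex characters")
```

The output length depends only on the function, never on the length of the message.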

What hashing functions are used with blockchains?

Examples of "Andreas"

  • MD5
    • cac58b5234e1f98b4c956998b8ac2e26
  • SHA1
    • 60D795AC720DEB5B29AB44F3A690A90DDF147D75
  • SHA256
    • 9EEA6242471F9B3999F21C6FE247679CAB1EAE0B6E8431A3A1A5FAADB27051C8

Examples of "AnDrEaS"

  • MD5
    • bcc9d898264b67515fba62598bdc58c0
  • SHA1
    • DE2737F0E88865DCDF7A33848F664FF807A3208C
  • SHA256
    • 62D3869E008362B2DD4490D8BF9D4AFC4CED4FF34045709DD1EEAA743CF5C793

What hashing functions are used with blockchains?

Problem: Hashes can be cracked!

cracked by "CrackStation"
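
"Cracking" here does not mean inverting the hash function. Services of this kind typically look the digest up in a huge precomputed table of common inputs. The sketch below illustrates the idea with a tiny, made-up word list; the names `candidates`, `lookup_table` and `crack` are hypothetical.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a text as lowercase hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical word list; real lookup services hold vastly more entries.
candidates = ["password", "123456", "Andreas", "letmein"]

# Precompute digest -> plaintext once; every later lookup is a dictionary hit.
lookup_table = {sha256_hex(word): word for word in candidates}


def crack(digest: str):
    """Return the plaintext if its digest is in the table, else None."""
    return lookup_table.get(digest.lower())


stolen_digest = sha256_hex("Andreas")
print(crack(stolen_digest))                                # found in the table
print(crack(sha256_hex("a long and unusual passphrase")))  # not in the table
```

Short or common inputs are exposed this way; inputs that nobody has hashed in advance are not.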

Why are hashing functions used with blockchains?

  1. efficient way to represent data
    • always same-length output
    • => good convention
  2. small changes to data trigger large changes in hash
    • (recall the demonstration)
    • => easy to check consistency 
  3. they work as "pointers"
    • each block contains a hash of the past block
    • this hash is a pointer
    • pointers make searches easy
  4. Hashes of hashes are used to simplify data storage
    • the process of hashing hashes repeatedly creates the "Merkle Tree" 

Merkle Tree

Repeated Hashing to produce concise transaction digest

Source: https://github.com/cliftonm/MerkleTree

Idea

  1. Hash all transactions
  2. Concatenate hashes of "neighbors" and hash them
  3. repeat
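
The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the construction of any particular blockchain: it uses single SHA-256, and duplicating the last hash when a level has an odd number of entries is an assumption made here to keep the pairing well defined.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(transactions: list[bytes]) -> bytes:
    """Hash all transactions, then hash neighbouring pairs until one hash is left."""
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [h(tx) for tx in transactions]          # step 1: hash all transactions
    while len(level) > 1:                           # step 3: repeat
        if len(level) % 2 == 1:
            level.append(level[-1])                 # assumption: duplicate the odd one out
        level = [h(level[i] + level[i + 1])         # step 2: concatenate neighbours, hash
                 for i in range(0, len(level), 2)]
    return level[0]


txs = [b"Alice pays Bob 1", b"Bob pays Carol 2", b"Carol pays Dan 3", b"Dan pays Eve 4"]
print(merkle_root(txs).hex())
```

Changing any single transaction changes the root, which is why the root serves as a concise digest of the whole set.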

[Figure: Merkle tree, labelled with the Merkle root, the branches, and the leaves]

Merkle Tree

Why convenient?

Source: https://github.com/cliftonm/MerkleTree

  • fast way to check if a transaction T is in a group
  • fast way to check if an authority is legitimate


Example: Merkle Tree

https://www.codeproject.com/Articles/1176140/Understanding-Merkle-Trees-Why-use-them-who-uses-t

your record

assume you know the hash of the root

Example: Merkle Tree

  1. Ask the system to verify that your record 2 is part of the tree
  2. System returns hashes of 3, 01, and 4567
  3. You determine
    • H(2)+H(3) => H(23)
    • H(01)+H(23) => H(0123)
    • H(0123)+H(4567) => H(01234567)
    • this is the root; check it against the H(01234567) that you already know
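
The check for record 2 can be written out as follows. This is a sketch under stated assumptions: single SHA-256, eight made-up records, and a proof format (a list of `(side, sibling_hash)` pairs) that is invented here for illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).digest()


def verify_proof(record: bytes, proof: list[tuple[str, bytes]], root: bytes) -> bool:
    """Recompute the path from a record to the root using the sibling hashes."""
    current = h(record)
    for side, sibling in proof:
        if side == "left":
            current = h(sibling + current)   # sibling sits to the left
        else:
            current = h(current + sibling)   # sibling sits to the right
    return current == root


# Build the eight-record tree from the example (records 0 to 7).
records = [f"record {i}".encode() for i in range(8)]
leaves = [h(r) for r in records]
h01, h23 = h(leaves[0] + leaves[1]), h(leaves[2] + leaves[3])
h45, h67 = h(leaves[4] + leaves[5]), h(leaves[6] + leaves[7])
h0123, h4567 = h(h01 + h23), h(h45 + h67)
root = h(h0123 + h4567)

# The system returns the hashes of 3, 01, and 4567.
proof_for_2 = [("right", leaves[3]), ("left", h01), ("right", h4567)]
print(verify_proof(records[2], proof_for_2, root))
```

Only three hashes are needed to verify one record out of eight; the number of hashes grows with the depth of the tree, not with the number of records.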

Cryptography

  • Foundation: let's understand the secure sending of information
  • Problem: send message M that you want no-one to be able to read 
  • Basic idea (just as with hashing):
    • should be easy to decrypt with the right tools
    • hard to decrypt without them

Some formalism

  • message M 
  • public key P
  • private key S (for "safe")
  • cipher text C
  • Two functions:
    • encode message: Enc(M,P)=P(M)=C
    • decode cipher: Dec(C,S)=S(C)=M

Alice wants to send Bob money without Charles seeing it

Symmetric Encryption: Bob and Alice use the same key to encrypt and decrypt a message

Formally: public key P = private key S

Asymmetric Encryption: Bob has a public and a private key, (Pb, Sb)

[Figure: asymmetric encryption between Alice and Bob, labelled with Bob's public key Pb and private key Sb]

Digital Signatures

  • Problem: send message M and ensure that the other side believes that you sent this particular message
    • worry about manipulation
    • other side may worry about proving what you did, etc.
    • => want to digitally sign the message
  • Again:
    • should be easy to prove that you signed 
    • hard to forge your signature

Digital Signatures

Formally

  • Components:
    • Message M
    • Signature or Tag T
    • Public & private keys P & S
  • Two functions
    • Sign(M,S)
    • Check(M,T,P)

required property

if S applied to M created T, i.e., T=Sign(M,S), then Check(M,T,P)=1

Alice wants to send Bob a message and provide proof that it's her.

[Figure: Alice signs with her private key Sa; Bob checks with Alice's public key Pa]

formally: Alice computes T=Sign(M,Sa)

formally: Bob computes Check(M,T,Pa)

This is a link to another of Anders Brownworth's demo videos

Concrete Application of Signatures: RSA Asymmetric Key Cipher

  • Based on Rivest-Shamir-Adleman algorithm
  • widely used, e.g. for online banking etc
  • not exactly used in Blockchains (they use "elliptic curve" algorithms),
  • but the idea is similar and RSA is easier to explain
  • Warning: this will be the mathiest and geekiest part of the class
  • I'll do it by example.

background math (specifically, number theory)

  • terminology: a is prime relative to b if their greatest common divisor is 1
  • Euler's Phi function for a number n, written phi(n), is the number of natural numbers up to n that are prime relative to n
  • for distinct prime numbers m, n:

\phi(n)=n-1 \text{ and } \phi(m\cdot n)=\phi(n)\cdot\phi(m)=(n-1)(m-1)

  n        1   5   10   12   14   15
  \phi(n)  1   4   4    4    6    8

A task that'll come up

  • pick n
  • compute phi(n)
  • pick e that is prime relative to phi(n)
  • find d such that

\frac{d\cdot e -1}{\phi(n)}=\text{an integer}

trick: pick k and then find d s.t.

e\cdot d= 1+k \cdot \phi(n)
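
Euler's Phi function and the search for d can be checked by brute force for small numbers. The sketch below is for illustration only; the function names `phi` and `find_d` are made up here, and counting with `gcd` is far too slow for the number sizes used in practice.

```python
from math import gcd


def phi(n: int) -> int:
    """Count the natural numbers up to n that are prime relative to n."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def find_d(e: int, phi_n: int) -> tuple[int, int]:
    """Try k = 1, 2, ... until 1 + k*phi_n is divisible by e; return (d, k)."""
    if gcd(e, phi_n) != 1:
        raise ValueError("e must be prime relative to phi(n)")
    k = 1
    while (1 + k * phi_n) % e != 0:
        k += 1
    return (1 + k * phi_n) // e, k


for n in [1, 5, 10, 12, 14, 15]:
    print(n, phi(n))

# For two distinct primes, phi(m*n) = (m-1)(n-1).
print(phi(53 * 59), (53 - 1) * (59 - 1))

# The trick: find k such that d = (1 + k*phi(n)) / e is an integer.
print(find_d(3, 3016))
```

The loop in `find_d` always stops within e steps, because e is prime relative to phi(n).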

Now the RSA components

  1. pick two large prime numbers, p and q (small ones here, to keep the example readable)
    • p=53
    • q=59
  2. Compute n=pq
    • 53x59=3127=n
  3. Find Euler's Phi for n
    • \phi(3127)=(53-1)(59-1)=3016
  4. Select a small number e that is prime relative to phi(n)
    • e=3 (3016 mod 3=1, so 3 does not divide 3016)
  5. Find the multiplicative inverse d of e mod phi(n)
    • for k=2, d=(2x3016+1)/3=2011

Private key Sa: S=(d,n)=(2011,3127)

Public key Pa: P=(e,n)=(3,3127)
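
The key generation steps with p=53 and q=59 can be reproduced as a short sketch. It assumes Python 3.8 or later, where the built-in `pow(e, -1, m)` returns the multiplicative inverse of e modulo m. The variable names are illustrative.

```python
from math import gcd

# Step 1: two primes (tiny here; real keys use primes with hundreds of digits).
p, q = 53, 59

# Step 2: the modulus.
n = p * q

# Step 3: Euler's Phi for a product of two distinct primes.
phi_n = (p - 1) * (q - 1)

# Step 4: a small e that is prime relative to phi(n).
e = 3
if gcd(e, phi_n) != 1:
    raise ValueError("e must be prime relative to phi(n)")

# Step 5: the multiplicative inverse of e modulo phi(n).
d = pow(e, -1, phi_n)

public_key = (e, n)
private_key = (d, n)
print("n =", n, " phi(n) =", phi_n)
print("P =", public_key, " S =", private_key)
```

The built-in inverse returns the smallest positive d, which here coincides with the k=2 solution.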

Application of RSA to sending an encrypted message

  • suppose message is M
    • M=89
  • encrypting is
    • Enc(M,P)=P(M)=M^e \text{ mod } n
    • 89^3 \text{ mod } 3127=1394=C
  • decrypting is
    • Dec(C,S)=S(C)=C^d \text{ mod } n
    • 1394^{2011} \text{ mod } 3127=89

Private key Sa: S=(d,n)=(2011,3127)

Public key Pa: P=(e,n)=(3,3127)
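
The encryption example can be replayed with Python's three-argument `pow`, which computes modular powers without ever forming the huge intermediate number. The function names `encrypt` and `decrypt` are illustrative. This is textbook RSA on a toy key, not something to use for real data.

```python
def encrypt(message: int, public_key: tuple[int, int]) -> int:
    """Enc(M,P) = M^e mod n."""
    e, n = public_key
    return pow(message, e, n)


def decrypt(cipher: int, private_key: tuple[int, int]) -> int:
    """Dec(C,S) = C^d mod n."""
    d, n = private_key
    return pow(cipher, d, n)


P = (3, 3127)      # public key (e, n)
S = (2011, 3127)   # private key (d, n)

M = 89
C = encrypt(M, P)
print("cipher text:", C)              # 1394
print("decrypted:", decrypt(C, S))    # 89
```

The message must be a number smaller than n, which is one reason real keys use a very large n.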

Application of RSA to sending digital signatures

Example parameters

  1. Choose p=53, q=59
    • n=3127
    • phi(n)=3016
  2. Choose e=3
    • pick d=2011
  3. P=(3,3127), S=(2011,3127)
  4. Message M=89
  • Signature:
    sig=M^d \text{ mod } n=89^{2011} \text{ mod } 3127=545
  • Validation:
    check=sig^e \text{ mod } n=545^{3} \text{ mod } 3127=89=M
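
The signature example mirrors the Sign and Check functions defined earlier. A minimal sketch with the same toy key follows; the Python function names are illustrative. Real systems sign a hash of the message rather than the message itself.

```python
def sign(message: int, private_key: tuple[int, int]) -> int:
    """Sign(M,S) = M^d mod n."""
    d, n = private_key
    return pow(message, d, n)


def check(message: int, tag: int, public_key: tuple[int, int]) -> bool:
    """Check(M,T,P): does T^e mod n reproduce the message?"""
    e, n = public_key
    return pow(tag, e, n) == message


P = (3, 3127)      # public key (e, n)
S = (2011, 3127)   # private key (d, n)

M = 89
T = sign(M, S)
print("signature:", T)                                  # 545
print("valid for M:", check(M, T, P))                   # True
print("valid for a changed message:", check(90, T, P))  # False
```

Anyone who knows P can run the check, but only the holder of S can produce a tag that passes it.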

Putting it together

  • Hashing is used widely and provides convenient outputs of fixed length => concise representation of information
  • Public-private key signatures ensure that you can prove ownership and that there is security
  • Encrypting messages is a lesser concern for blockchains => info on transaction is supposed to be out in the open
  • But: all the theory used numbers, not letters. How does that fit together?
  • Answer: you can represent any text by a number using ASCII code!
    • Long Text => H(Long text)
    • H(Long text) = fixed length text
    • => ASCII(H(text)) = number of fixed length.
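
The chain "long text => hash => number" can be sketched as below. The function names are hypothetical, and SHA-256 is an assumed choice for H. The digest is first written as 64 hex characters, and the ASCII codes of those characters are then read as one large integer.

```python
import hashlib


def text_to_number(text: str) -> int:
    """Read the ASCII codes of a text as the bytes of one big integer."""
    return int.from_bytes(text.encode("ascii"), "big")


def hashed_text_to_number(text: str) -> int:
    """Long text => H(long text) as fixed-length hex text => number."""
    digest_text = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return text_to_number(digest_text)


print(text_to_number("A"))     # ASCII code of "A" is 65
print(text_to_number("AB"))    # 65 * 256 + 66
number = hashed_text_to_number("Alice pays Bob 5 coins " * 100)
print(number.bit_length())     # bounded by 64 characters x 8 bits
```

However long the input text, the resulting number never exceeds 512 bits, so it can be fed into arithmetic such as the signing step.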

Cryptography is nice but not enough!

  • Order of transactions?

  • Cancel one before the other?

  • When is it in the "database"?

Cryptography is not enough for value transfers!

[Figure: Alice signs with her private key Sa; Bob checks with Alice's public key Pa]

formally: computes T=Sign(M,Sa)

formally: computes Check(M,T,Pa)

Consensus Protocol

Recall: Blockchain is like a distributed database

  • Public-private key signatures ensure that (in principle) no one can impersonate someone else and steal money
  • Hashing links information and ensures that the history can't be changed
  • But how does the network agree on what transactions to include?

Byzantine Generals' Problem

Problem

  • They need a coordinated attack.
  • Messengers can be intercepted and compromised.
  • No general is the leader.

Possible Solutions

  • Send messages that are easy to check
  • hard to forge
  • plus: send many messages

Proof of Work Protocol

How does the proof of work protocol achieve Byzantine fault tolerance?

  1. Looking for leading zeros is a coordination device
    • If you see such a message update what you want to do.
  2. Mining is difficult and time consuming
    • a forger needs to work hard to change the message
  3. Message is sent to many entities
    • strength in numbers: a forger needs to capture many messages
    • unless there are too many forgers (there is a math result here), the messages can be trusted
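
The "leading zeros" puzzle can be sketched as a search over nonces. This is a simplified illustration: it counts leading zero characters of a single SHA-256 hex digest, whereas Bitcoin compares a double SHA-256 hash against a numerical target. The function names and the block text are made up.

```python
import hashlib


def block_hash(block_text: str, nonce: int) -> str:
    """SHA-256 of the block text followed by the nonce, as hex."""
    return hashlib.sha256(f"{block_text}{nonce}".encode("utf-8")).hexdigest()


def mine(block_text: str, difficulty: int) -> tuple[int, str]:
    """Try nonces 0, 1, 2, ... until the hash starts with `difficulty` zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(block_text, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def is_valid(block_text: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed nonce takes a single hash."""
    return block_hash(block_text, nonce).startswith("0" * difficulty)


nonce, digest = mine("Bob pays Alice 1 coin", difficulty=3)
print(nonce, digest)
print(is_valid("Bob pays Alice 1 coin", nonce, 3))
print(is_valid("Bob pays Alice 9 coins", nonce, 3))
```

Finding the nonce takes many attempts (about 16 times more for each extra zero), while checking it takes one hash. A forger who changes the message has to redo the search.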

Importance of Economics, Step 1: Incentive to support longest chain

[Figure: block diagram with blocks B1 to B6]

Where to add a new block B7?

  • Add to B3?
    • => people after still more likely to add to B6
    • lose "coinbase" reward

Economic Analysis, Part I

Equilibrium for "the longest chain"?

    • Yes.
      • "Blockchain Mining Games" by Kiayias, Koutsoupias, Kyropoulou, and Tselekounis, Proceedings of the 2016 ACM Conference on Economics and Computation, 2016

      • "The blockchain folk theorem" by Biais, Bisière, Bouvard, and Casamatta, RFS 2018

Importance of Economics, Step 2: Altering the past?

[Figure: block diagram with blocks B1 to B10]

Contains transaction from Bob to Alice

Bob wants to undo the transaction by rewriting history with B6

Bob's objective

  • Wants to undo this trade and cheat Alice by building an alternative chain from B6

What does it take?

  1. needs to be able to predictably add several blocks to the chain without interference, or
  2. needs to be faster than anyone who adds to B5 afterwards, and build a longer chain, or
  3. needs the ability to reject new blocks that are added to B5.

How does Proof of Work prevent this?

  • mining success is random subject to resources spent:
    • computers/GPUs
    • electricity
  • you need more computing power than the rest of the network combined, i.e., more than 50% of the total (a "51% attack")
    • current network power: 25 million terahashes per second (blockchain.info)

Back of the envelope calculation

  • hashrate: 25,000,000 TH/s
  • best GPUs have 2.5 GH/s per card = 0.0025 TH/s, so 1 TH/s takes 400 cards
  • => need 25,000,000 x 400 x 0.5 = 5,000,000,000 GPUs
  • 1 GPU costs around $200
  • => Cost = $1,000,000,000,000 (one trillion dollars)
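
The arithmetic can be rerun with different inputs. The sketch below uses exactly the figures from the slide; the variable names are illustrative, and hardware cost is the only cost considered (no electricity).

```python
# Inputs taken from the slide.
network_hashrate_ths = 25_000_000   # network hash rate in TH/s
gpu_hashrate_ghs = 2.5              # one GPU card in GH/s
attacker_share = 0.5                # share of the network rate to be matched
gpu_price_usd = 200                 # price of one GPU card

# 1 TH/s = 1000 GH/s, so one TH/s takes 1000 / 2.5 = 400 cards.
gpus_per_ths = 1000 / gpu_hashrate_ghs

gpus_needed = network_hashrate_ths * gpus_per_ths * attacker_share
hardware_cost_usd = gpus_needed * gpu_price_usd

print(f"GPUs per TH/s: {gpus_per_ths:,.0f}")
print(f"GPUs needed:   {gpus_needed:,.0f}")
print(f"Hardware cost: ${hardware_cost_usd:,.0f}")
```

Because every quantity enters multiplicatively, doubling the network hash rate or the card price doubles the cost of an attack.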

Economic Analysis, Part II

Double spend attack prevention

  • Validation rewards are taken as given, but they are crucial in
    • determining incentives to participate,
    • to support the chain, and
    • to expend electricity and computing power

Basic idea of competitive equilibrium

aggregate mining cost = aggregate reward

Double spending attack

  • expend resources, but:
  • win N block rewards until "confirmation" block
  • ability to double-spend

condition that prevents it

(Chiu & Koeppl RFS 2018)

 

 

\text{mining reward} \times (N+1)N > \text{double spend amount}
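
The condition can be evaluated for concrete numbers. The sketch below simply encodes the inequality as stated on the slide; the function name and the example values (a reward of 12.5 coins, N=6 confirmations) are hypothetical choices for illustration.

```python
def double_spend_deterred(mining_reward: float, confirmations: int,
                          double_spend_amount: float) -> bool:
    """Evaluate: mining reward x (N+1) x N > double spend amount."""
    threshold = mining_reward * (confirmations + 1) * confirmations
    return threshold > double_spend_amount


# Hypothetical example: reward of 12.5 coins, N = 6 confirmations.
# Threshold = 12.5 x 7 x 6 = 525 coins.
print(double_spend_deterred(12.5, 6, 500))   # True: the attack does not pay
print(double_spend_deterred(12.5, 6, 600))   # False: the condition fails
```

Waiting for more confirmations raises the threshold roughly with the square of N, which is why large payments warrant a longer wait.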

Bigger Insights

  • Philosophy: anyone is allowed to write to the ledger, but we don't trust anyone
  • Key requirement: you cannot predict or determine when you'd write again
    • => cannot manipulate/take over
  • PoW achieves "randomness" by requiring miners to solve a hard, expensive cryptographic puzzle
  • much work goes into finding viable alternatives, which we will discuss later
    • Proof of stake, Cardano, Proof of Burn, Proof of Elapsed Time

Major innovation of bitcoin

  • combine cryptography, blockchains, and proof-of-work
  • first application of PoW: HashCash
  • Blockchains have been used for timestamp servers (with central authority)

Bitcoin and Ethereum Addresses

My JAXX addresses:

  • Bitcoin:
    • 16dNbpPnA5vz41f6D8iQsBwN9j8G6YW7zP
  • Ethereum
    • 0x91c44e74ebf75baa81a45dc589443194d2eba84b
  • General rule: Bitcoin addresses commonly start with 1 (other formats exist), Ethereum addresses start with 0x

Bitcoin Addresses

  • Usually, your wallet creates your public/private keys
  • Recall that Bitcoin uses an "Elliptic Curve" algorithm
    • With RSA, you pick primes, then e, and then d
    • With elliptic curves, you "pick points on a curve"
    • => your wallet does this and then creates the keys for you
  • From these items it creates your address through a series of hashing operations, mixed with adding "stuff"

Bitcoin Addresses (nerdy stuff)

Ethereum Addresses

  • are essentially a hexadecimal string that always starts with 0x
  • Like Bitcoin, get public/private key with the Elliptic Curve algo
  • Hash public key with Keccak-256.
    • => 32-byte string.
  • Drop first 12 bytes
    • 20 bytes = 40 hex characters remain
  • Add prefix 0x.
    • => your Ethereum address
  • Note:
    • not as crafted as Bitcoin
    • Ethereum is more flexible and works to use a name representation
    • => interaction with the International Bank Account Number (IBAN) System
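
The last two steps (drop 12 bytes, add the prefix) can be sketched as follows. The function takes the 32-byte Keccak-256 digest as given, because Keccak-256 is not part of Python's standard library; note that `hashlib.sha3_256` is a different function and would give different addresses. The function name and the placeholder digest are made up for illustration.

```python
def address_from_keccak_digest(digest: bytes) -> str:
    """Turn a 32-byte Keccak-256 digest of a public key into a 0x address."""
    if len(digest) != 32:
        raise ValueError("expected a 32-byte Keccak-256 digest")
    last_20_bytes = digest[12:]          # drop the first 12 bytes
    return "0x" + last_20_bytes.hex()    # 40 hex characters plus the prefix


# Placeholder digest (bytes 0x00 .. 0x1f), NOT the hash of a real public key.
placeholder_digest = bytes(range(32))
address = address_from_keccak_digest(placeholder_digest)
print(address)
print(len(address))   # 2 prefix characters + 40 hex characters
```

To produce a real address, the digest would have to come from a Keccak-256 implementation applied to the public key.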

Objectives

  • deep understanding of the components of blockchain transactions:
    • hashing
    • keys, signatures, and addresses
    • consensus protocols
    • mining & economic incentives

End of Part 2:

Drilling Down

  • blockchain privacy & zero knowledge
  • scaling solutions
  • Proof of stake vs proof of work
  • alternative approaches: EOS and Cardano, Linux Hyperledger, Corda
  • Tokens as a new form of finance

Coming up Part 3:

Moving Along

Blockchain Module Part 2: Drilling down

By Andreas Park


This deck is for the second of four lectures on Blockchain technology in finance, taught at the Rotman School of Management, Spring 2018. The pre-recorded version is available here: https://www.youtube.com/playlist?list=PLTmzBTSqnXdvhYGdCUzLM4r0r0K_jX3vx
