Ethereum Smart Contracts in EVM-assembly
Structure
- Expectations
- Stack-based languages
- EVM op-codes
- Tooling
- Branching
- Interaction with other contracts
- Deployment
- Functions
- Advanced features
Expectations
- Read & Write EVM-assembly code
- Debug the bytecode
- Know the tooling
Why would you do that?
- Safety
- Gas-cost
- Fun!
pragma solidity ^0.4.22;
contract HelloWorld {
function() public {
log0(bytes32("Hello world!"));
}
}
solc ./HelloWorld.sol --bin-runtime
HelloWorld.sol
6080604052348015600f
57600080fd5b50604051
80807f48656c6c6f2077
6f726c64210000000000
00000000000000000000
0000000000815250600c
01905060405180910390
a00000a165627a7a7230
5820860d44530a302ca6
28b1e865093d54f53da6
be5e6650b84b5440b574
6222d7b60029
Stack-based languages
10 50 +
ARG ARG OP
10 50 +
10
50
60
Operators pop arguments from the stack
Operators can push things onto the stack
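A minimal Python sketch of the evaluation above, not part of the workshop tooling. The helper name `eval_postfix` is hypothetical: numbers are pushed onto a stack, and each operator pops its two arguments and pushes the result.

```python
def eval_postfix(source):
    """Evaluate a whitespace-separated postfix expression on an explicit stack."""
    operators = {
        "+": lambda left, right: left + right,
        "*": lambda left, right: left * right,
    }
    stack = []
    for token in source.split():
        if token in operators:
            # Operators pop their arguments from the stack ...
            right = stack.pop()
            left = stack.pop()
            # ... and push the result back onto it.
            stack.append(operators[token](left, right))
        else:
            stack.append(int(token))
    return stack


print(eval_postfix("10 50 +"))  # [60]
```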
EVM op-codes
Arithmetic
0x01: ADD
0x02: MUL
0x03: SUB
0x04: DIV
0x05: SDIV
0x06: MOD
0x07: SMOD
0x08: ADDMOD
0x09: MULMOD
0x0a: EXP
0x0b: SIGNEXTEND
Comparison and bitwise logic
0x10: LT
0x11: GT
0x12: SLT
0x13: SGT
0x14: EQ
0x15: ISZERO
0x16: AND
0x17: OR
0x18: XOR
0x19: NOT
0x1a: BYTE
Environment
0x30: ADDRESS
0x31: BALANCE
0x32: ORIGIN
0x33: CALLER
0x34: CALLVALUE
0x35: CALLDATALOAD
0x36: CALLDATASIZE
0x37: CALLDATACOPY
0x38: CODESIZE
0x39: CODECOPY
0x3a: GASPRICE
0x3b: EXTCODESIZE
0x3c: EXTCODECOPY
SHA3
0x20: SHA3
Block
0x40: BLOCKHASH
0x41: COINBASE
0x42: TIMESTAMP
0x43: NUMBER
0x44: DIFFICULTY
0x45: GASLIMIT
Stack, memory, storage and flow
0x50: POP
0x51: MLOAD
0x52: MSTORE
0x53: MSTORE8
0x54: SLOAD
0x55: SSTORE
0x56: JUMP
0x57: JUMPI
0x58: PC
0x59: MSIZE
0x5a: GAS
0x5b: JUMPDEST
Push
0x60: PUSH1
...
0x7f: PUSH32
Duplicate
0x80: DUP1
...
0x8f: DUP16
System
0xf0: CREATE
0xf1: CALL
0xf2: CALLCODE
0xf3: RETURN
0xf4: DELEGATECALL
0xff: SELFDESTRUCT
Swap
0x90: SWAP1
...
0x9f: SWAP16
Log
0xa0: LOG0
0xa1: LOG1
0xa2: LOG2
0xa3: LOG3
0xa4: LOG4
Precompiled contracts
- Contract addresses
- 0x01 to 0x08
- Invoke with CALL
Memory
- Only exists for execution
- MLOAD
- MSTORE
Storage
- Persists between calls
- SLOAD
- SSTORE
- CALLCODE
Let's write some code!
But how?
Tooling
Please prepare/install:
- Editor of your choice
- solc
github.com/coblox
- Clone ethereum_workshop_31_07_2018
- Run `source setup.sh`
Workflow
- Write op codes in hex
- Run with `evm-run XXXXXX`
- See what happens
Exercise
Compute the number of seconds in a non-leap year
(365 days)
603C
603C
02
6018
02
61016D
02
Formula:
- Calculate seconds in a day
- Multiply it by 365
- 365 doesn't fit in one byte
- Use a 2-byte push (PUSH2, 0x61)
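To check the op-codes of this exercise without the workshop's `evm-run`, here is a minimal Python sketch. The function `run_bytecode` is hypothetical and only understands PUSH1 to PUSH32 and MUL, which is all this exercise needs.

```python
def run_bytecode(bytecode_hex):
    """Execute a tiny subset of the EVM (PUSH1..PUSH32, MUL) and return the stack."""
    code = bytes.fromhex(bytecode_hex)
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1 (0x60) .. PUSH32 (0x7f)
            size = op - 0x5F  # number of bytes that follow the op-code
            stack.append(int.from_bytes(code[pc + 1 : pc + 1 + size], "big"))
            pc += 1 + size
        elif op == 0x02:  # MUL
            a = stack.pop()
            b = stack.pop()
            stack.append((a * b) % 2**256)  # EVM words are 256 bits wide
            pc += 1
        else:
            raise ValueError(f"op-code 0x{op:02x} is not part of this sketch")
    return stack


# 60 * 60 * 24 * 365, with 365 pushed via PUSH2 (0x61 0x016D)
print(run_bytecode("603C603C0260180261016D02"))  # [31536000]
```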
Advanced workflow
- Write EVM-assembly (not OP-codes)
- Use `asm-compile` to compile to OP-codes
- Run with `evm-run`
- Shortcut: `asm-run`
EVM-assembly
- High-level assembly language
- Supports both reverse Polish notation (operands first) and regular Polish notation (prefix, function-style)
- Implicit PUSH operator
- Labeled jumps (we will get to that later)
- Decimal numbers
{
0x05 // You can use hex
15 // But also dec
add // Call op-codes by name
// Need to have an empty
// stack at the end
pop
}
Shell functions
- `asm-compile`
- `evm-run`
- `asm-run`
Do the exercise again, this time in EVM-assembly
{
// Compute seconds in a day
60
60
mul
24
mul
// Push 365 to the stack
365
mul
// Need balanced stack at the end
pop
}
Prefix notation
{
mul(60, 60)
}
Break!
Branching
Conditionals in assembly?
JNZ
Jump If Not Zero
JUMP
- Pops destination address off the stack
- Continues execution at destination
JUMPDEST
- Neither JUMP nor JUMPI can jump to arbitrary addresses
- Destination address has to be a JUMPDEST instruction
JUMPI
- Pops destination address and condition off the stack
- If condition != 0, continues execution at destination address
Demo
Write a Math.max function that returns the higher number
Needed functionality
- Read input (CALLDATACOPY)
- Memory access (MLOAD)
- Comparison (GT)
- Jumps (JUMPI)
- Return data (RETURN)
Live coding!
6020 6000 6000 37
6020 6020 6020 37
6000 51
6020 51
11
601d
57
6020 6000 f3
5b
6020 6020 f3
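The bytecode above can be traced with a small Python sketch. This is an illustration under assumptions, not the workshop's `evm-run`: the names `run_math_max_code` and `math_max` are hypothetical, memory is fixed at 64 bytes, and only the seven op-codes used here are implemented.

```python
MATH_MAX = (
    "60206000600037" "60206020602037"  # copy both 32-byte arguments to memory
    "600051" "602051" "11"             # load both, GT
    "601d" "57"                        # JUMPI to 0x1d if the second is greater
    "60206000f3"                       # return the first argument
    "5b" "60206020f3"                  # JUMPDEST, return the second argument
)


def valid_jumpdests(code):
    """Collect JUMPDEST offsets, skipping bytes that are PUSH data."""
    dests, pc = set(), 0
    while pc < len(code):
        op = code[pc]
        if op == 0x5B:
            dests.add(pc)
        pc += 1 + (op - 0x5F if 0x60 <= op <= 0x7F else 0)
    return dests


def run_math_max_code(code, calldata):
    memory = bytearray(64)  # fixed size, enough for this sketch
    stack, pc = [], 0
    dests = valid_jumpdests(code)
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == 0x60:  # PUSH1
            stack.append(code[pc])
            pc += 1
        elif op == 0x37:  # CALLDATACOPY(memory offset, calldata offset, length)
            mem, src, length = stack.pop(), stack.pop(), stack.pop()
            chunk = calldata[src : src + length].ljust(length, b"\x00")
            memory[mem : mem + length] = chunk
        elif op == 0x51:  # MLOAD
            offset = stack.pop()
            stack.append(int.from_bytes(memory[offset : offset + 32], "big"))
        elif op == 0x11:  # GT: is the top item greater than the one below it?
            a, b = stack.pop(), stack.pop()
            stack.append(1 if a > b else 0)
        elif op == 0x57:  # JUMPI
            dest, condition = stack.pop(), stack.pop()
            if condition != 0:
                if dest not in dests:
                    raise ValueError("destination is not a JUMPDEST")
                pc = dest
        elif op == 0x5B:  # JUMPDEST is only a marker
            pass
        elif op == 0xF3:  # RETURN(memory offset, length)
            offset, length = stack.pop(), stack.pop()
            return bytes(memory[offset : offset + length])
        else:
            raise ValueError(f"op-code 0x{op:02x} is not part of this sketch")
    return b""


def math_max(first, second):
    calldata = first.to_bytes(32, "big") + second.to_bytes(32, "big")
    return int.from_bytes(run_math_max_code(bytes.fromhex(MATH_MAX), calldata), "big")


print(math_max(3, 7), math_max(9, 2))  # 7 9
```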
Labeled jumps!
{
jump(label) // Use function-syntax
// This is how you define a label
label:
selfdestruct(0x0)
}
Exercise
Build Math.max using
EVM-assembly
{
calldatacopy(0, 0, 32)
calldatacopy(32, 32, 32)
mload(0)
mload(32)
gt
greater_than
jumpi
return(0, 32)
greater_than:
return(32, 32)
}
Break!
Interaction with other contracts
CALL
Arguments:
- Gas
- Contract-Address
- Wei
- Input-Buffer-Address
- Input-Buffer-Length
- Output-Buffer-Address
- Output-Buffer-Length
STATICCALL
- Identical to CALL, except that it takes no Wei argument and does not allow state modifications
DELEGATECALL
- Introduced as a bugfix for the "owner" problems of CALLCODE
- Preserves the value of "msg.sender" in a chain of contract calls
CALLCODE
- Same arguments as CALL
- Called contract has access to the calling contract's storage
Personal opinion
- CALL and STATICCALL are useful
- Don't allow other contracts to modify YOUR storage
- Pass data through Input-Buffer and receive data through Output-Buffer
- Always check the return value
- Successful execution pushes 1 onto the stack
- Error during execution pushes 0 onto the stack
Precompiled contracts
Hash data with SHA256
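call(
0x48, // Gas
0x000000000000000000000000000000000000002, // Contract-Address (SHA256 precompile)
0x00, // Wei
0x00, // Input-Buffer-Address
0x20, // Input-Buffer-Length
0x21, // Output-Buffer-Address
0x20 // Output-Buffer-Length
)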
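The numbers in this call can be sanity-checked off-chain. The sketch below assumes the SHA256 precompile's gas schedule of 60 plus 12 per 32-byte input word, which would explain the 0x48 passed as gas for a 32-byte input. The helper name `sha256_precompile_gas` is hypothetical, and the input is assumed to be 32 zero bytes, since the slide does not show what memory holds.

```python
import hashlib


def sha256_precompile_gas(input_length):
    """Gas charged by the SHA256 precompile: 60 base + 12 per 32-byte word."""
    words = (input_length + 31) // 32  # round up to whole words
    return 60 + 12 * words


# Assumption: memory[0x00:0x20] is still all zeroes when the call is made.
input_buffer = bytes(0x20)
digest = hashlib.sha256(input_buffer).digest()

print(hex(sha256_precompile_gas(len(input_buffer))))  # 0x48
print(len(digest))  # 32, which fits the 0x20-byte output buffer
```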
Break!
Deployment
How does Ethereum deploy contracts?
Transaction (simplified)
Property | Type | Description |
---|---|---|
From | Address | Sender |
To | Address | Receiver |
Data | Hex | Code to execute |
Gas | Wei | Execution cost |
Value | Wei | Value to transfer |
Deployment transaction
Property | Type | Value |
---|---|---|
From | Address | Your account |
To | Address | Empty (no receiver) |
Data | Hex | Deploy code |
Gas | Wei | Depends on code |
Value | Wei | Depends |
Deploy code
- Receiver MUST be empty (left unset, not an actual address)
- Contract code is the RETURN VALUE
- "Deploy header" needed
Deploy header
- RETURN reads from memory
- Copy code to memory -> CODECOPY
dec(0x23): 35
dec(0x0b): 11
Contract code
Code size: 35
35
35
11
00
00
CODECOPY(00, 11, 35)
CODECOPY(MEM, FROM, LENGTH)
RETURN(00, 35)
RETURN(FROM, LENGTH)
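One way to assemble such a deploy header is sketched below in Python. The op-code sequence (PUSH1 size, DUP1, PUSH1 11, PUSH1 0, CODECOPY, PUSH1 0, RETURN) is an assumption that matches the values 35, 35, 11, 00, 00 shown above and is itself 11 bytes long. The helper name `deploy_code` is hypothetical. The contract code used is the Math.max bytecode from the live-coding part, which is 35 bytes long.

```python
MATH_MAX = (
    "60206000600037" "60206020602037" "600051" "602051"
    "11" "601d" "57" "60206000f3" "5b" "60206020f3"
)


def deploy_code(contract_hex):
    """Prefix contract code with a header that copies it to memory and returns it."""
    contract = bytes.fromhex(contract_hex)
    size = len(contract)
    if size > 0xFF:
        raise ValueError("this sketch only uses PUSH1, so code must be <= 255 bytes")
    header_length = 11  # the header below is 11 bytes (0x0b)
    header = bytes([
        0x60, size,           # PUSH1 size       (35 for Math.max)
        0x80,                 # DUP1             (keep a copy for RETURN)
        0x60, header_length,  # PUSH1 11         (contract code starts after the header)
        0x60, 0x00,           # PUSH1 0          (memory destination)
        0x39,                 # CODECOPY(0, 11, size)
        0x60, 0x00,           # PUSH1 0
        0xF3,                 # RETURN(0, size)
    ])
    assert len(header) == header_length
    return (header + contract).hex()


print(deploy_code(MATH_MAX)[:22])  # 602380600b6000396000f3
```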
Functions
ABI
Application Binary Interface
Design
- Uniquely identify functions
- Keccak-256 hash of function signature (name + argument types)
- first 4 bytes of the hash (big-endian, i.e. the leftmost bytes)
Contract execution
- first 4 bytes of DATA are the function identifier
- contract pseudo code:
- PUSH first fn id to stack
- compare with argument
- conditional jump
0xcdcd77c0
0000000000000000
0000000000000000
0000000000000000
0000000000000045
0000000000000000
0000000000000000
0000000000000000
0000000000000001
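The call data above is the 4-byte function identifier followed by one 32-byte word per argument. A minimal Python sketch of that layout follows. The helper name `encode_call` is hypothetical, it only handles unsigned integers and booleans, and the identifier is taken from the slide rather than computed, because Keccak-256 is not in Python's standard library.

```python
def encode_call(selector_hex, *args):
    """Concatenate a 4-byte function identifier with 32-byte big-endian arguments."""
    if selector_hex.startswith("0x"):
        selector_hex = selector_hex[2:]
    selector = bytes.fromhex(selector_hex)
    if len(selector) != 4:
        raise ValueError("function identifier must be exactly 4 bytes")
    # Every argument is left-padded with zeroes to a full 32-byte word.
    words = b"".join(int(arg).to_bytes(32, "big") for arg in args)
    return "0x" + (selector + words).hex()


calldata = encode_call("0xcdcd77c0", 0x45, True)
print(calldata[:10])         # 0xcdcd77c0
print(len(calldata[2:]) // 2)  # 68 bytes: 4 + 32 + 32
```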
EVM-assembly and functions
- Hard to write by hand
- Error-prone
Single function contracts
- Contracts have a single function
- No ABI
- No function identifier
- Just the code that is executed
- Can be disposable: SELFDESTRUCT at the end
Advanced features
Advanced assembly features
- Variables & If
- Loops
- Functions
Variables & If
{
let mem_start_first := 0
let mem_start_second := 32
calldatacopy(0, mem_start_first, 32)
calldatacopy(32, mem_start_second, 32)
let first := mload(mem_start_first)
let second := mload(mem_start_second)
if gt(first, second) {
return(mem_start_first, 32)
}
return(mem_start_second, 32)
}
00 PUSH1 => 00
02 PUSH1 => 20
04 PUSH1 => 20
06 DUP3
07 PUSH1 => 00
09 CALLDATACOPY
10 PUSH1 => 20
12 DUP2
13 PUSH1 => 20
15 CALLDATACOPY
16 DUP2
17 MLOAD
18 DUP2
19 MLOAD
20 DUP1
21 DUP3
22 GT
23 ISZERO
24 PUSH1 => 1f
26 JUMPI
27 PUSH1 => 20
29 DUP5
30 RETURN
31 JUMPDEST
32 PUSH1 => 20
34 DUP4
35 RETURN
36 POP
37 POP
38 POP
39 POP
- non-intuitive result
- unnecessary op-codes
- ISZERO
- POP
00: PUSH1 0x20
02: PUSH1 0x00
04: PUSH1 0x00
06: CALLDATACOPY
07: PUSH1 0x20
09: PUSH1 0x20
11: PUSH1 0x20
13: CALLDATACOPY
14: PUSH1 0x00
16: MLOAD
17: PUSH1 0x20
19: MLOAD
20: GT
21: PUSH1 0x1d
23: JUMPI
24: PUSH1 0x20
26: PUSH1 0x00
28: RETURN
29: JUMPDEST
30: PUSH1 0x20
32: PUSH1 0x20
34: RETURN
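A listing like the hand-written one above can be produced from raw bytecode with a few lines of Python. This is a sketch only: the name `disassemble` is hypothetical and the op-code table covers just the instructions that occur in Math.max. Offsets are printed in decimal, as in the listing.

```python
MATH_MAX = (
    "60206000600037" "60206020602037" "600051" "602051"
    "11" "601d" "57" "60206000f3" "5b" "60206020f3"
)

# Only the op-codes needed for this example
OPCODES = {
    0x11: "GT",
    0x37: "CALLDATACOPY",
    0x51: "MLOAD",
    0x57: "JUMPI",
    0x5B: "JUMPDEST",
    0xF3: "RETURN",
}


def disassemble(bytecode_hex):
    """Return one 'offset: MNEMONIC [operand]' line per instruction."""
    code = bytes.fromhex(bytecode_hex)
    lines, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:  # PUSH1..PUSH32 carry their operand inline
            size = op - 0x5F
            operand = code[pc + 1 : pc + 1 + size].hex()
            lines.append(f"{pc:02d}: PUSH{size} 0x{operand}")
            pc += 1 + size
        else:
            name = OPCODES.get(op, "UNKNOWN")
            lines.append(f"{pc:02d}: {name}")
            pc += 1
    return lines


for line in disassemble(MATH_MAX):
    print(line)
```

The JUMPDEST ends up at offset 29, which is the 0x1d pushed before JUMPI.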
Loops
{
for { let i := 0 }
lt(i, 2)
{ i := add(i, 1) } {
mstore(0, i)
log0(0, 32)
}
}
00 PUSH1 => 00
02 JUMPDEST
03 PUSH1 => 02
05 DUP2
06 LT
07 ISZERO
08 PUSH1 => 1e
10 JUMPI
11 DUP1
12 PUSH1 => 00
14 MSTORE
15 PUSH1 => 20
17 PUSH1 => 00
19 LOG0
20 JUMPDEST
21 PUSH1 => 01
23 DUP2
24 ADD
25 SWAP1
26 POP
27 PUSH1 => 02
29 JUMP
30 JUMPDEST
31 POP
- too complex to write manually
- really needed?
Functions
{
function even(value) -> result {
let remainder := mod(value, 2)
result := iszero(remainder)
}
}
00 PUSH1 => 13
02 JUMP
03 JUMPDEST
04 PUSH1 => 00
06 PUSH1 => 02
08 DUP3
09 MOD
10 DUP1
11 ISZERO
12 SWAP2
13 POP
14 POP
15 SWAP2
16 SWAP1
17 POP
18 JUMP
19 JUMPDEST
- might as well JUMP to the functionality directly
- really needed?
Break!
pragma solidity ^0.4.22;
contract HelloWorld {
function() public {
log0(bytes32("Hello world!"));
}
}
Remember this?
Let's redo it in EVM-assembly
Solidity
6080604052348015600f
57600080fd5b50604051
80807f48656c6c6f2077
6f726c64210000000000
00000000000000000000
0000000000815250600c
01905060405180910390
a00000a165627a7a7230
5820860d44530a302ca6
28b1e865093d54f53da6
be5e6650b84b5440b574
6222d7b60029
EVM-assembly
7f48656c6c6f20776f72
6c642100000000000000
00000000000000000000
00000060005260206000
a0
Solidity
- 116 bytes
- 32 bytes data
- 84 bytes program
EVM-assembly
- 41 bytes
- 32 bytes data
- 9 bytes program
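The sizes can be re-derived from the two bytecode dumps. The Python sketch below is a convenience check, not part of the workshop. The helper name `push32_payload` is hypothetical, and finding the data by searching for the first 0x7f byte (PUSH32) is a simplification that happens to work for these two programs.

```python
SOLIDITY = (
    "6080604052348015600f" "57600080fd5b50604051" "80807f48656c6c6f2077"
    "6f726c64210000000000" "00000000000000000000" "0000000000815250600c"
    "01905060405180910390" "a00000a165627a7a7230" "5820860d44530a302ca6"
    "28b1e865093d54f53da6" "be5e6650b84b5440b574" "6222d7b60029"
)
ASSEMBLY = (
    "7f48656c6c6f20776f72" "6c642100000000000000" "00000000000000000000"
    "00000060005260206000" "a0"
)


def push32_payload(bytecode_hex):
    """Return the 32 data bytes that follow the first 0x7f (PUSH32) byte."""
    code = bytes.fromhex(bytecode_hex)
    start = code.index(0x7F) + 1  # naive search, fine for these two programs
    return code[start : start + 32]


for name, bytecode in (("Solidity", SOLIDITY), ("EVM-assembly", ASSEMBLY)):
    total = len(bytes.fromhex(bytecode))
    data = push32_payload(bytecode)
    text = data.rstrip(b"\x00").decode("ascii")
    print(name, total, "bytes,", len(data), "bytes data,", total - len(data), "bytes program,", repr(text))
```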
Summary
- Single function contracts
- Keep your EVM-assembly simple
- No variables
- No loops
- No defined functions
- Always check the compiler output
Thank you!
We are hiring!
- Kick-ass developer?
- Interested in research?
- Want to shape the future?
- Not scared by bytes?
Contact us:
- philipp@tenx.tech
- thomas@tenx.tech
https://tenx.tech
https://coblox.tech
Ethereum Smart Contracts in EVM-assembly
By Thomas Eizinger
Presentation for the workshop on EVM-assembly.