# Bytecode manipulations

AR, November 2017

## Who am I?

• Siberian
• Web developer with 10+ years of experience
• Python enthusiast (6+ years)
• Lead Full Stack programmer at

# And so we code...

• an LPG operator?
`something <|> True`

less-pipe-greater operator

# And so we code...

• an LPG operator
`<|>`

less-pipe-greater operator

Needs patches to Python's grammar

# And so we code...

• an LPG operator
• increments and decrements?

```python
x = 1
y = x++ + ++x
print(y)
```

Hell Quiz Question #3

# And so we code...

• an LPG operator
• increments and decrements?

```
In [1]: def increment(value):
   ...:     return ++value
   ...:

In [2]: from dis import dis

In [3]: dis(increment)
  2           0 LOAD_FAST                0 (value)
              3 UNARY_POSITIVE
              4 UNARY_POSITIVE
              5 RETURN_VALUE
```
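
The disassembly shows two `UNARY_POSITIVE` opcodes and no store, so nothing is incremented. A minimal sketch that checks this at runtime, reusing `increment` and the quiz expression from the slides:

```python
def increment(value):
    # Parsed as +(+value): two unary plus operations, no assignment.
    return ++value


x = 1
# Parsed as x + (+(+(+(+x)))): one binary plus followed by four unary pluses.
y = x++ + ++x

print(increment(5))  # 5
print(x, y)          # 1 2
```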

# And so we code...

• an LPG operator
• increments and decrements
• GOTO and labels!!!

# Code formatting

```
def method():
    label: print("foo")
    goto label
```

```python
def method():
    label .here
    print("foo")
    goto .here
```

# Code formatting

```python
def method():
    label .here
    goto .here
```

```
  2           0 LOAD_GLOBAL              0 (label)
              3 LOAD_ATTR                1 (here)
              6 POP_TOP

  3           7 LOAD_GLOBAL              2 (goto)
             10 LOAD_ATTR                1 (here)
             13 POP_TOP
             14 LOAD_CONST               0 (None)
             17 RETURN_VALUE
```

`label .here` == `label.here`: the parser ignores the space, so both are an attribute lookup on the global name `label`.
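
One way to confirm this without reading bytecode is to compare the syntax trees. This is only a sketch using the standard `ast` module, which the slides themselves do not use:

```python
import ast

# ast.dump() omits line and column information by default,
# so whitespace differences cannot show up in the result.
spaced = ast.dump(ast.parse("label .here"))
dotted = ast.dump(ast.parse("label.here"))

print(spaced == dotted)
print(dotted)
```

Both sources produce the same `Attribute` node, which is why the trick needs no grammar patches.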

# Function bytecode

```python
# Python 2.x
def function():
    pass

code = function.func_code
```

```python
# Python 3.x
def function():
    pass

code = function.__code__
```
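
As a small illustration of what the code object carries, the sketch below inspects the `label`/`goto` function from the earlier slide on Python 3. Defining the function is safe; calling it would raise `NameError`, because `label` and `goto` are not defined anywhere.

```python
def method():
    label .here
    goto .here


code = method.__code__

# Global and attribute names referenced by the bytecode, in order of first use.
print(code.co_names)  # ('label', 'here', 'goto')

# The raw bytecode; its exact contents depend on the interpreter version.
print(type(code.co_code).__name__, len(code.co_code))
```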

# Pure Bytecode

```python
code.co_code
# == 't\x00\x00j ... \x00S'   in Py2 (str)
# == b't\x00\x00j ... \x00S'  in Py3 (bytes)

from opcode import opname, opmap

# Indexing bytes yields an int in Py3; in Py2 wrap the item in ord().
print(opname[code.co_code[0]])  # 'LOAD_GLOBAL'
print(opmap['LOAD_GLOBAL'])     # 116
opmap['LOAD_GLOBAL'] == code.co_code[0]  # True
```
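
Opcode numbers and instruction layout differ between interpreter versions, so for reading (not rewriting) bytecode it may be easier to let the standard library decode it. A sketch using `dis.get_instructions`, available since Python 3.4, on the same `method` as before:

```python
import dis


def method():
    label .here
    goto .here


# Keep only the two opcodes the label/goto trick cares about,
# together with the name each one refers to.
pairs = [
    (ins.opname, ins.argval)
    for ins in dis.get_instructions(method)
    if ins.opname in ("LOAD_GLOBAL", "LOAD_ATTR")
]

for opname, name in pairs:
    print(opname, name)
```

Newer interpreters emit extra bookkeeping instructions around these, which is why the sketch filters by name instead of relying on positions.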

# Opcode arguments

```python
from opcode import HAVE_ARGUMENT  # == 90

opcode = code.co_code[i]

# Python <= 3.5: an opcode with an argument is followed by two bytes, little-endian
if opcode >= HAVE_ARGUMENT:
    lo_byte = code.co_code[i + 1]
    hi_byte = code.co_code[i + 2]
    position = (hi_byte << 8) | lo_byte
```
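
To see the decoding applied to a whole function, here is a minimal sketch that walks the bytecode of `method()` in the legacy 1-or-3-byte layout. The byte string is written out by hand from the disassembly shown earlier, and the opcode numbers are hardcoded to the values CPython used up to 3.5, so the example does not depend on the interpreter it runs on. `decode`, `OPNAME` and `RAW` are names made up for this illustration.

```python
# Opcode numbers as used by CPython up to 3.5 (the format shown in the slides).
HAVE_ARGUMENT = 90
OPNAME = {
    1: "POP_TOP",
    83: "RETURN_VALUE",
    100: "LOAD_CONST",
    106: "LOAD_ATTR",
    116: "LOAD_GLOBAL",
}

# Bytecode of method(), reconstructed by hand from the dis listing.
RAW = bytearray(b"t\x00\x00j\x01\x00\x01t\x02\x00j\x01\x00\x01d\x00\x00S")


def decode(raw):
    """Yield (offset, opcode name, argument or None) for legacy bytecode."""
    i = 0
    while i < len(raw):
        opcode = raw[i]
        if opcode >= HAVE_ARGUMENT:
            lo_byte, hi_byte = raw[i + 1], raw[i + 2]
            yield i, OPNAME[opcode], (hi_byte << 8) | lo_byte
            i += 3  # opcode + two argument bytes
        else:
            yield i, OPNAME[opcode], None
            i += 1  # opcode only


for offset, name, arg in decode(RAW):
    print(offset, name, arg)
```

The printed offsets (0, 3, 6, 7, 10, 13, 14, 17) match the left column of the disassembly.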

# Cherchez la femme

```python
# Fragment from inside the scanning loop: `i`, `previous` and
# `previous_arg_position` are carried over from the previous iteration.
opcode = code.co_code[i]
current = opname[opcode]
command = code.co_names[previous_arg_position]

if current == 'LOAD_ATTR' and previous == 'LOAD_GLOBAL':
    if command == 'label':
        ...  # store label position
    elif command == 'goto':
        ...  # same for goto
```

```python
code.co_names
# == ('label', 'here', 'goto')
```
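
The fragment above leaves the surrounding loop out. One possible shape for it is sketched below, again on the hand-written legacy bytecode of `method()` with opcode numbers from CPython 3.5 and earlier. `find_markers`, `RAW` and `NAMES` are hypothetical names for this illustration, not part of the talk's code.

```python
# Opcode numbers as used by CPython up to 3.5.
HAVE_ARGUMENT = 90
LOAD_ATTR, LOAD_GLOBAL = 106, 116

RAW = bytearray(b"t\x00\x00j\x01\x00\x01t\x02\x00j\x01\x00\x01d\x00\x00S")
NAMES = ("label", "here", "goto")  # what code.co_names would contain


def find_markers(raw, names):
    """Return ({label name: offset}, [(goto offset, label name), ...])."""
    labels, gotos = {}, []
    previous = previous_arg = previous_offset = None
    i = 0
    while i < len(raw):
        opcode = raw[i]
        arg = None
        if opcode >= HAVE_ARGUMENT:
            arg = (raw[i + 2] << 8) | raw[i + 1]
        if opcode == LOAD_ATTR and previous == LOAD_GLOBAL:
            command, target = names[previous_arg], names[arg]
            if command == "label":
                labels[target] = previous_offset
            elif command == "goto":
                gotos.append((previous_offset, target))
        previous, previous_arg, previous_offset = opcode, arg, i
        i += 3 if opcode >= HAVE_ARGUMENT else 1
    return labels, gotos


labels, gotos = find_markers(RAW, NAMES)
print(labels)  # {'here': 0}
print(gotos)   # [(7, 'here')]
```

The recorded offsets point at the `LOAD_GLOBAL` that starts each marker, which is where a rewrite would begin.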

# Nope, nope, NOP!

Label:

• does nothing
• stores position
• old code = 3 + 3 + 1 bytes (LOAD_GLOBAL, LOAD_ATTR, POP_TOP)
• new code = 7 NOPs (1 byte each)

Goto:

• jumps to label
• old code = 3 + 3 + 1 bytes (LOAD_GLOBAL, LOAD_ATTR, POP_TOP)
• new code = JUMP_ABSOLUTE (3 bytes) + 4 NOPs
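
The byte accounting above can be sketched as a patch over the legacy bytecode of `method()`. This is an illustration under assumptions: it needs Python 3 to run, the opcode numbers are those of CPython 3.5 and earlier, the label and goto offsets are passed in as a scan of the example would report them, and `patch` is a hypothetical helper name. The result is only inspected, never executed.

```python
# Opcode numbers as used by CPython up to 3.5.
NOP, JUMP_ABSOLUTE = 9, 113
SEQUENCE_LENGTH = 7  # LOAD_GLOBAL (3) + LOAD_ATTR (3) + POP_TOP (1)

RAW = bytearray(b"t\x00\x00j\x01\x00\x01t\x02\x00j\x01\x00\x01d\x00\x00S")


def patch(raw, labels, gotos):
    """Replace label markers with NOPs and goto markers with jumps."""
    patched = bytearray(raw)
    for offset in labels.values():
        patched[offset:offset + SEQUENCE_LENGTH] = bytes([NOP]) * SEQUENCE_LENGTH
    for offset, target in gotos:
        destination = labels[target]
        # Two-byte little-endian jump target, as in the legacy format.
        jump = bytes([JUMP_ABSOLUTE, destination & 0xFF, destination >> 8])
        padding = bytes([NOP]) * (SEQUENCE_LENGTH - len(jump))
        patched[offset:offset + SEQUENCE_LENGTH] = jump + padding
    return patched


patched = patch(RAW, {"here": 0}, [(7, "here")])
print(list(patched))
```

Because every replacement has the same length as the code it overwrites, no other offset in the function moves and no other jump needs fixing.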

# Code rebuild

```python
from types import CodeType

# Positional signature of Python 2 and Python 3.0-3.7;
# Python 3.8+ adds co_posonlyargcount after co_argcount.
new_code = CodeType(
    code.co_argcount,
    code.co_kwonlyargcount,  # py3 only
    code.co_nlocals, code.co_stacksize, code.co_flags,
    bytes(map(ord, codebytes_list)),  # string in py2
    code.co_consts, code.co_names, code.co_varnames,
    code.co_filename, code.co_name, code.co_firstlineno,
    code.co_lnotab
)
```

# Function rebuild

```python
from types import FunctionType

rewritten = FunctionType(new_code, function.func_globals)  # py2
rewritten = FunctionType(new_code, function.__globals__)   # py3
```
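
On Python 3.8 and later, code objects also offer a `replace()` method that copies every field except the ones named, which avoids spelling out the full constructor. The sketch below assumes such an interpreter and swaps a constant instead of the bytecode, so it stays valid across versions. `greet` is a made-up example function.

```python
from types import FunctionType


def greet(name="world"):
    return "hello " + name


code = greet.__code__

# Swap one constant; every other field of the code object is copied as-is.
new_consts = tuple(
    "bye " if const == "hello " else const for const in code.co_consts
)
new_code = code.replace(co_consts=new_consts)

rewritten = FunctionType(
    new_code,
    greet.__globals__,
    greet.__name__,
    greet.__defaults__,  # without this, the default for `name` is lost
    greet.__closure__,
)

print(greet())      # hello world
print(rewritten())  # bye world
```

Passing only the code and globals, as on the slide, is enough for functions without default arguments or closures.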

# We are hiring!

## Python 3.5, async, Tornado, Flask, Docker, Kubernetes, Prometheus, GitLab CI, ...

Bytecode manipulations

By Alex Rembish
