Solving the web's most popular code shortening competition in Python
Alessandro Amici - @alexamici - <a.amici@bopen.eu>
B-Open Solutions - http://bopen.eu
Abstract
“Code shortening” is the “sport” where participants strive to achieve the shortest possible source code that solves a programming problem by exploiting all the tricks and quirks of the language.
SIZECON on SPOJ is one of the oldest and most popular code shortening problems on the web, with a bizarre twist: only characters above ASCII value 32 count towards the penalty. During the talk we will take a journey into some frightening depths of the Python language in order to write shorter and shorter solutions to SIZECON until, exploiting a number of truly mind-blowing tricks, we reach the current record solution of 28 characters (above ASCII 32!).
I promise I’ll show you the most obfuscated, contrived and sick Python code you have ever seen and (hopefully!) will ever see. I invite participants to give SIZECON a try and check their score against the Python 2 and Python 3 SPOJ rankings.
SPOJ and SIZECON
SPOJ is a coding platform and an online judge:
- support for 45+ languages
- a huge trove of 20,000+ problems
- 50,000+ users
- user scores are public -> ranks
- solutions are not public -> can compete any time
SIZECON is an unusual "code golf" problem:
- created in 2005
- among the 20 most popular problems on SPOJ, with 8000+ users across all languages and 1400+ for Python
- our own Tim Peters is among the very best solvers, in Perl
SIZECON problem statement
SIZECON best solutions for Python 2
SIZECON best solutions for Python 3
The "python golf" master plan
- correctness
  - reference solutions
- algorithm wizardry
  - alternative algorithms
- language wizardry
  - shortened solutions
Reference solutions - 50-ish
T = int(raw_input())
r = 0
for _ in range(T):
    n = int(raw_input())
    if n > 0:
        r += n
print r
Golf score: 107
SIZECON score: 70
print sum(max(0,input())for _ in range(input()))
Golf score: 48
SIZECON score: 44
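The two scores quoted on these slides can be recomputed with a small helper. This is only a sketch: the function names `golf_score` and `sizecon_score` are made up for illustration, and the rule they implement is the one stated in the abstract (only characters above ASCII value 32 count).

```python
def golf_score(source):
    """Classic golf score: every character counts, newlines included."""
    return len(source)


def sizecon_score(source):
    """SIZECON score: only characters with an ASCII value above 32 count."""
    return sum(1 for ch in source if ord(ch) > 32)


# The Python 2 one-liner from the slide, kept as plain text.
one_liner = "print sum(max(0,input())for _ in range(input()))"
print(golf_score(one_liner), sizecon_score(one_liner))  # 48 44
```

The four spaces in the one-liner are free under the SIZECON rule, which accounts for the gap between 48 and 44.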
Alternative algorithms
Nothing to see here*,
please move along.
* as long as you forget the bizarre ASCII special characters exception.
Shortened solutions - down to 33!
i=input
print sum(max(0,i())for _ in range(i()))
Golf score: 48
SIZECON score: 43
i=input
print eval("+max(0,i())"*i())
Golf score: 37
SIZECON score: 35
i=input
print sum(eval("max(0,i()),"*i()))
Golf score: 42
SIZECON score: 40
i=input
i(eval("+max(0,i())"*i()))
Golf score: 34
SIZECON score: 33
Child's play
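The string-multiplication trick behind `eval("+max(0,i())"*i())` can be tried outside the judge. The sketch below is a Python 3 port, not the Python 2 code from the slides: `solve` and its `lines` argument are hypothetical stand-ins for reading from standard input, and it assumes the count on the first line is at least 1 (an empty expression would not evaluate).

```python
def solve(lines):
    """Sum the positive numbers; the first line holds the count T."""
    tokens = iter(lines)

    def i():
        # Stand-in for Python 2's input(): read and convert one number.
        return int(next(tokens))

    # "+max(0,i())" repeated T times is one long expression:
    # +max(0,i())+max(0,i())+...  Each i() call consumes one input line.
    expression = "+max(0,i())" * i()
    return eval(expression, {"i": i})


print(solve(["3", "5", "-2", "7"]))  # 12
```

The first `i()` call runs while the string is being built, so it consumes T; the remaining calls happen inside `eval` and consume the numbers themselves.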
Alternative algorithms
We can use as many ASCII special characters as we like...
Code legal characters
Alternative algorithms
We can use as many ASCII special characters as we like...
String literal legal characters
Alternative algorithms
Algorithm building blocks:
- build a string literal with lots of ASCII special characters
- turn the string literal into code
- actually do something with that code
original_solution = 'print sum(max(0,input())for _ in range(input()))'
encrypted_original_solution = ASCII32_encrypt(original_solution)
solution_template = "exec ASCII32_decrypt('{}')"
solution = solution_template.format(encrypted_original_solution)
with open('solution.py', 'w') as fp:
    fp.write(solution)
exec ASCII32_decrypt(' ** encrypted string literal ** ')
Building ASCII32_decrypt
What we want
- it takes as input a string full of ASCII control characters
- it outputs a string of Python code
- it must be short!
str.translate(table)
Return a copy of the string where all characters [...] have been mapped through the given translation table, which must be a string of length 256.
exec' ** encrypted string literal ** '.translate(' ** decrypt table ** ')
SIZECON score: 20 + number of non-control characters (those above ASCII 32) inside the string literals
translate based ASCII32_decrypt
original_solution = 'print sum(max(0,input())for _ in range(input()))'
chars = ''.join(set(original_solution) - set([' ']))
decrypt_table = ' ' + chars[:12] + ' ' + chars[12:] + ' ' * (254 - len(chars))
encrypted_original_solution = ASCII32_encrypt(original_solution)
solution_template = "exec'{}'.translate('{}')"
solution = solution_template.format(encrypted_original_solution, decrypt_table)
with open('solution.py', 'w') as fp:
    fp.write(solution)
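The slides leave `ASCII32_encrypt` undefined. The following is one possible sketch of the encrypt/decrypt pair, written for Python 3, where `bytes.translate` takes a 256-byte table just like Python 2's `str.translate`. The helper name `make_tables` and the choice of slots (control codes 1 to 31, skipping line feed and carriage return) are assumptions made for this example and differ from the table layout on the slide.

```python
def make_tables(solution):
    """Return (encrypt_table, decrypt_table), both 256-byte translation tables."""
    visible = sorted(set(solution) - {32})  # distinct byte values, space excluded
    # Hypothetical slot choice: control codes 1..31 except LF (10) and CR (13).
    slots = [code for code in range(1, 32) if code not in (10, 13)]
    if len(visible) > len(slots):
        raise ValueError("too many distinct characters for the available slots")
    encrypt = bytearray(range(256))  # identity by default, so spaces stay spaces
    decrypt = bytearray(b" " * 256)  # unused positions are padded with spaces
    for slot, char in zip(slots, visible):
        encrypt[char] = slot
        decrypt[slot] = char
    return bytes(encrypt), bytes(decrypt)


solution = b"print sum(max(0,input())for _ in range(input()))"
encrypt_table, decrypt_table = make_tables(solution)
encrypted = solution.translate(encrypt_table)

print(max(encrypted) <= 32)                            # True
print(encrypted.translate(decrypt_table) == solution)  # True
print(sum(1 for byte in decrypt_table if byte > 32))   # 19
```

The encrypted solution contains nothing above ASCII 32, so the only countable characters left are the ones stored once each in the decrypt table.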
print sum(max(0,input())for i in range(input()))
SIZECON score: 20 + 18 # '(),0aefgimnoprstux'
print sum(max(int(),input())for i in range(input()))
SIZECON score: 20 + 17 # '(),aefgimnoprstux'
print sum(max(0,input())for _ in range(input()))
SIZECON score: 20 + 19 # '(),0_aefgimnoprstux'
This is a new, unusual shortening problem.
"SIZECON2" problem
Same as SIZECON, but the score is the number of distinct characters with ASCII value > 32
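The "SIZECON2" rule is easy to check mechanically. In this sketch the helper names `distinct_chars` and `distinct_score` are illustrative only; the candidates are the three Python 2 variants compared just before this slide, kept as plain text.

```python
def distinct_chars(source):
    """Sorted string of the distinct characters above ASCII 32."""
    return "".join(sorted({ch for ch in source if ord(ch) > 32}))


def distinct_score(source):
    """'SIZECON2' score: number of distinct characters above ASCII 32."""
    return len(distinct_chars(source))


candidates = [
    "print sum(max(0,input())for i in range(input()))",
    "print sum(max(int(),input())for i in range(input()))",
    "print sum(max(0,input())for _ in range(input()))",
]
for candidate in candidates:
    print(distinct_score(candidate), distinct_chars(candidate))
# 18 (),0aefgimnoprstux
# 17 (),aefgimnoprstux
# 19 (),0_aefgimnoprstux
```

Replacing `0` with `int()` makes the source longer but removes one distinct character, which is exactly the trade this new problem rewards.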
input(sum(max(int(),input())for i in range(input())))
SIZECON score: 20 + 17 # '(),aefgimnoprstux'
input(sum(n for n in(input()for i in repr(int())*input())if repr(int())*n))
SIZECON score: 20 + 14 # '()*efimnoprstu'
print sum(max(int(),input())for i in range(input()))
SIZECON score: 20 + 17 # '(),aefgimnoprstux'
...
How can we do better?
input(len(tuple(()for n in(input()*repr(int()))for i in repr(int())*input())))
SIZECON score: 20 + 13 # '()*efilnoprtu'
"SIZECON2" problem
"Downward is the only way forward."
exec"input(len(tuple(()for(n)in(repr((int())))*int(input())for(i)in(repr(int()))*int(input()))))"
SIZECON score: 20 + 12 # '"0124567\cex'
YES! 32!
'\160\162\151\156\164\040\042\110\145\154\154\157\040\167\157\162\154\144\041\042\073'
Octal representation of characters in literal strings!
exec"\160\162\151\156\164\040\042\110\145\154\154\157\040\167\157\162\154\144\041\042\073"
SIZECON score: 20 + 13 # '"01234567\cex'
'print "Hello world!";'
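Producing the octal form of an arbitrary snippet takes one line. A minimal sketch, with `to_octal_escapes` as a made-up helper name; `ast.literal_eval` is used here only to show that a string literal built from the escapes decodes back to the original text.

```python
import ast


def to_octal_escapes(text):
    """Spell every character as a three-digit octal escape, e.g. 'p' -> \\160."""
    return "".join("\\%03o" % ord(ch) for ch in text)


escaped = to_octal_escapes("print")
print(escaped)                                # \160\162\151\156\164
print(ast.literal_eval('"' + escaped + '"'))  # print
```

After this step the payload uses only a backslash and the octal digits, whatever the original code looked like.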
...
exec"\151\156\160\165\164\050\154\145\156\050\164\165\160\154\145\050\050\051\146\157\162\050\156\051\151\156\050\162\145\160\162\050\050\151\156\164\050\051\051\051\051\052\151\156\164\050\151\156\160\165\164\050\051\051\146\157\162\050\151\051\151\156\050\162\145\160\162\050\151\156\164\050\051\051\051\052\151\156\164\050\151\156\160\165\164\050\051\051\051\051\051"
How can we do better?
"Downward is the only way forward."
'\\160\\162\\151\\156\\164'
exec"exec\"\\151\\156\\160\\165\\164\\050\\154\\145\\156\\050\\164\\165\\160\\154\\145\\050\\050\\051\\146\\157\\162\\050\\156\\051\\151\\156\\050\\162\\145\\160\\162\\050\\050\\151\\156\\164\\050\\051\\051\\051\\051\\052\\151\\156\\164\\050\\151\\156\\160\\165\\164\\050\\051\\051\\146\\157\\162\\050\\151\\051\\151\\156\\050\\162\\145\\160\\162\\050\\151\\156\\164\\050\\051\\051\\051\\052\\151\\156\\164\\050\\151\\156\\160\\165\\164\\050\\051\\051\\051\\051\\051\""
'\\160\\162\\151\\156\\16'+'4'
'\\160\\162\\151\\156\\16'+repr(4)
'\\160\\162\\151\\156\\16'+repr(1+1+1+1)
'\\160\\162\\151\\156\\16'+`1+1+1+1`
repr(object)
Return a string containing a printable representation of an object. This is the same value yielded by conversions (reverse quotes).
SIZECON score: 20 + 12 # '"0124567\cex'
'print'
"Downward is the only way forward."
'\\160\\162\\151\\156\\16'+`1+1+1+1`
'\\160\\162\\151\\156\\1'+`1+1+1+1+1+1`+`1+1+1+1`
'\\160\\162\\151\\15'+`1+1+1+1+1+1`+'\\'+`1`+`1+1+1+1+1+1`+`1+1+1+1`
SIZECON score: 20 + 9 # '"+01\`cex'
exec"exec\"\\1"+`1+1+1+1+1`+"1\\1"+`1+1+1+1+1`+`1+1+1+1+1`+"1\\1"+`1+1+1+1+1`+"0\\"+...+"\""
...
'\\160'
'\\1'+`1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1`
SIZECON score: 20 + 8 # '"+1\`cex'
exec"exec\"\\1"+`1+1+1+1+1`+"1\\1"+`1+1+1+1+1`+`1+1+1+1+1`+"1\\1"+`1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1`+"\\"+...+"\""
YES! 28!
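The digit-elimination step can be generated rather than typed by hand. The sketch below is a Python 3 approximation: backticks no longer exist there, so `repr(...)` stands in for them, and the character counts quoted on the slides do not carry over. It also simplifies the slide's scheme by writing each whole octal code as a single sum of ones. The names `ones`, `escape_expression` and `build_expression` are hypothetical.

```python
import ast


def ones(n):
    """Source text for the integer n written as 1+1+...+1."""
    return "+".join("1" * n)


def escape_expression(ch):
    """Expression whose value is the octal escape text for ch, e.g. \\160 for 'p'."""
    octal_digits = format(ord(ch), "o")  # 'p' -> '160'
    # The decimal number 160 prints as the digits 1, 6, 0, which is all we need.
    return '"\\\\"+repr(%s)' % ones(int(octal_digits))


def build_expression(text):
    """Concatenate the per-character expressions into one big expression."""
    return "+".join(escape_expression(ch) for ch in text)


expression = build_expression("print")
escaped = eval(expression)                    # first level: build the escape text
print(escaped)                                # \160\162\151\156\164
print(ast.literal_eval('"' + escaped + '"'))  # second level: print
```

The generated expression is hundreds of characters long, but it is built from a handful of distinct characters, and only the distinct ones survive once the whole thing is hidden behind the translate table.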
"That many exec's within exec's is too unstable."
Python interpreter: decrypt the ASCII32_encrypted string
exec level 1: build the numbers and compose the "string literal" string
exec level 2: parse the string literal
exec level 3: finally run the solution
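How escaping piles up across nested exec levels can be reproduced in miniature. This is a Python 3 sketch with two exec levels around a harmless payload; `quote`, `payload`, `inner` and `outer` are names chosen for the example and are not part of the record solution.

```python
import contextlib
import io


def quote(source):
    """Wrap source in a double-quoted literal, escaping backslashes and quotes."""
    return '"' + source.replace("\\", "\\\\").replace('"', '\\"') + '"'


payload = 'print("hi")'
octal = "".join("\\%03o" % ord(ch) for ch in payload)
inner = 'exec("%s")' % octal       # text: exec("\160\162...")
outer = "exec(%s)" % quote(inner)  # text: exec("exec(\"\\160\\162...\")")

buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    exec(outer)  # outer literal -> inner exec -> octal literal -> payload
print(buffer.getvalue())
```

Each level parses one string literal and hands the result to the next, so every extra level doubles the backslashes needed in the text that precedes it.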
SIZECON solution - Python interpreter
SIZECON solution - exec level 1
SIZECON solution - exec level 2
exec'\151\156\160\165\164\x28\145\166\141\154\x28\42\53\155\141\x78\x28\151\156\164\x28\51\54\151\156\160\165\164\x28\51\51\42\52\151\156\160\165\164\x28\51\51\51'
SIZECON solution - exec level 3
input(eval("+max(int(),input())"*input()))
SIZECON absolute best solutions!
"Your condescension, as always, is
much appreciated, thank you."
Alessandro Amici
<a.amici@bopen.eu>
@alexamici
http://linkedin.com/in/alexamici
B-Open - http://bopen.eu
Solving the web's most popular shortening contest with Python - EuroPython 2015
By Alessandro Amici
Full talk online at: https://www.youtube.com/watch?v=4-3zLTg3GKk