Completely Automated Public Turing test to tell Computers and Humans Apart
May 2018
David Magalhães
@speeddragon
About me
Software Engineer @
Security Analyst @
CloudFlare
https://www.owasp.org/images/0/03/ASDC12-Attacking_CAPTCHAs_for_Fun_and_Profit.pdf
Image Processing
March, 2016
https://www.blackhat.com/docs/asia-16/materials/asia-16-Sivakorn-Im-Not-a-Human-Breaking-the-Google-reCAPTCHA-wp.pdf
March, 2017
https://www.bleepingcomputer.com/news/security/researcher-breaks-recaptcha-using-googles-speech-recognition-api/
March, 2018
https://andresriancho.com/recaptcha-bypass-via-http-parameter-pollution/
POST /recaptcha/api/siteverify
recaptcha-response=anything%26secret%3dPUBLIC-TEST-BYPASS_TOKEN&secret=6LeYIbsSAAAAAJezaIq3Ft_hSTo0YtyeFG-JgRtu
Around 3% of reCAPTCHA integrations were vulnerable.
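A minimal sketch of why the request above works, assuming the vulnerable application pastes the already-decoded user input into the form body, and that the receiving side honours the first `secret` it sees, as the referenced write-up describes. The secret value and function names are hypothetical; the `recaptcha-response` parameter name follows the slide.

```javascript
// Hypothetical server-side secret, for illustration only.
const SERVER_SECRET = "real-server-secret";

// Vulnerable: user input is pasted into the body without encoding,
// so an embedded "&secret=..." becomes a parameter of its own.
function buildBodyUnsafe(userResponse) {
  return "recaptcha-response=" + userResponse + "&secret=" + SERVER_SECRET;
}

// Safer: URLSearchParams percent-encodes "&" and "=" inside values.
function buildBodySafe(userResponse) {
  return new URLSearchParams({
    "recaptcha-response": userResponse,
    secret: SERVER_SECRET,
  }).toString();
}

// What the application holds after decoding "anything%26secret%3dPUBLIC-TEST-BYPASS_TOKEN".
const injected = "anything&secret=PUBLIC-TEST-BYPASS_TOKEN";

// A parser that takes the first occurrence now reads the attacker's secret.
console.log(new URLSearchParams(buildBodyUnsafe(injected)).get("secret"));
// With encoding, the only secret parameter is the server's own.
console.log(new URLSearchParams(buildBodySafe(injected)).get("secret"));
```

Building the body from key/value pairs, rather than by string concatenation, is what removes the ambiguity.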
https://www.cloudflare.com/case-studies/troy-hunt/
POST https://www.google.com/recaptcha/api/siteverify
secret=6LeIxAcTAAAAAGG-vFI1TnRWxMZNFuojJ4WifJWe&response=03ACgFB9smWHeHsOPEDTTb-OWMh-SgQISvttCGdp4tN4OW77W9r3bEeIHwd22EyQOmB466kdBm3SD26fMPeKByeXHJSKERi81bcH1b68ZwUU7W4m2TsAs65KzjUaE7t2uMffOR...2kMo4msFdLmj79uTeeCWaHZl2o5QqnF22qAImMSbxWMeMx5gC0O8SQINkmuPexXPHnpUmpzaqgI_WlseJI_q5VrDA
{
"success": true|false,
"challenge_ts": timestamp, // timestamp of the challenge load (ISO format yyyy-MM-dd'T'HH:mm:ssZZ)
"hostname": string, // the hostname of the site where the reCAPTCHA was solved
"error-codes": [...] // optional
}
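A minimal sketch of the server-side check, using the request and response fields shown above. The helper names and the hostname comparison are assumptions for illustration, not part of any official client library.

```javascript
const SITEVERIFY_URL = "https://www.google.com/recaptcha/api/siteverify";

// Build the form-encoded POST described above. URLSearchParams encodes the
// values, so a crafted token cannot smuggle in extra parameters.
function buildVerifyRequest(secret, token) {
  return {
    url: SITEVERIFY_URL,
    options: {
      method: "POST",
      body: new URLSearchParams({ secret: secret, response: token }),
    },
  };
}

// Interpret the JSON answer: accept only an explicit success for the
// hostname we expect (the hostname check is an added assumption).
function interpretVerifyResponse(data, expectedHostname) {
  const errors = data["error-codes"] || [];
  const accepted = data.success === true && data.hostname === expectedHostname;
  return { accepted: accepted, errors: errors };
}

// Usage, assuming a runtime with a global fetch:
//   const { url, options } = buildVerifyRequest(secret, token);
//   const res = await fetch(url, options);
//   const verdict = interpretVerifyResponse(await res.json(), "www.example.com");
```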
A website with valuable information that didn't ask for a CAPTCHA.
... and after 100,000 requests, something weird appeared.
https://code.google.com/archive/p/kaptcha/
The AJAX request didn't contain the CAPTCHA response.
And for a couple of months, I didn't have a solution for this ...
... until ...
$ aptitude search ocr
GNU Ocrad is an OCR (Optical Character Recognition) program based on a feature extraction method. It reads a bitmap image in pbm or pgm formats and produces text in byte (8-bit) or UTF-8 formats.
Ocrad includes a layout analyser able to separate the columns or blocks of text normally found on printed pages.
https://savannah.gnu.org/projects/ocrad/
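// Run Ocrad at thresholds 0.1 to 0.9 and keep each corrected reading.
$v = [];
for ($i = 1; $i <= 9; $i++) {
    $v[$i] = self::correctCaptcha(
        // escapeshellarg() keeps an unusual file name from breaking the command
        trim(shell_exec("ocrad --threshold=0." . $i . " " . escapeshellarg($newFile)))
    );
}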
Use various thresholds to obtain a better result.
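The slides do not say how the nine readings are combined. One plausible option, shown here purely as an assumption, is a majority vote over the non-empty candidates; `pickMostFrequent` is a hypothetical helper.

```javascript
// Return the reading that occurs most often; on a tie, the one that
// reached the top count first wins. Empty readings are ignored.
function pickMostFrequent(candidates) {
  const counts = new Map();
  let best = null;
  let bestCount = 0;
  for (const candidate of candidates) {
    if (!candidate) continue; // OCR produced nothing at this threshold
    const n = (counts.get(candidate) || 0) + 1;
    counts.set(candidate, n);
    if (n > bestCount) {
      best = candidate;
      bestCount = n;
    }
  }
  return best;
}

// Example: readings collected at different thresholds.
console.log(pickMostFrequent(["ab12", "a612", "ab12", "", "ab12"]));
```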
Run on some CAPTCHAs with known solutions ...
? = e | %% = 2y | y = 2 | IT = n | T = 7 |
W = w | rf = d | ] = p | L = c | i = x |
t = p | lt\\ = m | v = y | z = 2 | unicode ... |
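A minimal sketch of how such a table could be applied. The name `correctCaptcha` mirrors the helper called in the PHP snippet, but this is not the author's implementation, and the table is only the subset of pairs listed above that are unambiguous in the slide. Replacement is done in a single pass, so that an output such as the `y` in `2y` is not rewritten again by the `y = 2` rule.

```javascript
// OCR output -> intended text (subset of the pairs listed above).
const CORRECTIONS = new Map([
  ["%%", "2y"], ["IT", "n"], ["rf", "d"],
  ["?", "e"], ["y", "2"], ["T", "7"], ["W", "w"], ["]", "p"],
  ["L", "c"], ["i", "x"], ["t", "p"], ["v", "y"], ["z", "2"],
]);

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Longest keys first, so "IT" is tried before "T" at the same position.
const PATTERN = new RegExp(
  [...CORRECTIONS.keys()]
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|"),
  "g"
);

// Replace every known misreading in one left-to-right pass.
function correctCaptcha(raw) {
  return raw.replace(PATTERN, (match) => CORRECTIONS.get(match));
}

console.log(correctCaptcha("?%%"));
```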
While the session is active, we only need to solve one CAPTCHA.
Work in progress
POST http://www.example.com/get
id=323184&gRecaptchaResponse={{gCaptchaToken}}
// Cancel every example.com request from the opened tab except the landing page.
chrome.webRequest.onBeforeRequest.addListener(function(data) {
  if (data.tabId == openedTabId
      && data.url != "http://www.example.com/") {
    return {cancel: true};
  }
}, {'urls': ["*://*.example.com/*"]}, ["blocking"]);
var head = document.getElementsByTagName('head')[0];
var script = document.createElement('script');
script.type = 'text/javascript';
script.src = 'https://www.google.com/recaptcha/api.js?hl=pt_PT';
head.appendChild(script);
var body = document.getElementsByTagName('body')[0];
while (body.firstChild) { body.removeChild(body.firstChild); }
var div = document.createElement("div");
div.setAttribute("style", "float:left;");
div.setAttribute("class", "g-recaptcha");
div.setAttribute("data-sitekey", "1XLd32hUUA522B0Gx7htcAQmanD890ZyCCo2i5T");
body.appendChild(div);
if (document.querySelector(".recaptcha-checkbox") != null) {
  var delay = 3000 + Math.random() * 2000; // milliseconds
  setTimeout(function() {
    if (document.querySelector(".recaptcha-checkbox") != null) {
      document.querySelector(".recaptcha-checkbox").click();
    }
  }, delay);
}
https://www.google.com/recaptcha/api2/userverify?k=X8LdChUUA3AAAABgG302AQfn69kNDSnm23lbo
Of Captcha Breakers
https://github.com/JackonYang/captcha-tensorflow
https://medium.com/@ageitgey/how-to-break-a-captcha-system-in-15-minutes-with-machine-learning-dbebb035a710