Derek Soeder is a senior threat researcher at Cylance. In this post, Soeder discusses how he and his team reverse engineered ransomware to recover the password used to encrypt client files. In the interest of length, some of the technical steps the team took have been omitted from this story, but the original is now available on the Cylance website. Read on to find out just what Soeder had to do to retrieve client data being held hostage by ransomware.
In early 2013, an organization approached us for help recovering their data after their server was infected with a version of the ACCDFISA ransomware. The malware had combed through every file on every drive and encrypted all the important files. Restoring from backups was out of the question because the backup drives had been mounted on the server—the malware got to them, too. It was a total loss.
The malware had turned each file into a WinRAR self-extractor containing an encrypted RAR and the filename contained instructions: (!! To decrypt email id … to …@...!!).exe This variant claimed to use AES-256 encryption with a 256-character random key generated individually for each victim and stored remotely with the attacker. It allegedly had securely deleted all the unencrypted originals and password files to stymie efforts to recover them from the hard drive.
Trying to find a flaw in WinRAR's cryptographic implementation didn't seem like the most promising avenue of attack; instead, we turned our attention to cracking the password. For that, we needed the code that created it.
Finding the Password Generator
After a little poking around a copy of the infected server’s drive, we found the malicious program and a number of strange files that seemed to be associated with it. We found Microsoft Sysinternals' sdelete utility, which permanently deletes files, a "NoSafeMode" library, and a RAR utility for making self-extractors. The RAR utility was our starting point for reverse engineering the ransomware.
The utility accepted the encryption key password as a command-line argument, so we speculated we would find the password generator by backtracking through the ransomware code which launches the utility. Let's take a look.
First, we ran the ransomware on a disposable system. We attached a debugger to intercept the CreateProcess calls that launch the RAR utility, disguised as svchost.exe. This way, we could see each command executed by the ransomware and eventually the password being used to encrypt the files. The intercepted password, aesT322B2XgM(mC0..., was 57 characters long. The mixture of uppercase and lowercase letters, digits, and punctuation were too random to have been generated by a button-mashing person. The fact that the password began with aes could have just been a coincidence or an intentional prefix. If intentional, we expected to find the string somewhere in the ransomware code. When we opened the ransomware in a disassembler, we found not just aes, but rather aesT322. This told us the password was actually aesT322 followed by fifty presumably ransom characters.
We knew the password we intercepted would not necessarily be the same password used to encrypt the client’s files. But we now had some clues to guide our next reverse engineering efforts: We could look for the password itself or pieces of it, search for the likely alphabet used to generate the password, and hunt for the string used to build the password portion of the command line.
We used the debugger to see what values are being assigned to global variables while the ransomware is running. With a bit of reverse engineering, we identified one global variable that pointed to the aesT322 prefix string, another that received the string of 50 random characters, and a third that received the full 57-character password.
Figuring Out How the Generator Worked
We learned a lot about how the ransomware worked, but at this point we had no indication yet that we were going to be able to help the victim. In fact, we'd ruled out the easiest possibility that the password might just be a fixed string used against all victims. We were on the way, though.
By tracking every instance of the three variables in the program, we found a loop which selected fifty random characters from a 78-character alphabet comprised of 26 lowercase letters, 26 uppercase letters, 10 digits, and 16 punctuation symbols (with some repetitions). Computers struggle to be truly random. One way to defeat encryption is to attack the “pseudo-random number generator” (PRNG), and that’s exactly what we decided to do.
We eventually discovered that the PRNG was initialized, or seeded, with a 32-bit number derived from the identifier of the thread executing this code and how long the system has been running in milliseconds. They are both rather predictable values. Thanks to the 32-bit seed, we now knew there could be at most four billion possible passwords, not the nonillions of nonillions of possibilities that fifty characters picked truly randomly from an alphabet of 78 would have yielded.
This is because, as noted before, computers usually operate with rigid determinism even when they're trying to act random. For any given seed value, the PRNG will produce the exact same numbers in the exact same order any time it's initialized with that value. Since the seed was a 32-bit number, the possibilities ranged from zero to about four billion, and therefore the realm of possible initial states is equally confined.
The problem with guessing passwords is that it's expensive in terms of time. And that’s what we had to do next.
Guessing Passwords
A list of four billion passwords is no trifling thing. But we also know the seed relies on a thread ID, which is a multiple of four and typically less than 10,000, and the system uptime. Over the course of 49.7 days, the uptime will count from zero to four billion and then wrap around to become zero again. We would need to get an idea of how long the server had been running before the attack—perhaps a file timestamp or an event log entry—to narrow down the possible values. But we discovered something even better.
Among all the strange files, we’d found one in the ProgramData directory called stppthmainfv.dll, containing 21 lines of eight random letters each. We decided to work backwards and brute-force all possible seed values to figure out which one would cause the PRNG to generate those 21 lines. Because this was a 32-bit string, we knew the search should not take more than a few hours on a reasonable computer. That would make it orders of magnitude faster than trying to feed all potential passwords to the RAR utility.
In about four seconds on a single CPU core, the code tested 31,956,209 possibilities and found the seed value 31,956,208 generated the same sequence of letters listed in stppthmainfv.dll. We generated a roughly 12MB list of passwords to test (much better than the 236GB text file holding all four billion possibilities) with the seed value. We ran a batch process overnight where each password was tested against the RAR utility.
By morning it had finished with exactly one password waiting for us on the screen--the correct password. Our experiment was a success! We had earned the victim's faith in us, and at last, we were ready to get their data back.