In League with The /dev/null
We’ve set up our scanner nicely, but so far we haven’t actually done anything with the results. If the function you’re looking
for returns an int or some other simple type, we’ve just discarded it.
If you are outputting another file, you probably ran out of disk space:
The demo scan above generated 2,292 files @ 32.3MB each, for a cool 73GB of consumed disk space. Yikes.
Let’s add some checks and filters to the client process so that it can test if the return value contains or at least resembles the expected results.
The code to implement this is completely dependent on the target function at hand,
but I will demonstrate my approach for this encrypted file with some general rules of thumb.
If you are dealing with a pointer to a returned block of memory as in this case, there are some things to bear in mind:
- some functions will initialize a block of memory with all zeroes. Of the files generated by our scan, quite a lot of them were just all-zero, or had large all-zero areas
- some functions will copy the input, ignore the input arguments (because the real function being called doesn’t take any arguments) or otherwise have no noticeable effect, so the output will either be the same as the input or garbage
- some functions will initialize a fixed amount of memory, meaning the start of the return data will be zeroes or meaningful data, and the rest will be garbage
The upshot of all this is that if you’re working with a large data set – as we are here with a 32MB byte array –
you will want to avoid writing filters that examine the beginning of the data where possible. This is the
area most likely to trigger false positives when you write code that automatically analyzes it for validity.
That being said, you also need to be very careful not to be too aggressive with your filtering, otherwise you may accidentally
filter out the correct function – and indeed this happened to me as I was producing this example, since an area
I assumed would be zeroes in the decrypted data actually wasn’t. It is generally better to look through 100
results by hand than to narrow them down to 10, realize none of them are correct and have to run the entire scan
again with looser filter criteria.
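Before committing to a strict filter, it can help to survey what a candidate buffer actually looks like. The sketch below is not part of the scanner from this article: ZeroFraction, the window positions and the stand-in buffer are all hypothetical, chosen only to illustrate the rules of thumb above. It measures what fraction of a window consists of zero bytes, so a window at the start of the data can be compared with one further in.

```csharp
using System;
using System.Linq;

// Hypothetical helper: fraction of zero bytes in data[offset .. offset + length).
// Skip/Take clamp the window to the end of the array, so a short buffer does not throw.
double ZeroFraction(byte[] data, int offset, int length)
{
    var window = data.Skip(offset).Take(length).ToArray();
    return window.Length == 0 ? 1.0 : window.Count(b => b == 0) / (double)window.Length;
}

// Stand-in for a returned buffer (assumption): a non-zero header followed by a zeroed tail
var buffer = new byte[0x1000];
for (var i = 0; i < 0x100; i++)
    buffer[i] = (byte)(i | 1);   // i | 1 is never zero

Console.WriteLine($"start window: {ZeroFraction(buffer, 0, 0x100)}");
Console.WriteLine($"later window: {ZeroFraction(buffer, 0x800, 0x100)}");
```

A candidate whose later windows are almost entirely zero is probably not worth saving, whatever its first few bytes look like. Treat the numbers as a hint for choosing test regions rather than as a filter in their own right.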
For illustration, I’ll describe how I filtered the results for this exercise. I started by identifying a
region of the file that should definitely change once it had been decrypted, by simply scrolling through it with a hex editor:
000C02D0 16 00 00 00 5B 27 02 00 10 00 00 00 6B 27 02 00
000C02E0 11 00 00 00 7C 27 02 00 0E 00 00 00 8A 27 02 00
000C02F0 12 00 00 00 9C 27 02 00 0E 00 00 00 AA 27 02 00
000C0300 23 20 9F 95 0C C1 C0 64 CC C9 FD 26 AA 92 3E 05
000C0310 F4 7B 7A 48 9B 54 23 97 55 57 2F D2 BF 65 7A 00
000C0320 38 D9 7E 08 03 B9 7E D5 D8 79 6C ED DD EC DF BD
000C0330 52 3F 6A E3 64 84 1A 9C B4 85 ED C0 1D 8A 62 BA
000C0340 12 00 00 00 19 28 02 00 13 00 00 00 2C 28 02 00
000C0350 15 00 00 00 41 28 02 00 15 00 00 00 56 28 02 00
000C0360 16 00 00 00 6C 28 02 00 0E 00 00 00 7A 28 02 00
As mentioned earlier, the file has periodic blocks of 0x40 encrypted bytes. The region 0xC0300-0xC033F above seems to be encrypted,
and should change if we can force it to be decrypted. We select this region as the
first test area and store the original bytes for this region in an array:
var testArea = 0xC0300;
var testLength = 0x40;
var originalEncryptedBytes = encryptedData.Skip(testArea).Take(testLength).ToArray();
Right before the code which saves the file, we insert a test:
// Check known encrypted area
if (decryptedData.Skip(testArea).Take(testLength).SequenceEqual(originalEncryptedBytes)) {
    Console.WriteLine("No change to known encrypted area");
    return;
}
This simply compares the original data with the data copied from the result pointer and exits the process if they are the same
(the return statement causes the finally block to execute and free the DLL, then returns control to the PowerShell host script).
As we just discussed, it’s possible we got back a big ol’ empty block of zeroes, so we also filter out these results:
if (decryptedData.Skip(testArea).Take(testLength).Max() == 0) {
    Console.WriteLine("Encrypted area is now all zeroes");
    return;
}
If the maximum byte value of the region is zero, then all the bytes are zero (you can also use .All(x => x == 0) if you prefer).
This deals with the no-change and all-zero scenarios, but it’s also possible we get junk back. Here we do the opposite:
take a region near the start of the file which should not be changed (in this case the first 0x40 bytes are encrypted and
the next 0x40 bytes are unencrypted), and test it to make sure that it has not changed:
var originalUnencryptedBytes = encryptedData.Skip(testLength).Take(testLength).ToArray();
// ...
if (!decryptedData.Skip(testLength).Take(testLength).SequenceEqual(originalUnencryptedBytes)) {
    Console.WriteLine("Area changed that should be left the same");
    return;
}
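If you would rather keep the three checks in one place, they can be folded into a single function that reports why a candidate was rejected. This is only a sketch of one possible arrangement, not the code used for the scan: ShouldSave and its parameters are hypothetical names, and the tiny synthetic buffers (with made-up offsets 0x40 and 0x80) stand in for the real 32MB arrays so the example runs on its own.

```csharp
using System;
using System.Linq;

// Hypothetical helper: applies the three filters in order and reports the first one that rejects
bool ShouldSave(byte[] original, byte[] result, int encryptedOffset, int plainOffset, int length, out string reason)
{
    var encryptedBefore = original.Skip(encryptedOffset).Take(length);
    var encryptedAfter = result.Skip(encryptedOffset).Take(length).ToArray();

    reason = "";
    if (encryptedAfter.SequenceEqual(encryptedBefore))
        reason = "No change to known encrypted area";
    else if (encryptedAfter.All(b => b == 0))
        reason = "Encrypted area is now all zeroes";
    // A region that was never encrypted has to come back untouched
    else if (!result.Skip(plainOffset).Take(length).SequenceEqual(original.Skip(plainOffset).Take(length)))
        reason = "Area changed that should be left the same";

    return reason.Length == 0;
}

// Synthetic 0x100-byte stand-ins (assumption): plain area at 0x40, encrypted area at 0x80
var encrypted = Enumerable.Repeat((byte)0xAA, 0x100).ToArray();
var plausible = (byte[])encrypted.Clone();
for (var i = 0x80; i < 0xC0; i++)
    plausible[i] = 0x55;         // only the encrypted area changes

foreach (var candidate in new[] { encrypted, new byte[0x100], plausible })
    Console.WriteLine(ShouldSave(encrypted, candidate, 0x80, 0x40, 0x40, out var reason) ? "Saving..." : reason);
```

Keeping the rejection reason as a string means the scan log stays the same as with the inline checks, and loosening a filter later only touches one function.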
When we make these changes and re-run the scan using just the functions that generated file output last time,
the overall output resembles this snippet:
...
Trying 0016ed40
No change to known encrypted area
Trying 0016ed90
No change to known encrypted area
Trying 0016f470
Area changed that should be left the same
Trying 0016f4c0
No change to known encrypted area
Trying 0016f5c0
Encrypted area is now all zeroes
Trying 0016f6a0
Encrypted area is now all zeroes
Trying 0016f810
No change to known encrypted area
Trying 001a52e0
No change to known encrypted area
Trying 001a7010
Saving...
Trying 001bce80
No change to known encrypted area
Trying 001c1860
No change to known encrypted area
Trying 001d6550
...
Tip: If you want to tighten the filter then re-run the scan using only the subset of candidate
addresses which already produced file outputs in the last run, you can generate
a new address range list based on the generated files like this:
ls possibly-decrypted-*.dat | % { echo ([int] ("0x" + $_.Name.Substring(19,8)) + 0x180000000).ToString('X16') } > function-list-new.txt
This enumerates all of the generated files, extracts only the offset portion of each filename,
converts the hex strings to integers, adds the image preferred base address to each, converts
them back into strings and outputs the new set of addresses as a list to the specified file.
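If the one-liner is hard to read, the same conversion can be spelled out step by step. The following is a rough C# equivalent written for illustration only: ToAddress is a hypothetical helper, and it assumes the filenames follow the possibly-decrypted-XXXXXXXX.dat pattern and that the preferred base is 0x180000000, as in the PowerShell above.

```csharp
using System;
using System.Globalization;
using System.IO;

const long preferredBase = 0x180000000;        // image preferred base address, as in the one-liner
const string prefix = "possibly-decrypted-";   // 19 characters, hence Substring(19,8) in PowerShell

// Hypothetical helper: extract the 8 hex digits after the prefix, add the base, format as 16 hex digits
string ToAddress(string fileName)
{
    var offsetText = fileName.Substring(prefix.Length, 8);
    var offset = long.Parse(offsetText, NumberStyles.HexNumber, CultureInfo.InvariantCulture);
    return (offset + preferredBase).ToString("X16");
}

Console.WriteLine(ToAddress("possibly-decrypted-001a7010.dat"));   // prints 00000001801A7010

// Convert every output file in the current directory; redirect stdout to build the new list
foreach (var path in Directory.EnumerateFiles(".", prefix + "*.dat"))
    Console.WriteLine(ToAddress(Path.GetFileName(path)));
```

The first line of output is the worked example; drop it before redirecting the output into function-list-new.txt.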
How many output files do we get now?
possibly-decrypted-001a7010.dat
One.
Let’s take a look inside at the same region as before:
000C02D0 16 00 00 00 5B 27 02 00 10 00 00 00 6B 27 02 00
000C02E0 11 00 00 00 7C 27 02 00 0E 00 00 00 8A 27 02 00
000C02F0 12 00 00 00 9C 27 02 00 0E 00 00 00 AA 27 02 00
000C0300 A5 2F 42 30 D5 23 B2 9C 94 2A C0 BA B5 98 A7 68
000C0310 0C 00 00 00 C7 27 02 00 0D 00 00 00 D4 27 02 00
000C0320 10 00 00 00 E4 27 02 00 0D 00 00 00 F1 27 02 00
000C0330 09 00 00 00 FA 27 02 00 0D 00 00 00 07 28 02 00
000C0340 12 00 00 00 19 28 02 00 13 00 00 00 2C 28 02 00
000C0350 15 00 00 00 41 28 02 00 15 00 00 00 56 28 02 00
000C0360 16 00 00 00 6C 28 02 00 0E 00 00 00 7A 28 02 00
Smokin’! We decrypted the data – or at least most of it. The 16 bytes at 0xC0300 are still messed up,
but that is a complication unique to Genshin Impact; the main point is we have proven that the correct function has been found –
and that is how you brute-force a function address with PowerShell. Nicely done!