Host in the shell
We can now construct a very simple PowerShell script to call this process in a loop with every possible address, as follows:
for ($a = 0x180000000; $a -le 0x1FFFFFFFF; $a++)
{
    $aHex = $a.ToString('x')
    Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$aHex" -NoNewWindow
}
The first line of the loop converts the loop value into a hexadecimal string. The second line creates a new process for our C# code,
passing in the address argument. The -NoNewWindow switch runs each new process in the existing PowerShell console, so its output appears there, rather than opening a new window for each one.
Because Start-Process is called without -Wait, the script does not wait for a process to end before starting the next one, so it will essentially launch them as fast as your hardware allows.
The output looks like this:
Trying 00000000
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Trying 00000001
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Trying 00000002
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Trying 00000003
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Trying 00000004
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
The client process crashes most of the time, but the host script continues to start new processes with new address arguments without any problems.
The Need For Speed
The image file I’m attacking here is 32.3MB – 0x205B870 possible addresses. Execution time per call will vary,
but on my 24-core machine – and with the files stored on an SSD – this script iterates through about 0x200 (512)
functions per minute and adds 7% to the CPU consumption. That means it will take 46 days to test every address –
just a whisker shy of how long it takes to count votes in Pennsylvania. Statistics dictate that on average you’ll
have to test half of the addresses in any given image to find the correct one, so we’re looking at 23 days on average
per similar DLL we have to iterate through. It’s fair to say we are travelling at less than 50mph.
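The estimate above is simple enough to recheck by hand, but here is a minimal sketch of the arithmetic, written in Python purely for convenience. The function name sweep_days is my own invention for illustration, and it assumes the rate stays constant for the whole run, which it won't in practice.

```python
# Figures taken from the text; the constant names are illustrative only.
IMAGE_SIZE = 0x205B870      # number of candidate addresses in the image
RATE_PER_MINUTE = 0x200     # roughly 512 calls per minute on my machine


def sweep_days(candidates, per_minute):
    """Days needed to try every candidate, assuming a constant call rate."""
    return candidates / per_minute / 60 / 24


full = sweep_days(IMAGE_SIZE, RATE_PER_MINUTE)
print(f"{IMAGE_SIZE:,} addresses to test")
print(f"full sweep: {full:.1f} days, expected (half the space): {full / 2:.1f} days")
```

Plugging in a different image size or measured rate gives a quick feel for whether a brute-force sweep is even worth starting.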
We can improve the search efficiency by pruning the search space.
We know that the majority of addresses are mid-function or mid-instruction,
and a good portion will not even be in the code section(s) of the image.
If IDA or your disassembler of choice is able to find the start of every function
(assuming the assembly-level obfuscation is not to a level that completely destroys its ability to
sweep through the code – if it is you’re out of luck and will have to skip to the next section),
we can dramatically reduce the number of addresses we have to test.
The plan is to build a text file containing a list of potential addresses to scan,
and modify our script to read lines from this file instead of using a numeric for loop.
Here is how to do this with IDA, but you can achieve similar results with other disassemblers:
- Load the image into IDA
- Wait for a really long time. Hopefully not longer than 46 days, but you never know with IDA. IDA will populate the Functions window (Shift+F3)
with most of the functions it finds within a few minutes, but for an absolutely complete list, you may need to wait several hours
- Right-click in the Functions window and choose Copy all, or just press Ctrl+Shift+Ins (because IDA loves keyboard shortcuts that make sense)
- Create a new document in a text editor that supports regular expression replacement and paste in the function list
(Visual Studio Code: Ctrl+N, Ctrl+V). You will now have tab-separated text that looks like this:
Function name Segment Start Length Locals Arguments R F L S B T =
sub_180001000 .text 0000000180001000 00000003 00000000 00000000 R . . . . . .
sub_180001020 .text 0000000180001020 00000004 00000000 00000000 R . . . . . .
sub_180001030 .text 0000000180001030 000000BB 00000018 00000038 R . . . B . .
sub_1800010EB .text 00000001800010EB 000000A9 00000018 00000040 R . . . B . .
sub_1800012CC .text 00000001800012CC 00000138 00000018 0000002C R . . . B . .
sub_180001404 .text 0000000180001404 000000D9 00000018 0000002C R . . . B . .
- Delete the first line
- Replace all using the regex search term ^(.*?)\t(.*?)\t([0-9A-Fa-f]{16}).*$
and the replacement term $3. This regex captures the 16-digit start address as the third capture group
and discards everything else on the line (Visual Studio Code: Ctrl+H, Alt+R, enter regex,
Tab, enter replacement, Ctrl+Alt+Enter). The file will now look like this:
0000000180001000
0000000180001020
0000000180001030
00000001800010EB
00000001800012CC
0000000180001404
- Save it as function-list.txt in the same folder as the script
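If you would rather not do the clean-up by hand in an editor, the same extraction can be scripted. This is only a sketch, in Python, under the assumption that the copied Functions window text is tab-separated as the regex above expects. The helper name extract_addresses and the sample rows are mine, for illustration.

```python
import re

# Same pattern as the editor search term: name, segment, then a 16-digit hex start address.
ROW = re.compile(r"^(.*?)\t(.*?)\t([0-9A-Fa-f]{16}).*$")


def extract_addresses(lines):
    """Return the start-address column from tab-separated Functions window rows."""
    addresses = []
    for line in lines:
        match = ROW.match(line.rstrip("\r\n"))
        if match:  # the header row and blank lines do not match, so they are skipped
            addresses.append(match.group(3))
    return addresses


# A few rows in the same shape as the pasted text (tabs between columns).
sample = [
    "Function name\tSegment\tStart\tLength\tLocals\tArguments\tR\tF\tL\tS\tB\tT\t=",
    "sub_180001000\t.text\t0000000180001000\t00000003\t00000000\t00000000\tR\t.\t.\t.\t.\t.\t.",
    "sub_1800010EB\t.text\t00000001800010EB\t000000A9\t00000018\t00000040\tR\t.\t.\t.\tB\t.\t.",
]
print("\n".join(extract_addresses(sample)))
```

Because the header row never matches the pattern, there is no need to delete it first. Writing the returned list to function-list.txt, one address per line, gives the same file as the manual steps.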
We will now modify the script we created above to read in each line of the file sequentially and pass it as an argument to the client process:
$functionListFile = "function-list.txt"
foreach ($line in Get-Content $functionListFile)
{
    Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$line" -NoNewWindow
}
The output now looks like this:
Trying 00001030
Trying 00001020
Trying 000010eb
Trying 00001000
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Trying 0000167e
Trying 000012cc
Trying 00001404
Trying 000014e0
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Exception thrown - AccessViolationException - Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
Note that output and error messages are displayed out of order, since multiple client processes can be running
simultaneously and at different stages of execution, and all of them share the same console.
For this example, this narrows the search space from almost 34 million addresses to a scant 77,033,
a handy reduction by a factor of about 440. Execution time per call goes up, however, because more of the candidates are now valid code that actually executes.
This time my machine chewed through 350 per minute for an estimated total runtime of 220 minutes, or about 3.7 hours: not great, not terrible.
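For anyone who wants to rerun the numbers for their own image, here is a rough sketch of the before-and-after comparison in Python. The constant names are mine and the inputs are just the figures quoted in this post. Your function count and call rate will differ.

```python
# Inputs quoted in the text; names are illustrative only.
FULL_SPACE = 0x205B870    # every address in the image
PRUNED_SPACE = 77_033     # function start addresses found by the disassembler
PRUNED_RATE = 350         # calls per minute measured on the pruned list

reduction = FULL_SPACE / PRUNED_SPACE   # how much smaller the search space became
minutes = PRUNED_SPACE / PRUNED_RATE    # worst-case runtime at a constant rate

print(f"search space reduced by a factor of about {reduction:.0f}")
print(f"worst case: {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
```

As before, the average case is about half the worst case, since on average the target turns up halfway through the list.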