Stop! In The Name of Mov
There is a problem with our script as it stands: sometimes the client process will hang because the function call enters an infinite loop or blocks forever trying to acquire a spinlock or mutex. Hung processes will eventually fill the available memory and possibly bring down the entire machine. We can mitigate this by allowing each process a certain amount of time to execute, then killing it if the timeout expires. The following updated script accomplishes this:
$functionListFile = "function-list.txt"

foreach ($line in Get-Content $functionListFile)
{
    # -PassThru makes Start-Process return a Process object for the new client
    $proc = Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$line" -PassThru -NoNewWindow

    # Wait up to 10 seconds; on timeout, $timeout is populated via -ErrorVariable
    $timeout = $null
    $proc | Wait-Process -Timeout 10 -ErrorAction SilentlyContinue -ErrorVariable timeout

    if ($timeout) {
        Write-Host "Timeout - killing process"
        $proc | Stop-Process
    }
}
There are a few points of interest here. Normally, Start-Process does not return anything. By adding the -PassThru switch,
it returns a Process object, which we store in $proc and can use to query and control the process after it starts.
We then pipe the stored Process object to Wait-Process, which waits for a process to exit. By specifying the -Timeout option,
we instruct the cmdlet to wait for a maximum number of seconds, after which the script continues
executing regardless of whether the client process has exited or not.
If a timeout occurs, the variable whose name we pass to Wait-Process in the -ErrorVariable parameter will be non-null.
We check for this after the wait, and kill the process in the event of a timeout.
This is not a very clean or elegant way of ending a process, but it works for our purposes.
This solution guarantees that the script won’t get stuck or exhaust the available hardware resources,
but it also has the side effect of only allowing one client process to run at a time, since we wait to
see whether it times out or not before starting the next one. This is actually good for accurate logging
to understand what’s happening on each call, but bad for search efficiency. Now we’re only hitting 44 calls per minute,
for a total maximum runtime of 29 hours. Ouch.
50mph is not going to be enough. We need to get all the way to 88mph.
Where we’re going, we don’t need code
Even on a low-end PC, we won’t be coming close to saturating the available compute resources with our current solution.
Unless the storage media is a bottleneck, we can improve performance by splitting the work up into threads to increase CPU utilization.
Tasks like this, where each test can run independently, are excellent candidates for parallelization,
as there are no synchronization considerations to worry about.
In this demo, we’re actually going to spin up a large number of PowerShell instances,
so we will be splitting the work across processes rather than threads. Creating a process is –
relatively speaking – computationally expensive, but we will only perform this spin-up at the beginning of the search so
it’s not a big deal in this case. The plan is to run multiple instances of the script we’ve already created in parallel,
in separate processes, and give each instance a different set of addresses to scan.
First, we need to split the function address list up into chunks, one chunk for each host process.
If you skipped the previous section because your target code is obfuscated in a way that prevents
you from deriving the function addresses, you can split up the entire address range into chunks instead.
Tip: If you don’t have a list of function address candidates and need to scan every address,
just use the range in the .text section and any other code/executable sections. The .data and related sections are unlikely to contain code.
In the event you don’t find anything after scanning the code address ranges, you can try to search the data address ranges as a last resort.
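Generating such a list is a one-liner's worth of work. Here is a minimal sketch; the range bounds and the 16-byte step are placeholder assumptions, so substitute the real .text bounds from your disassembler or the PE headers, and step by 1 if you can't assume any function alignment. The 8-digit lowercase hex format matches the addresses our tool prints during the scan.
$start = 0x00C00000   # placeholder: start of the .text section
$end   = 0x00D00000   # placeholder: end of the .text section
$step  = 0x10         # assumes 16-byte function alignment; use 1 to be exhaustive

# Emit every candidate address in the range as 8-digit lowercase hex, one per line
$lines = for ($addr = $start; $addr -lt $end; $addr += $step) {
    '{0:x8}' -f $addr
}
$lines | Set-Content function-list.txt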
We can sit in our text editor and copy-paste all day, or we can write another PowerShell script to do it for us.
This is especially useful if we decide to change the chunk size (to have more or fewer processes running at once) or
if we need to regenerate the address range from scratch and split it up again.
Here is the script:
param (
    [string] $fileString,
    [int] $LineNumberToSplitOn
)

$file = [System.IO.FileInfo] (Resolve-Path $fileString).Path
$DirectoryContainingFiles = $file.Directory
$FileContent = Get-Content $file.FullName
$LineCount = $FileContent.Count
$TotalNumberOfSplitFiles = [math]::Ceiling($LineCount / $LineNumberToSplitOn)

if ($TotalNumberOfSplitFiles -gt 1) {
    for ($i = 1; $i -le $TotalNumberOfSplitFiles; $i++) {
        # Zero-based index range for this chunk, clamped to the end of the file
        $StartingLine = $LineNumberToSplitOn * ($i - 1)
        $EndingLine = [math]::Min($LineNumberToSplitOn * $i, $LineCount) - 1
        $FileContent[$StartingLine..$EndingLine] |
            Out-File "$DirectoryContainingFiles\$($file.BaseName)_Part$i$($file.Extension)"
    }
}
The details are rather boring, but it essentially takes two arguments (the text file and the number of lines per chunk),
calculates the number of files needed, then writes them all with “_PartX” appended to the file name.
Usage:
./split-text.ps1 function-list.txt 1500
This will split the list into chunks of 1500 addresses:
Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a----        24/01/2021     05:43          54038 function-list_Part1.txt
-a----        24/01/2021     05:43          54038 function-list_Part10.txt
-a----        24/01/2021     05:43          54038 function-list_Part11.txt
-a----        24/01/2021     05:43          54038 function-list_Part12.txt
-a----        24/01/2021     05:43          54038 function-list_Part13.txt
-a----        24/01/2021     05:43          54038 function-list_Part14.txt
...
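If you want to double-check that no addresses were lost in the split, a quick sanity check (assuming the file names shown above) is to compare line counts:
# Total lines across the chunks should equal the original list
$original = (Get-Content function-list.txt).Count
$split    = (Get-ChildItem function-list_Part*.txt | Get-Content).Count
Write-Host "$split of $original addresses accounted for"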
We replace the hard-coded $functionListFile in our previous script with a command-line argument
so that we can specify a different address range file for each instance:
param (
    [string] $functionListFile
)

foreach ($line in Get-Content $functionListFile)
{
    $proc = Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$line" -PassThru -NoNewWindow

    $timeout = $null
    $proc | Wait-Process -Timeout 10 -ErrorAction SilentlyContinue -ErrorVariable timeout

    if ($timeout) {
        Write-Host "Timeout - killing process"
        $proc | Stop-Process
    }
}
Finally, we need a startup script to launch an instance of the search script for each of the search ranges (text files):
for ($i = 1; $i -le 52; $i++) {
    Start-Process -FilePath "powershell.exe" -ArgumentList "-Command ./search-script.ps1 function-list_Part$i.txt"
}
(in this example, splitting the list up into 1500-item chunks generated 52 files – change as appropriate for your application)
This script starts one instance for each address range file without waiting for any of them to finish, so they all run simultaneously.
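If you'd rather not hard-code the instance count, a small variation derives it from the generated chunk files instead (this assumes the _Part naming used by the split script above):
# Launch one search instance per chunk file, however many there are
foreach ($chunk in Get-ChildItem function-list_Part*.txt) {
    Start-Process -FilePath "powershell.exe" -ArgumentList "-Command ./search-script.ps1 $($chunk.Name)"
}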
This is going to spew a massive number of PowerShell windows onto your screen. Don’t be afraid!
You should also see close to 100% CPU utilization. If you don’t, split the function list into smaller chunks and start more instances.
If your PC falls over into a crumpled heap,
split the function list into larger chunks and start fewer instances! One instance per CPU hardware thread is a good starting baseline.
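If you're not sure how many hardware threads your machine has, PowerShell can tell you:
# Logical processor (hardware thread) count
[Environment]::ProcessorCount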
The work will not be distributed 100% evenly because some functions will take longer to execute than others,
and there will also be timeouts. Rather elegantly, each PowerShell window will close once its search range completes,
so you can just keep an eye on this to track the progress.
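If you'd rather not count windows by eye, you can poll for remaining search instances from another console. Here is a rough sketch using Get-CimInstance; matching on the script name in the command line is an assumption based on the file names used here:
# Report how many search instances are still alive every 30 seconds
while ($true) {
    $running = @(Get-CimInstance Win32_Process -Filter "Name = 'powershell.exe'" |
        Where-Object { $_.CommandLine -match 'search-script\.ps1' }).Count
    Write-Host "$(Get-Date -Format T): $running instances still running"
    Start-Sleep -Seconds 30
}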
You will likely see various error dialogs as the scan runs.
Don’t worry about these. There is no need to close them: our script will kill the process after 10 seconds anyway.
Watching the scan can be highly informative about other functions in the application. Here is a sample of messages that came up during the run for our example DLL:
Trying 00c20d90
open_socket: 't get ©4ùd▓ host entry
vrpn_connect_udp_port: error finding host by name (¥.é┬ëïR╠└ß=═þ─Æjô┌ÀdkÅ┼mÆX(\ :ö╣D╣ë4).
vrpn_udp_request_lob_packet: send() failed: No error
Trying 00c96d40
SetPauseAnimatorBelowLodLevel receive a level 38512456 which is more than 255
(Filename: Line: 770)
Note: This can be improved quite a bit further.
You can create a pool of client processes per host process without waiting for them to finish,
then each time a process in the pool completes or times out, create a new client process with the next address to test.
This allows the host script to cycle through other client processes while one of them is hung (see the sketch below).
You can make the startup script count the number of files matching a filter like function-list_Part*.txt so that the loop counter’s upper bound is not hard-coded, as sketched after the startup script above.
You can alter the text file split script to take the number of chunks you want instead of the number of lines to split on.
You can also dispense with splitting the file altogether: alter the host script to take a starting line number and a count as input,
and change the startup script to use a numeric for loop with the count per host script as the stepping factor. This is arguably a cleaner solution,
but I didn’t have time to rework the script for this article.
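Here is a rough sketch of the pool idea, in case you want to try it. The 10-second timeout matches the earlier script; the pool size default is an assumption to tune per machine, and a final drain of the last few processes is omitted for brevity:
param (
    [string] $functionListFile,
    [int] $PoolSize = 8   # assumed default - tune to your machine
)

$pool = @{}   # maps each running Process object to its start time

foreach ($line in Get-Content $functionListFile) {
    # While the pool is full, reap finished clients and kill hung ones
    while ($pool.Count -ge $PoolSize) {
        foreach ($p in @($pool.Keys)) {
            if ($p.HasExited) {
                $pool.Remove($p)
            }
            elseif (((Get-Date) - $pool[$p]).TotalSeconds -gt 10) {
                Write-Host "Timeout - killing process"
                $p | Stop-Process -Force -ErrorAction SilentlyContinue
                $pool.Remove($p)
            }
        }
        Start-Sleep -Milliseconds 200
    }
    $proc = Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$line" -PassThru -NoNewWindow
    $pool[$proc] = Get-Date
}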