Stop! In The Name of Mov

There is a problem with our script as it stands now: sometimes the client process will hang as the function call results in an infinite loop or fails to acquire a spinlock or mutex. This will eventually fill up the available memory with hung processes, and possibly bring down the entire machine. We can mitigate this by allowing the process a certain amount of time to execute, then killing it if the timeout expires. The following updated script accomplishes this:

$functionListFile = "function-list.txt"
 
foreach ($line in Get-Content $functionListFile)
{
    $proc = Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$line" -PassThru -NoNewWindow
 
    $timeout = $null
    $proc | Wait-Process -Timeout 10 -ErrorAction SilentlyContinue -ErrorVariable timeout
 
    if ($timeout) {
        echo "Timeout - killing process"
        $proc | kill
    }
}

There are a few points of interest here. Normally, Start-Process does not return anything. By adding the -PassThru switch, a Process object will be returned, which we store in $proc and can use to query and control the process after it starts (line 5).

Line 8 pipes the stored Process object to Wait-Process, which waits for a process to exit. By specifying the -Timeout option, we instruct the cmdlet to wait for a maximum number of seconds, after which the script will continue executing regardless of whether the client process has exited or not.

If a timeout has occurred, the variable whose name we pass to Wait-Process in the -ErrorVariable parameter will contain an error record; otherwise it will be empty. We check for this on line 10, and kill the process on line 12 in the event of a timeout. This is not a very clean or elegant way of ending a process, but it works for our purposes.
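As an aside, the same wait-then-kill pattern can be sketched without the error variable by calling the Process object's WaitForExit method, which takes a timeout in milliseconds and returns a boolean. This is only a minimal sketch: the function name Invoke-WithTimeout and its parameters are hypothetical and are not part of the scripts in this article.

```powershell
# Hypothetical helper (not from the original scripts): run one command with a time limit.
# Returns $true if the process exited in time, $false if it had to be killed.
function Invoke-WithTimeout {
    param (
        [string] $FilePath,
        [string[]] $ArgumentList,
        [int] $TimeoutSeconds = 10
    )

    $proc = Start-Process -FilePath $FilePath -ArgumentList $ArgumentList -PassThru -NoNewWindow

    # WaitForExit(milliseconds) returns $true if the process exited before the deadline
    if ($proc.WaitForExit($TimeoutSeconds * 1000)) {
        return $true
    }

    # Still running after the deadline: terminate it.
    # Errors are ignored in case the process exits on its own in the meantime.
    $proc | Stop-Process -Force -ErrorAction SilentlyContinue
    return $false
}
```

Inside the foreach loop, the body would then reduce to a single call such as Invoke-WithTimeout -FilePath ./FunctionSearch.exe -ArgumentList $line, with the return value telling us whether the call timed out.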

This solution guarantees that the script won’t get stuck or exhaust the available hardware resources, but it also has the side effect of only allowing one client process to run at a time, since we wait to see whether it times out or not before starting the next one. This is actually good for accurate logging to understand what’s happening on each call, but bad for search efficiency. Now we’re only hitting 44 calls per minute, for a total maximum runtime of 29 hours. Ouch.

50mph is not going to be enough. We need to get all the way to 88mph.

Where we’re going, we don’t need code

Even on a low-end PC, we won’t be coming close to saturating the available compute resources with our current solution. Unless the storage media is a bottleneck, we can improve performance by splitting the work up into threads to increase CPU utilization. Tasks like this, where each test can run independently, are excellent candidates for parallelization, as there are no synchronization considerations to worry about.

In this demo, we’re actually going to spin up a large number of PowerShell instances, so we will be splitting the work across processes rather than threads. Creating a process is – relatively speaking – computationally expensive, but we will only perform this spin-up at the beginning of the search so it’s not a big deal in this case. The plan is to run multiple instances of the script we’ve already created in parallel, in separate processes, and give each instance a different set of addresses to scan.

First, we need to split the function address list up into chunks, one chunk for each host process. If you skipped the previous section because your target code is obfuscated in a way that prevents you from deriving the function addresses, you can split up the entire address range into chunks instead.

We can sit in our text editor and copy and paste all day, or we can write another PowerShell script to do it for us. This is especially useful if we decide to change the chunk size (to have more or fewer processes running at once) or if we need to regenerate the address range from scratch and split it up again.

Here is the script:

param (
    [string] $fileString,
    [int] $LineNumberToSplitOn
)

$file = [System.IO.FileInfo] (Resolve-Path $fileString).Path
$DirectoryContainingFiles = $file.Directory
# Wrap in @() so that a single-line file still yields an array
$FileContent = @(Get-Content $file.FullName)
$LineCount = $FileContent.Count
$TotalNumberOfSplitFiles = [math]::Ceiling($LineCount / $LineNumberToSplitOn)

if ($TotalNumberOfSplitFiles -gt 1) {
    for ($i = 1; $i -le $TotalNumberOfSplitFiles; $i++) {
        # Array indices are zero-based and the range operator includes both ends,
        # so the last index of a chunk is one less than the start of the next chunk
        $StartingLine = $LineNumberToSplitOn * ($i - 1)
        $EndingLine = [math]::Min($LineNumberToSplitOn * $i, $LineCount) - 1

        $FileContent[$StartingLine..$EndingLine] | Out-File "$DirectoryContainingFiles\$($file.BaseName)_Part$i$($file.Extension)"
    }
}

The details are rather boring, but it essentially takes two arguments (the text file and the number of lines per chunk), calculates the number of files needed, then writes them all with “_PartX” appended to the base name.

Usage:

./split-text.ps1 function-list.txt 1500

This will split the list into chunks of 1500 addresses:

Mode                LastWriteTime         Length Name
----                -------------         ------ ----
-a----       24/01/2021     05:43          54038 function-list_Part1.txt
-a----       24/01/2021     05:43          54038 function-list_Part10.txt
-a----       24/01/2021     05:43          54038 function-list_Part11.txt
-a----       24/01/2021     05:43          54038 function-list_Part12.txt
-a----       24/01/2021     05:43          54038 function-list_Part13.txt
-a----       24/01/2021     05:43          54038 function-list_Part14.txt
...
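Before launching a long scan, it can be worth checking that the chunks really do add up to the original list. The following is a minimal sketch of such a check; the function name Test-ChunkCoverage and its parameters are hypothetical, and it assumes the part files follow the _PartN naming shown above.

```powershell
# Hypothetical helper (not from the original scripts): check that the part files,
# read in numeric order, reproduce the original list exactly.
function Test-ChunkCoverage {
    param (
        [string] $OriginalFile,
        [string] $PartPattern
    )

    $original = @(Get-Content $OriginalFile)

    # Sort by the number after "_Part" so that Part10 follows Part9 rather than Part1
    $parts = Get-ChildItem -Path $PartPattern |
        Sort-Object { [int] ($_.BaseName -replace '^.*_Part', '') }

    $combined = @($parts | Get-Content)

    # Joining both lists makes the comparison sensitive to missing,
    # duplicated and reordered lines (-ceq is a case-sensitive comparison)
    return ($original -join "`n") -ceq ($combined -join "`n")
}
```

Calling Test-ChunkCoverage function-list.txt "function-list_Part*.txt" returns $true only if no address was lost, duplicated or reordered during the split.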

We replace the hard-coded $functionListFile in our previous script with a command-line argument so that we can specify a different address range file for each instance:

param ( [string] $functionListFile )

foreach ($line in Get-Content $functionListFile)
{
    $proc = Start-Process -FilePath ./FunctionSearch.exe -ArgumentList "$line" -PassThru -NoNewWindow

    $timeout = $null
    $proc | Wait-Process -Timeout 10 -ErrorAction SilentlyContinue -ErrorVariable timeout

    if ($timeout) {
        echo "Timeout - killing process"
        $proc | kill
    }
}

Finally, we need a startup script to launch an instance of the search script for each of the search ranges (text files):

for ($i = 1; $i -le 52; $i++) {
    Start-Process -FilePath "powershell.exe" -ArgumentList "-Command ./search-script.ps1 function-list_Part$i.txt"
}

(In this example, splitting the list up into 1500-item chunks generated 52 files; change the loop count as appropriate for your application.)
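If you would rather not keep the loop count in sync with the number of chunk files by hand, the launcher can discover the files itself. This is a minimal sketch under a few assumptions: the function name Start-SearchInstances is hypothetical, the chunk files sit in the current directory next to search-script.ps1, and their names contain no spaces.

```powershell
# Hypothetical launcher (not from the original scripts):
# start one search instance per chunk file found on disk.
function Start-SearchInstances {
    param ( [string] $Pattern = "function-list_Part*.txt" )

    foreach ($chunk in Get-ChildItem -Path $Pattern) {
        # Assumes file names without spaces.
        # -PassThru emits the Process object so the caller can count or track instances.
        Start-Process -FilePath "powershell.exe" -ArgumentList "-Command ./search-script.ps1 $($chunk.Name)" -PassThru
    }
}
```

Running $instances = @(Start-SearchInstances) starts every instance and leaves the Process objects in $instances, so $instances.Count tells us how many were launched.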

This script starts one instance for each address range file without waiting for any of them to finish, so they all run simultaneously. This is going to spew a massive number of PowerShell windows onto your screen. Don’t be afraid!

You should also see close to 100% CPU utilization. If you don’t, split the function list into smaller chunks and start more instances. If your PC falls over into a crumpled heap, split the function list into larger chunks and start fewer instances! One instance per CPU hardware thread is a good starting baseline.
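To turn that baseline into a chunk size, we can divide the number of addresses by the number of hardware threads and round up. The sketch below does this; the function name Get-ChunkSize is hypothetical, and [Environment]::ProcessorCount is used on the assumption that it reports the number of logical processors on your machine.

```powershell
# Hypothetical helper (not from the original scripts):
# lines per chunk so that the list splits into at most $Instances files.
function Get-ChunkSize {
    param (
        [int] $LineCount,
        [int] $Instances = [Environment]::ProcessorCount
    )

    # Round up so that the leftover lines do not spill into an extra file
    return [int] [math]::Ceiling([double] $LineCount / $Instances)
}
```

The result can be passed straight to the splitter as its second argument, using the line count of function-list.txt as $LineCount.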

The work will not be distributed 100% evenly because some functions will take longer to execute than others, and there will also be timeouts. Rather elegantly, each PowerShell window will close once its search range completes, so you can just keep an eye on this to track the progress.

You will likely see various error dialogs as the scan runs.

Don’t worry about these. There is no need to close them: our script will kill the process after 10 seconds anyway.

Watching the scan can be highly informative about other functions in the application. Here is a sample of messages that came up during the run for our example DLL:

Trying 00c20d90
open_socket:  't get ©4ùd▓ host entry
vrpn_connect_udp_port: error finding host by name (¥.é┬ëïR╠└ß=═þ─Æjô┌ÀdkÅ┼mÆX(\ :ö╣D╣ë4).
vrpn_udp_request_lob_packet: send() failed: No error

Trying 00c96d40
SetPauseAnimatorBelowLodLevel receive a level 38512456 which is more than 255

(Filename: Line: 770)