Flipping the coin
It takes two to tango. Let’s load up UnityPlayer.dll and find out where it calls this silly function.
Needless to say, the DLL doesn’t conveniently define this as an import, but the function name must be referenced somewhere, so we perform a string search for il2cpp_thread_get_name. This is a few clicks of work, and IDA has already labelled the string address as aIl2cppThreadGe for us (the a prefix stands for ASCII).
There is only one reference to this:
v179 = 0i64;
v181 = 0i64;
v182 = 68;
LOBYTE(v180) = 0;
sub_7FFF4E399380(&v179, "il2cpp_thread_get_name", 0x16ui64);
v152 = sub_7FFF4E7B30C0(qword_7FFF4F629F20, &v179, 0);
qword_7FFF4F629E48 = v152;
if ( v179 && v180 > 0 )
{
sub_7FFF4E51A4C0(v179, v182);
v152 = qword_7FFF4F629E48;
}
if ( !v152 )
{
v2 = 0;
sub_7FFF4E6E0A50("il2cpp: function il2cpp_thread_get_name not found\n");
}
This is part of a much longer function in which the same pattern repeats dozens of times, once for each IL2CPP API. You don’t really need to screw around with all these function calls to get the gist of what is going on here: the string name is copied to v179 (line 5), the symbol is resolved to a code pointer and stored in v152 (line 6), and that pointer is then stored at qword_7FFF4F629E48 (line 7). We infer the meaning of v152 by looking at lines 13-17, which report an error if v152 is zero (null). We infer the meaning of v179 by noting that the function in line 5 receives it by reference, together with the string literal pointer and the length of the literal (0x16, or 22 characters), meaning it is probably a memcpy-type function; the result is then passed, again by reference, to the symbol resolver function in line 6.
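The repeated pattern can be summarised in a short C# sketch. This is not the code from UnityPlayer.dll: ApiResolver, ResolveAll and the lookup delegate are hypothetical names, and a dictionary stands in for the real symbol resolver (sub_7FFF4E7B30C0), whose internals we have not examined. It only illustrates the resolve, store and check steps performed for each API name, on the assumption that the v2 = 0 assignment is a failure flag.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of the "resolve, store, check" pattern seen in the
// decompilation. None of these names exist in UnityPlayer.dll.
public static class ApiResolver
{
    // Resolves each name via 'lookup' and stores the resulting pointer in 'table'.
    // Returns false if any name could not be resolved, which is what the
    // v2 = 0 assignment in the decompilation appears to signal (an assumption).
    public static bool ResolveAll(
        IEnumerable<string> names,
        Func<string, IntPtr> lookup,
        IDictionary<string, IntPtr> table)
    {
        var ok = true;
        foreach (var name in names)
        {
            var pointer = lookup(name);   // symbol name -> code pointer (v152)
            table[name] = pointer;        // stored globally, even if it is null
            if (pointer == IntPtr.Zero)
            {
                ok = false;
                Console.WriteLine($"il2cpp: function {name} not found");
            }
        }
        return ok;
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in export table with a made-up address; a real resolver would
        // query the exports of the loaded IL2CPP binary instead.
        var exports = new Dictionary<string, IntPtr>
        {
            ["il2cpp_thread_get_name"] = new IntPtr(0x1000),
        };
        Func<string, IntPtr> lookup =
            name => exports.TryGetValue(name, out var p) ? p : IntPtr.Zero;

        var table = new Dictionary<string, IntPtr>();
        var ok = ApiResolver.ResolveAll(
            new[] { "il2cpp_thread_get_name", "il2cpp_missing_example" },
            lookup,
            table);
        Console.WriteLine(ok);
    }
}
```

The second name is deliberately absent from the stand-in table, so the run exercises the "not found" branch as well as the success path.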
We rename qword_7FFF4F629E48 to pil2cpp_thread_get_name and search for references to it. This time we are looking for where it is called. Again, there is only one location:
char __fastcall sub_7FFF4E816170(__int64 a1, __int64 a2, unsigned int a3, __int64 a4)
{
v10 = 99999979;
v7 = sub_7FFF4EB67110;
v8 = sub_7FFF4EB67130;
v4 = a4;
v9 = sub_7FFF4EB67120;
v5 = a3;
pil2cpp_thread_get_name(&v7, &v10);
sub_7FFF4E7E3A30();
qword_7FFF4F629E00(0i64);
qword_7FFF4F6299D0(v5, v4, 0i64);
qword_7FFF4F6299B8();
qword_7FFF4F6299C0();
qword_7FFF4F6299A0("IL2CPP Root Domain");
qword_7FFF4F6299E8("unused_application_configuration");
sub_7FFF4E81A540();
return 1;
}
We have a little giggle as we note that v10 is set to 99999979 to “authenticate” the call to il2cpp_thread_get_name, but it’s not actually clear from the decompilation how the three function pointers at v7, v8 and v9 are ordered in memory, or even whether anything besides v7 is passed to the export, so we switch to the disassembly view:
.text:00007FFF4E816185 lea rax, sub_7FFF4EB67110
.text:00007FFF4E81618C mov [rsp+48h+arg_0], 5F5E0EBh
.text:00007FFF4E816194 mov [r11-28h], rax
.text:00007FFF4E816198 mov rsi, rdx
.text:00007FFF4E81619B lea rax, sub_7FFF4EB67130
.text:00007FFF4E8161A2 mov r14, rcx
.text:00007FFF4E8161A5 mov [r11-20h], rax
.text:00007FFF4E8161A9 lea rdx, [r11+8] ; _QWORD
.text:00007FFF4E8161AD lea rax, sub_7FFF4EB67120
.text:00007FFF4E8161B4 mov rbx, r9
.text:00007FFF4E8161B7 lea rcx, [r11-28h] ; _QWORD
.text:00007FFF4E8161BB mov [r11-18h], rax
.text:00007FFF4E8161BF mov edi, r8d
.text:00007FFF4E8161C2 call cs:pil2cpp_thread_get_name
The x64 calling convention dictates that the first argument is passed in rcx, which we can see in line 11 is set to the address r11-28h. Lines 3, 7 and 12 store the three function pointers loaded in lines 1, 5 and 9 at r11-28h, r11-20h and r11-18h respectively. Each pointer is 64 bits (8 bytes) wide and the three slots are 8 bytes apart, so together they fill 24 bytes of consecutive pointer data, as expected.
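As a sanity check on this reasoning, the three stores can be modelled as a sequential struct. The sketch below is only an illustration: the names CallbackTable, LayoutCheck and OffsetFromRcx are invented here, and ulong is used in place of a real function pointer type so that the result does not depend on the bitness of the process running the check.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical model of the block at r11-28h; names are invented for illustration.
[StructLayout(LayoutKind.Sequential)]
public struct CallbackTable
{
    public ulong First;   // stored at r11-28h
    public ulong Second;  // stored at r11-20h
    public ulong Third;   // stored at r11-18h
}

public static class LayoutCheck
{
    // Offset of a stack slot relative to the pointer passed in rcx (r11-28h).
    public static int OffsetFromRcx(int displacementFromR11) =>
        displacementFromR11 - (-0x28);

    public static void Main()
    {
        Console.WriteLine(OffsetFromRcx(-0x28)); // prints 0
        Console.WriteLine(OffsetFromRcx(-0x20)); // prints 8
        Console.WriteLine(OffsetFromRcx(-0x18)); // prints 16

        // The managed model has matching field offsets and a total size of 24 bytes.
        Console.WriteLine(Marshal.OffsetOf<CallbackTable>(nameof(CallbackTable.Second)));
        Console.WriteLine(Marshal.OffsetOf<CallbackTable>(nameof(CallbackTable.Third)));
        Console.WriteLine(Marshal.SizeOf<CallbackTable>());

        // The immediate 5F5E0EBh in the disassembly is the decimal value
        // shown for v10 in the decompilation.
        Console.WriteLine(0x5F5E0EB == 99999979); // prints True
    }
}
```

The offsets 0, 8 and 16 confirm that the three slots are contiguous, starting at the address passed in rcx.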
This means the pointers are ordered in memory thus: sub_7FFF4EB67110, sub_7FFF4EB67130, sub_7FFF4EB67120. By correlating these with our reverse engineered il2cpp_thread_get_name function, we can discern their purposes: unityplayer.DecryptMetadata, pGetStringFromIndex and qword_7FFF43D74F90 respectively. If we actually click on the first function, we get:
.text:00007FFF4EB67110 sub_7FFF4EB67110
.text:00007FFF4EB67110 jmp DecryptMetadata
.text:00007FFF4EB67110 sub_7FFF4EB67110 endp
where we defined DecryptMetadata in part 1 of this series. Excellent! We now know where GetStringFromIndex is in UnityPlayer.dll, and we can construct a script to call it in isolation, similarly to how we called DecryptMetadata. In C#, this looks as follows:
[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
private delegate IntPtr GetStringFromIndex(byte[] bytes, uint index);
// ...
var pGetStringFromIndex = (GetStringFromIndex) Marshal.GetDelegateForFunctionPointer(ModuleBase + 0x8E7130, typeof(GetStringFromIndex));
var stringIndex = 1234;
var decryptedString = Marshal.PtrToStringAnsi(pGetStringFromIndex(decryptedMetadata, (uint) stringIndex));
(ModuleBase and decryptedMetadata are defined in the script from part 1.)
We calculate the offset of the string function from the module base by simply subtracting the base (0x7FFF4E280000) from the address of the function (0x7FFF4EB67130) to get 0x8E7130. The function Marshal.PtrToStringAnsi copies a null-terminated string from an unmanaged pointer address into a managed string object.
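The subtraction is easy to verify, and the same helper shows how the offset is rebased onto wherever the DLL happens to be loaded at runtime. This is a minimal sketch: the Rva class is a hypothetical name, and runtimeBase is a made-up example value, not an address taken from this analysis.

```csharp
using System;

public static class Rva
{
    // Offset of an address from the base of the module that contains it.
    public static ulong FromAddress(ulong address, ulong moduleBase)
    {
        if (address < moduleBase)
            throw new ArgumentException("address lies below the module base");
        return address - moduleBase;
    }

    public static void Main()
    {
        const ulong idaBase = 0x7FFF4E280000;     // module base in the IDA database
        const ulong idaAddress = 0x7FFF4EB67130;  // string function address in IDA

        var offset = FromAddress(idaAddress, idaBase);
        Console.WriteLine($"0x{offset:X}");       // prints 0x8E7130

        // Rebase onto the load address in our own process.
        // This base is a made-up example value.
        const ulong runtimeBase = 0x7FFB12340000;
        Console.WriteLine($"0x{runtimeBase + offset:X}");
    }
}
```

In the script itself the rebasing is the ModuleBase + 0x8E7130 expression, so the offset stays valid regardless of where the module is loaded.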
Info: Ultimately, the entire string table can be found unencrypted starting at 0xD5F888 in this global-metadata.dat file. The GetStringFromIndex function is heavily obfuscated with control flow flattening, but appears to be a “do nothing” decoy function apart from knowing the start offset of the string table, which is not actually present in the metadata header in plaintext form. I dumped every string using this function and ran a binary diff against the string table cut and pasted from global-metadata.dat, and the results were identical. This may not necessarily be the case for future versions, of course.
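If the function really is a decoy, its observable behaviour can be approximated without calling into UnityPlayer.dll at all. The following is a minimal sketch under two assumptions that are not confirmed in detail above: that a string index is a byte offset from the start of the string table (as in stock IL2CPP metadata), and that the strings are null-terminated UTF-8. StringTable and its members are hypothetical names, and the demo runs on a tiny synthetic buffer rather than the real global-metadata.dat.

```csharp
using System;
using System.Text;

public static class StringTable
{
    // Start of the string table in this particular global-metadata.dat.
    public const int TableStart = 0xD5F888;

    // Reads the null-terminated string that begins 'index' bytes into the table.
    // Assumes 'index' is a byte offset, as in stock IL2CPP metadata.
    public static string GetString(byte[] metadata, int tableStart, int index)
    {
        var start = tableStart + index;
        if (index < 0 || start < 0 || start >= metadata.Length)
            throw new ArgumentOutOfRangeException(nameof(index));

        var end = Array.IndexOf(metadata, (byte)0, start);
        if (end < 0) end = metadata.Length;   // tolerate a missing final terminator
        return Encoding.UTF8.GetString(metadata, start, end - start);
    }

    public static void Main()
    {
        // Synthetic stand-in: 4 bytes of "header" followed by two strings.
        var fake = Encoding.UTF8.GetBytes("HDR\0Assembly-CSharp\0mscorlib\0");
        Console.WriteLine(GetString(fake, 4, 0));    // prints Assembly-CSharp
        Console.WriteLine(GetString(fake, 4, 16));   // prints mscorlib
    }
}
```

Against the real file, the equivalent call would pass the file's bytes together with StringTable.TableStart; comparing its output with that of GetStringFromIndex for the same index is one way to repeat the check described above.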