Flipping the coin

It takes two to tango. Let’s load up UnityPlayer.dll and find out where it calls this silly function.

Needless to say, the DLL doesn’t conveniently define this as an import, but the function name must be referenced somewhere, so we perform a string search for il2cpp_thread_get_name. This is a few clicks of work and IDA has already labelled the string address as aIl2cppThreadGe for us (a stands for ASCII).

There is only one reference to this:

v179 = 0i64;
v181 = 0i64;
v182 = 68;
LOBYTE(v180) = 0;
sub_7FFF4E399380(&v179, "il2cpp_thread_get_name", 0x16ui64);
v152 = sub_7FFF4E7B30C0(qword_7FFF4F629F20, &v179, 0);
qword_7FFF4F629E48 = v152;
if ( v179 && v180 > 0 )
{
  sub_7FFF4E51A4C0(v179, v182);
  v152 = qword_7FFF4F629E48;
}
if ( !v152 )
{
  v2 = 0;
  sub_7FFF4E6E0A50("il2cpp: function il2cpp_thread_get_name not found\n");
}

This is part of a much longer function in which the same pattern repeats dozens of times, once for each IL2CPP API. You don’t really need to screw around with all these function calls to get the gist of what is going on here: the string name is copied to v179 (line 5), the symbol is resolved to a code pointer and stored in v152 (line 6), then stored at qword_7FFF4F629E48 (line 7). We infer the meaning of v152 by looking at lines 13-17, which report an error if v152 is zero (null). We infer the meaning of v179 by noting that the function in line 5 receives three arguments: v179 by reference, a pointer to the string literal, and the length of the literal (0x16, or 22 characters). That makes it probably a memcpy-type function. The result is then passed, again by reference, to the symbol resolver function in line 6.
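To make the pattern concrete, here is a minimal sketch in C of one repetition: resolve a name, store the pointer in a global slot, and report a failure. Everything in it is a stand-in invented for illustration: the exports table, resolve_symbol, load_one and the dummy export do not come from UnityPlayer.dll. The real resolver also takes a module handle and a third argument, which are not modelled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*api_fn)(void);

/* Hypothetical stand-in for the real export. */
static void fake_thread_get_name(void) {}

/* Hypothetical symbol table standing in for the loaded module's exports. */
struct export_entry { const char *name; api_fn fn; };
static const struct export_entry exports[] = {
    { "il2cpp_thread_get_name", fake_thread_get_name },
};

/* Plays the role of sub_7FFF4E7B30C0: name in, code pointer (or NULL) out. */
static api_fn resolve_symbol(const char *name) {
    for (size_t i = 0; i < sizeof exports / sizeof exports[0]; i++)
        if (strcmp(exports[i].name, name) == 0)
            return exports[i].fn;
    return NULL;
}

/* Plays the role of qword_7FFF4F629E48. */
static api_fn p_il2cpp_thread_get_name;

/* One repetition of the pattern: resolve the name, store the result in the
   caller's slot, and print an error if it was not found. Returns 1 on success. */
static int load_one(const char *name, api_fn *slot) {
    *slot = resolve_symbol(name);
    if (!*slot) {
        fprintf(stderr, "il2cpp: function %s not found\n", name);
        return 0;
    }
    return 1;
}
```

Calling load_one("il2cpp_thread_get_name", &p_il2cpp_thread_get_name) fills the global slot, which is the location we go looking for callers of next.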

We rename qword_7FFF4F629E48 to pil2cpp_thread_get_name and search for references to it. This time we are looking for where it is called. Again, there is only one location:

char __fastcall sub_7FFF4E816170(__int64 a1, __int64 a2, unsigned int a3, __int64 a4)
{
  v10 = 99999979;
  v7 = sub_7FFF4EB67110;
  v8 = sub_7FFF4EB67130;
  v4 = a4;
  v9 = sub_7FFF4EB67120;
  v5 = a3;
  pil2cpp_thread_get_name(&v7, &v10);
  sub_7FFF4E7E3A30();
  qword_7FFF4F629E00(0i64);
  qword_7FFF4F6299D0(v5, v4, 0i64);
  qword_7FFF4F6299B8();
  qword_7FFF4F6299C0();
  qword_7FFF4F6299A0("IL2CPP Root Domain");
  qword_7FFF4F6299E8("unused_application_configuration");
  sub_7FFF4E81A540();
  return 1;
}

We have a little giggle as we note that v10 is set to 99999979 to “authenticate” the call to il2cpp_thread_get_name, but it’s not actually clear from the decompilation how the three function pointers at v7, v8 and v9 are ordered in memory, or even if anything besides v7 is passed to the export, so we switch to the disassembly view:

.text:00007FFF4E816185                 lea     rax, sub_7FFF4EB67110
.text:00007FFF4E81618C                 mov     [rsp+48h+arg_0], 5F5E0EBh
.text:00007FFF4E816194                 mov     [r11-28h], rax
.text:00007FFF4E816198                 mov     rsi, rdx
.text:00007FFF4E81619B                 lea     rax, sub_7FFF4EB67130
.text:00007FFF4E8161A2                 mov     r14, rcx
.text:00007FFF4E8161A5                 mov     [r11-20h], rax
.text:00007FFF4E8161A9                 lea     rdx, [r11+8]    ; _QWORD
.text:00007FFF4E8161AD                 lea     rax, sub_7FFF4EB67120
.text:00007FFF4E8161B4                 mov     rbx, r9
.text:00007FFF4E8161B7                 lea     rcx, [r11-28h]  ; _QWORD
.text:00007FFF4E8161BB                 mov     [r11-18h], rax
.text:00007FFF4E8161BF                 mov     edi, r8d
.text:00007FFF4E8161C2                 call    cs:pil2cpp_thread_get_name

The x64 calling convention dictates that the first argument is passed in rcx, which we can see in line 11 is set to the address r11-28h. Lines 3, 7 and 12 store the three function pointers loaded in lines 1, 5 and 9 at r11-28h, r11-20h and r11-18h respectively. These are 64-bit pointers stored 8 bytes (64 bits) apart, so together they fill 24 bytes of consecutive pointer data, as expected.
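The layout can be sketched as a C struct. The struct and field names are hypothetical, and the 8-byte spacing assumes a 64-bit build. The helper simply converts a displacement from r11 into an offset from the start of the table at r11-28h.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*callback_fn)(void);

/* Hypothetical names; field order follows the three stack stores. */
struct callback_table {
    callback_fn first;   /* [r11-28h] = sub_7FFF4EB67110 */
    callback_fn second;  /* [r11-20h] = sub_7FFF4EB67130 */
    callback_fn third;   /* [r11-18h] = sub_7FFF4EB67120 */
};

/* Byte offset of a stack slot relative to the table base at r11-28h. */
static ptrdiff_t slot_offset(ptrdiff_t r11_displacement) {
    return r11_displacement - (-0x28);
}
```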

This means the pointers are ordered in memory thus: sub_7FFF4EB67110, sub_7FFF4EB67130, sub_7FFF4EB67120. By correlating these with our reverse engineered il2cpp_thread_get_name function, we can discern their purposes: unityplayer.DecryptMetadata, pGetStringFromIndex, qword_7FFF43D74F90. If we actually click on the first function, we get:

.text:00007FFF4EB67110 sub_7FFF4EB67110
.text:00007FFF4EB67110                 jmp     DecryptMetadata
.text:00007FFF4EB67110 sub_7FFF4EB67110 endp

where we defined DecryptMetadata in part 1 of this series. Excellent! We now know where GetStringFromIndex is in UnityPlayer.dll, and we can construct a script to call it in isolation similarly to how we called DecryptMetadata. In C#, this looks as follows:

[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
private delegate IntPtr GetStringFromIndex(byte[] bytes, uint index);
 
// ...
 
var pGetStringFromIndex = (GetStringFromIndex) Marshal.GetDelegateForFunctionPointer(ModuleBase + 0x8E7130, typeof(GetStringFromIndex));
 
var stringIndex = 1234;
var decryptedString = Marshal.PtrToStringAnsi(pGetStringFromIndex(decryptedMetadata, (uint) stringIndex));

(ModuleBase and decryptedMetadata are defined in the script from part 1)

We calculate the offset of the string function from the module base by simply subtracting the base (0x7FFF4E280000) from the address of the function (0x7FFF4EB67130) to get 0x8E7130. The function Marshal.PtrToStringAnsi copies a null-terminated string from an unmanaged pointer address into a managed string object.
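The offset arithmetic can be checked with a few lines of C. The helper names rva_of and rebase are made up for this sketch. The second helper shows why we keep the offset rather than the absolute address: the DLL will usually load at a different base in our own process.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of an address from the module's load base. */
static uint64_t rva_of(uint64_t address, uint64_t module_base) {
    assert(address >= module_base);
    return address - module_base;
}

/* Address of the same function when the module is loaded at a different base. */
static uint64_t rebase(uint64_t rva, uint64_t new_base) {
    return new_base + rva;
}
```

With the values from IDA, rva_of(0x7FFF4EB67130, 0x7FFF4E280000) gives 0x8E7130, the constant added to ModuleBase in the C# script.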