Example 3: No global-metadata.dat file (Guardian Tales / 가디언-테일즈)

This one got dumped on my desk a few days ago and has no global-metadata.dat file at all! The file must be stored somewhere else or embedded in the binary – an increasingly common practice to defeat automated reverse engineering.

By now, we know the drill: trace the path to MetadataCache::Initialize as usual:

result = vm::MetadataLoader::LoadMetadataFile();
s_GlobalMetadata = (__int64)result;
if ( result )
{
  qword_6AF08E8 = (__int64)result;
  qword_6AF08F0 = sub_1B1E63C(*(int *)(qword_6AF08D0 + 48), 8LL);
  qword_6AF08F8 = sub_1B1E63C(*(int *)(qword_6AF08E8 + 164) / 0x5CuLL, 8LL);
  qword_6AF0900 = sub_1B1E63C((unsigned __int64)*(int *)(qword_6AF08E8 + 52) >> 5, 8LL);
  qword_6AF0908 = sub_1B1E63C(*(int *)(qword_6AF08D0 + 64), 8LL);

This seems normal except that no filename is passed to MetadataLoader::LoadMetadataFile. We drill down to the function:

_DWORD *vm::MetadataLoader::LoadMetadataFile()
{
  sub_1B1EF70((__int64)unkStruct);
  folder_and_error = "Resources";
  v20 = 9LL;
  v0 = *(_QWORD *)(unkStruct[0] - 24);
  *(_QWORD *)v18 = unkStruct[0];
  *(_QWORD *)&v18[8] = v0;
  il2cpp::utils::PathUtils::Combine(v18, &folder_and_error, &resourcesDirectory);
  sub_1B5D78C(unkStruct);
  *(_OWORD *)v18 = xmmword_52725E6;
  v18[0] = 109;
  v1 = 1LL;
  *(_OWORD *)&v18[11] = *(__int128 *)((char *)&xmmword_52725E6 + 11);
  do
    v18[v1++] ^= 0xFEu;
  while ( v1 != 26 );
  unkStruct[0] = (__int64)v18;
  unkStruct[1] = 26LL;
  v2 = *((_QWORD *)resourcesDirectory - 3);
  folder_and_error = resourcesDirectory;
  v20 = v2;
  il2cpp::utils::PathUtils::Combine(&folder_and_error, unkStruct, &resourceFilePath);
  LODWORD(folder_and_error) = 0;
  fileHandle = os::File::Open(&resourceFilePath, 3LL, 1LL, 1LL, 0LL, &folder_and_error);
  fileHandle_1 = fileHandle;
  if ( (_DWORD)folder_and_error )
  {
    utils::Logging::Write("ERROR: Could not open %s");
LABEL_7:
    v7 = 0LL;
    goto LABEL_8;
  }
  // ...

Again I’ve annotated the symbols according to the original source code. Note that I named folder_and_error this way because it serves two purposes – a pointer to a path string and a boolean error flag. This can occur as a result of compiler optimizations or incorrect decompilation.

It does appear at first glance that a file is opened from storage: a path is built by the second PathUtils::Combine call and then passed to os::File::Open. But which file?

We don’t need to understand all of this code’s precise functionality to work this out. We can deduce that unkStruct is probably some kind of struct, since the decompiler indexes it like an array but the stored values don’t appear to be of the same type, so we rename it accordingly. A number of functions receive it as an argument, and it is ultimately passed as the second argument to the second PathUtils::Combine call, which means that its first entry points to a filename or pathname.

The statements from v1 = 1LL through unkStruct[0] = (__int64)v18 seem to perform some kind of trivial XOR decryption: a loop XORs each byte of v18 from index 1 onwards with 0xFE. We might deduce from this that the filename is 26 characters long, because the loop stops when v1 reaches 26 and the final result (v18) is then stored as a pointer in the first entry of unkStruct.

Let’s rename v18 to filename and undefine the awkward xmmword_52725E6 so that IDA interprets it as a sequence of bytes instead, then decompile again:

sub_1B1EF70((__int64)unkStruct);
folder_and_error = "Resources";
v20 = 9LL;
v0 = *(_QWORD *)(unkStruct[0] - 24);
*(_QWORD *)filename = unkStruct[0];
*(_QWORD *)&filename[8] = v0;
il2cpp::utils::PathUtils::Combine(filename, &folder_and_error, &resourcesDirectory);
sub_1B5D78C(unkStruct);
*(_OWORD *)filename = unk_52725E6;
filename[0] = 0x6D;
v1 = 1LL;
*(_OWORD *)&filename[11] = unk_52725F1;
do
  filename[v1++] ^= 0xFEu;
while ( v1 != 26 );
unkStruct[0] = (__int64)filename;
unkStruct[1] = 26LL;

The last two lines (lines 16-17, counting from the top of this listing) tell us that unkStruct is probably a two-element struct where the first element is a pointer to the filename and the second element is the filename length.
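To make the pointer-plus-length idea concrete, here is a minimal Python sketch using ctypes. It is only an illustration under assumptions: the struct name StringView, the field names and the helpers make_view and read_view are hypothetical, and the 8-byte field sizes are inferred from the 64-bit assignments in the listing rather than confirmed from the game's binary.

```python
import ctypes

class StringView(ctypes.Structure):
    """Hypothetical layout: an 8-byte pointer followed by an 8-byte length."""
    _fields_ = [("ptr", ctypes.c_uint64), ("length", ctypes.c_uint64)]

def make_view(buf) -> StringView:
    # buf is a ctypes char array; .value stops at the first NUL byte,
    # so the stored length excludes the terminator.
    return StringView(ctypes.addressof(buf), len(buf.value))

def read_view(view: StringView) -> bytes:
    # Read exactly `length` bytes starting at the stored address.
    return ctypes.string_at(view.ptr, view.length)

folder = ctypes.create_string_buffer(b"Resources")
view = make_view(folder)
print(ctypes.sizeof(StringView), view.length, read_view(view))
```

A view of "Resources" built this way has a length of 9, which appears to match the 9 that the listing stores in v20 right after assigning "Resources" to folder_and_error. That suggests the same two-field pattern is used there, although this is an inference from the decompilation only.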

The rest of the code constructs the encrypted filename before running the decryption loop on lines 11-15. Let’s reconstruct it:

  • Byte zero (the first character) is ASCII code 0x6D or m (line 10); the loop counter starts at 1, not zero (line 11), so this character is not encrypted
  • Bytes 1-10 are set on line 9 to the contents of unk_52725E6. This is an _OWORD assignment, so 16 bytes (0-15) are copied, but byte 0 is overwritten on line 10 as above and bytes 11-15 are overwritten on line 12 (see below). Lines 4-6 also fill the first 16 bytes of the same buffer, using values derived from whatever sub_1B1EF70 stored in unkStruct on line 1, but only as a temporary argument for the PathUtils::Combine call on line 7; the _OWORD assignment then overwrites them, so they play no part in the filename.
  • Bytes 11-25 are then set on line 12 (replacing bytes 11-15 from the assignment on line 9) to the contents of unk_52725F1. This is another 16-byte copy, so byte 26 is written as well.
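The overlapping copies are easier to follow when you track where each byte of the buffer comes from rather than what it contains. The sketch below is a hypothetical Python model of lines 9, 10 and 12 of the listing: the helper copy_oword and the 27-element list are illustrative, and only the two addresses are taken from the decompilation.

```python
UNK_52725E6 = 0x52725E6   # source of the first 16-byte copy (line 9)
UNK_52725F1 = 0x52725F1   # source of the second 16-byte copy (line 12)
OWORD = 16                # an _OWORD assignment moves 16 bytes

def copy_oword(buf, dst_index, src_addr):
    """Record, for each destination byte, the source address it was copied from."""
    for i in range(OWORD):
        buf[dst_index + i] = src_addr + i

filename = [None] * 27                 # indices 0..26 are touched by the copies
copy_oword(filename, 0, UNK_52725E6)   # line 9:  bytes 0-15
filename[0] = "literal 0x6D"           # line 10: byte 0 replaced by 'm'
copy_oword(filename, 11, UNK_52725F1)  # line 12: bytes 11-26

# Every byte except byte 0 ends up coming from unk_52725E6 + its own index.
contiguous = all(filename[i] == UNK_52725E6 + i for i in range(1, 27))
print(contiguous)  # prints True
```

Because unk_52725F1 sits exactly 11 bytes after unk_52725E6, the second copy just continues the first one, so the two assignments together behave like a single copy of one contiguous run of bytes.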

This mess is a bit of a decompilation quirk – unk_52725E6 and unk_52725F1 are right next to each other in memory, so essentially all this code does is copy 26 bytes from unk_52725E6 into filename, overwrite the first character with m and then XOR all the rest with 0xFE:

.rodata:00000000052725C6 aFileloadexcept DCB "FileLoadException",0
.rodata:00000000052725D8 aModule         DCB "<Module>",0
.rodata:00000000052725E1 aNull_0         DCB "NULL",0
.rodata:00000000052725E6 unk_52725E6     DCB 0x93
.rodata:00000000052725E7                 DCB 0x8D
.rodata:00000000052725E8                 DCB 0x9D
.rodata:00000000052725E9                 DCB 0x91
.rodata:00000000052725EA                 DCB 0x8C
.rodata:00000000052725EB                 DCB 0x92
.rodata:00000000052725EC                 DCB 0x97
.rodata:00000000052725ED                 DCB 0x9C
.rodata:00000000052725EE                 DCB 0xD0
.rodata:00000000052725EF                 DCB 0x9A
.rodata:00000000052725F0                 DCB 0x92
.rodata:00000000052725F1 unk_52725F1     DCB 0x92
.rodata:00000000052725F2                 DCB 0xD3
.rodata:00000000052725F3                 DCB 0x8C
.rodata:00000000052725F4                 DCB 0x9B
.rodata:00000000052725F5                 DCB 0x8D
.rodata:00000000052725F6                 DCB 0x91
.rodata:00000000052725F7                 DCB 0x8B
.rodata:00000000052725F8                 DCB 0x8C
.rodata:00000000052725F9                 DCB 0x9D
.rodata:00000000052725FA                 DCB 0x9B
.rodata:00000000052725FB                 DCB 0x8D
.rodata:00000000052725FC                 DCB 0xD0
.rodata:00000000052725FD                 DCB 0x9A
.rodata:00000000052725FE                 DCB 0x9F
.rodata:00000000052725FF                 DCB 0x8A
.rodata:0000000005272600                 DCB    0
.rodata:0000000005272601 aErrorCouldNotO DCB "ERROR: Could not open %s",0
.rodata:000000000527261A aErrorCouldNotG DCB "ERROR: Could not get length %s",0
.rodata:0000000005272639 aErrorCouldNotA DCB "ERROR: Could not alloc memory size %ld",0

Notice how there are a bunch of unencrypted strings on either side.

What do we get when we decrypt this string?

mscorlib.dll-resources.dat
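The decryption is easy to reproduce outside IDA. The following is a minimal Python sketch, not code from the game: the byte values are copied from the .rodata listing above, while the names ENCRYPTED and decrypt_filename are made up for illustration. It mirrors the decompiled loop by forcing byte 0 to 0x6D and XORing bytes 1 to 25 with 0xFE.

```python
# The 26 bytes at 0x52725E6..0x52725FF, copied from the .rodata listing.
ENCRYPTED = bytes([
    0x93, 0x8D, 0x9D, 0x91, 0x8C, 0x92, 0x97, 0x9C, 0xD0, 0x9A, 0x92,
    0x92, 0xD3, 0x8C, 0x9B, 0x8D, 0x91, 0x8B, 0x8C, 0x9D, 0x9B, 0x8D,
    0xD0, 0x9A, 0x9F, 0x8A,
])

def decrypt_filename(data: bytes, key: int = 0xFE, first: int = 0x6D) -> str:
    """Mimic the decompiled loop: set byte 0 directly, XOR the rest with key."""
    buf = bytearray(data)
    buf[0] = first                  # filename[0] = 0x6D
    for i in range(1, len(buf)):    # v1 runs from 1 up to (not including) 26
        buf[i] ^= key               # filename[v1++] ^= 0xFE
    return buf.decode("ascii")

print(decrypt_filename(ENCRYPTED))  # prints mscorlib.dll-resources.dat
```

As it happens, 0x93 XOR 0xFE is also 0x6D, so XORing byte 0 along with the rest would give the same m that the code assigns directly.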

Well, this file is certainly present but it doesn’t resemble a global-metadata.dat file. Is it an encrypted file? Is it combined with another file? Is it all a big lie and the metadata is embedded? That is for you to discover – once again, the idea here is to show how to get a foot – or maybe just a toe – in the door!
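One quick first test of whether a candidate file "resembles" global-metadata.dat is to look at its first four bytes: an unmodified IL2CPP metadata file begins with the sanity value 0xFAB11BAF stored little-endian. The sketch below is a hypothetical helper (the function name and the sample inputs are mine), and a negative result only tells you the header is not stock; it says nothing about how the file was altered.

```python
import struct

IL2CPP_METADATA_MAGIC = 0xFAB11BAF  # sanity value at offset 0 of a stock file

def looks_like_global_metadata(data: bytes) -> bool:
    """Return True if data starts with the stock IL2CPP metadata magic."""
    if len(data) < 4:
        return False
    (magic,) = struct.unpack_from("<I", data, 0)  # little-endian 32-bit read
    return magic == IL2CPP_METADATA_MAGIC

# Made-up sample headers, not data from the game:
print(looks_like_global_metadata(bytes.fromhex("af1bb1fa") + b"\x00" * 4))
print(looks_like_global_metadata(b"\x00\x01\x02\x03"))
```

To try it on a real file, read the first few bytes with open(path, "rb") and pass them in.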