Surveying the battlefield
As usual, I start by just loading the game into Il2CppInspector to see what happens:
The supplied metadata file is not valid.
This error means the
global-metadata.dat file doesn’t have the expected form. Specifically, a valid file starts with
the magic bytes (signature)
AF 1B B1 FA followed by a 32-bit integer containing the IL2CPP version number
(at the time of writing, a value from
0F-1B). This is followed up by a long list of offset/length pairs demarcating
the various metadata tables in the file – learn more in this article about IL2CPP’s load process.
Here is an example of the start of global-metadata.dat from an empty project:
Note that an “empty” Unity project still includes a pile of DLLs like
UnityEngine.dll and so on so it’s not really empty at all.
The header ends at offset 0x110, and this location is also the start of the first table.
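The header layout described above can be checked in a few lines of code. The following is a minimal Python sketch, assuming the header consists only of the signature, the version and then 32-bit offset/length pairs up to 0x110, as in the empty-project example. The names `parse_header` and `HEADER_SIZE` are illustrative and are not taken from Il2CppInspector.

```python
import struct

MAGIC = bytes.fromhex("AF1BB1FA")  # signature bytes in file order
HEADER_SIZE = 0x110                # assumption: size seen in the empty-project example


def parse_header(data: bytes, header_size: int = HEADER_SIZE):
    """Return (version, [(offset, length), ...]) or raise ValueError."""
    if data[:4] != MAGIC:
        raise ValueError("The supplied metadata file is not valid.")
    # Every field after the signature is a little-endian 32-bit integer.
    (version,) = struct.unpack_from("<i", data, 4)
    count = (header_size - 8) // 4
    fields = struct.unpack_from(f"<{count}I", data, 8)
    # Assumption: the remaining fields alternate offset, length, offset, length...
    pairs = list(zip(fields[0::2], fields[1::2]))
    return version, pairs
```

Run against the Honkai Impact file, a check like this fails at the first step, because bytes 0x00-0x03 are not the expected signature.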
global-metadata.dat for Honkai Impact 4.3 (PC version):
Ouch, this doesn’t look very appetizing. At a casual glance it just looks encrypted or compressed, but there are actually some nuggets of data in here.
0x00-0x3F don’t make any obvious sense, and neither do the bytes from
0x158 onwards, but at least some of the data from
0x40-0x157 seems to mean something. We can surmise
this both from the fact there is a smattering of zeroes (low-entropy data), and that it at least vaguely resembles
the metadata header from the empty project. The areas around
seem garbled, but the rest does
seem like a set of file offsets and lengths.
You essentially have to determine this by carefully reading all the hex values by eye. Values are stored little-endian,
meaning that the first byte of a value is the least significant byte (LSB) (bits 0-7), and the final byte is the most
significant (MSB) (bits 24-31 in the case of 32-bit values). Given that the file is
0x0353C7DC bytes long,
we can try to verify that these suspected offset/length pairs do actually make sense. The final pointer at offset
is to offset
0x033AFECC, with a length specified at offset
This means the block pointed to ends at
0x035387D4, which is indeed inside the bounds of the file.
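The same check is easy to script. This is a small Python sketch of the little-endian decoding and the bounds test. The helper names are made up for illustration, and the length 0x188908 is simply the difference between the two offsets quoted above.

```python
FILE_LENGTH = 0x0353C7DC  # file size quoted in the text


def read_u32_le(data: bytes, offset: int) -> int:
    """Read a little-endian unsigned 32-bit integer (first byte is the LSB)."""
    return int.from_bytes(data[offset:offset + 4], "little")


def block_in_bounds(offset: int, length: int, file_length: int = FILE_LENGTH) -> bool:
    """True if the block [offset, offset + length) lies inside the file."""
    return offset >= 0 and offset + length <= file_length


# The bytes CC FE 3A 03, as seen in a hex editor, decode to 0x033AFECC.
assert read_u32_le(bytes.fromhex("CCFE3A03"), 0) == 0x033AFECC
# The block ends at 0x035387D4, which is inside the file.
assert 0x033AFECC + 0x188908 == 0x035387D4
assert block_in_bounds(0x033AFECC, 0x188908)
```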
Let’s continue our investigation by scrolling down the file to see if the whole thing is encrypted or if there is anything else in plaintext.
There is a large block of garbled data starting around offset
0x158, and then around
it ends and we start to see normal metadata tables again:
Scrolling further, the rest of the file appears to contain normal data, except for one curious repeating pattern:
Every so often, there is a block of
0x40 garbled bytes in the middle of other data. After skipping around the file some more,
we determine this happens like clockwork every 0x353C0 bytes.
We can determine from the offset/length lists in the header that these are not separate data structures, but embedded within valid lists.
Therefore we can assume we’re looking at encryption. We can rule out trivial schemes like single-byte XOR because the encrypted blocks
are high entropy (the distribution of values in the blocks is statistically even; see
so we are probably looking at strong encryption or a one-time pad
(OTP) – the latter could potentially be a XOR blob (a block of random bytes to be XORed with the encrypted data to decrypt it).
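If you prefer a number to an eyeball estimate, Shannon entropy is one common way to measure how evenly byte values are distributed. The sketch below is a generic illustration in Python, not the method used to produce the analysis above.

```python
import math
from collections import Counter


def shannon_entropy(block: bytes) -> float:
    """Shannon entropy of a block in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Note that a 0x40-byte block can score at most 6 bits per byte, since it holds at most 64 distinct values. A score close to that ceiling is what we would expect from encrypted data, while a run of zeroes scores 0.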
Is there an OTP key hiding in the file somewhere? Looking at the second screenshot above, we might surmise (looking at the
right-hand three bytes on each of the four encrypted lines) that a XOR blob would contain sequential values
1E AE BE,
51 6D AD,
58 7A 03 and so on. We search for other occurrences of these in the file but come up blank.
The encryption may not be a XOR blob, or the XOR blob may be stored in the binary or an asset file, or the XOR blob may be obfuscated.
On this occasion we come up empty-handed, but it’s important to rule out the obvious easy paths before we get our hands dirty
analyzing assembly code, as doing so could save us a lot of time.
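A search like this can also be scripted rather than done in the hex editor. The following Python sketch lists every offset at which a byte sequence occurs. `find_all` is an illustrative helper, and the candidate values are the three sequences quoted above.

```python
def find_all(data: bytes, pattern: bytes) -> list[int]:
    """Return every offset at which pattern occurs in data (overlaps included)."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


# Candidate XOR blob fragments read off the encrypted lines.
CANDIDATES = [bytes.fromhex(h) for h in ("1EAEBE", "516DAD", "587A03")]


def search_candidates(data: bytes) -> dict[str, list[int]]:
    """Map each candidate (as hex) to the offsets where it appears."""
    return {c.hex(): find_all(data, c) for c in CANDIDATES}
```

Each candidate will match at least once, at the place it was read from, so the interesting result is any additional offset.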
How far back in the file does this periodic block encryption go? The first encrypted offset we found is
(first of the two screenshots above), and the block gap is
0x353C0 bytes. The offset is an exact multiple of the gap,
so it’s plausible that the first encrypted block starts at
0x0 – i.e. the very first byte in the file.
This also lines up with our earlier observation that bytes
0x00-0x3F are probably encrypted.
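If this hypothesis holds, the location of every encrypted block follows directly from the block size and the gap. Here is a short Python sketch that enumerates them. The constants come from the text, while the function names are illustrative.

```python
BLOCK_SIZE = 0x40         # size of each garbled block
BLOCK_GAP = 0x353C0       # distance between the starts of consecutive blocks
FILE_LENGTH = 0x0353C7DC  # file size quoted in the text


def encrypted_ranges(file_length=FILE_LENGTH, gap=BLOCK_GAP, size=BLOCK_SIZE):
    """List (start, end) for each suspected encrypted block, starting at offset 0."""
    return [(start, min(start + size, file_length))
            for start in range(0, file_length, gap)]


def in_encrypted_block(offset, gap=BLOCK_GAP, size=BLOCK_SIZE):
    """True if offset falls inside one of the suspected encrypted blocks."""
    return offset % gap < size
```

A list like this is handy later on, both for checking whether a suspicious header field falls inside an encrypted block and for knowing which ranges a decryption routine has to touch.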
Let’s finish our analysis of the metadata by assessing the file’s coverage. In a normal
every byte is accounted for: that is to say, every single byte in the file is part of a header or table – there is no extraneous data.
We do this by taking all of the offset/length pairs in the header and merging them together to map out all of the used regions in the file,
then seeing if there is anything left over.
Why do we do this? Well, because hiding data in files is extremely common. In PE files (Windows
a highly common technique is to set the image size in the header to a value smaller than the true length of the file, and then add additional
hidden data at the end. This data could be secret code, decryption keys or anything else.
In this case, we are aware that some of the offsets and lengths may be encrypted, but we work with what we’ve got anyway:
1C7AA8 + 1BE4E4 = 385F8C
385F8C + 4E4A8 = 3D4434
3D4434 + 382CD8 = 75710C
75710C + 9040 = 76014C
(16 bytes of unknown data)
76014C + 10C0 = 76120C
76120C + 2398 = 7635A4
7635A4 + 3F7E50 = B5B3F4
B5B3F4 + 6F50 = B62344
B62344 + CEA58 = C30D9C
C30D9C + 25044 = C55DE0
C55DE0 + 994 = C56774
C56774 + A0B60 = CF72D4
CF72D4 + 56BB8 = D4DE8C
D4DE8C + 99E8 = D57874
D57874 + 7490 = D5ED04
D5ED04 + B84 = D5F888
(8 bytes of unknown data)
15C0D8C + 3B4AA0 = 197582C
197582C + 74C = 1975F78
(8 bytes of unknown data)
1975F78 + 5C5238 = 1F3B1B0
(16 bytes of unknown data)
1F3B1B0 + 13F8 = 1F3C5A8
1F3C5A8 + 1DA4 = 1F3E34C
1F3E34C + 139200 = 207754C
207754C + 11B9A00 = 3230F4C
3230F4C + 15294 = 32461E0
32461E0 + 169CEC = 33AFECC
(16 bytes of unknown data)
33AFECC + 188908 = 35387D4
This is a breakdown of the data from
The bytes at
0x35387D4-0x353C7DC (the end of the file)
are unaccounted for. We navigate to each of these offsets to see if there is anything of interest.
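The coverage check can be automated by sorting the offset/length pairs and walking through them with a cursor. The Python sketch below uses the pairs from the breakdown above, plus the 0x158-byte header as a region of its own. `find_gaps` is an illustrative name, and the encrypted header fields are left out just as they are in the list.

```python
FILE_LENGTH = 0x0353C7DC

# (offset, length) pairs: the header, then the pairs from the breakdown above.
REGIONS = [
    (0x0, 0x158),
    (0x1C7AA8, 0x1BE4E4), (0x385F8C, 0x4E4A8), (0x3D4434, 0x382CD8),
    (0x75710C, 0x9040), (0x76014C, 0x10C0), (0x76120C, 0x2398),
    (0x7635A4, 0x3F7E50), (0xB5B3F4, 0x6F50), (0xB62344, 0xCEA58),
    (0xC30D9C, 0x25044), (0xC55DE0, 0x994), (0xC56774, 0xA0B60),
    (0xCF72D4, 0x56BB8), (0xD4DE8C, 0x99E8), (0xD57874, 0x7490),
    (0xD5ED04, 0xB84),
    (0x15C0D8C, 0x3B4AA0), (0x197582C, 0x74C), (0x1975F78, 0x5C5238),
    (0x1F3B1B0, 0x13F8), (0x1F3C5A8, 0x1DA4), (0x1F3E34C, 0x139200),
    (0x207754C, 0x11B9A00), (0x3230F4C, 0x15294), (0x32461E0, 0x169CEC),
    (0x33AFECC, 0x188908),
]


def find_gaps(regions, file_length):
    """Return the (start, end) ranges not covered by any region."""
    gaps = []
    cursor = 0
    for offset, length in sorted(regions):
        if offset > cursor:
            gaps.append((cursor, offset))
        cursor = max(cursor, offset + length)
    if cursor < file_length:
        gaps.append((cursor, file_length))
    return gaps


for start, end in find_gaps(REGIONS, FILE_LENGTH):
    print(f"0x{start:X}-0x{end:X}")
```

With these inputs the script reports three uncovered ranges: 0x158-0x1C7AA8, 0xD5F888-0x15C0D8C and 0x35387D4-0x353C7DC, the last one ending at the file length quoted earlier.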
0x158-0x146480 probably contain encrypted data, as mentioned earlier.
0x146480-0x1A7238 appear to
contain a single table (we know this because it consists of a long sequence of what appears to be offsets and lengths, in ascending order).
0x1A7238-0x1AC558 contain another similar table, and so on. These look like normal metadata tables.
contains the .NET symbol table (we know this because the data in this block is just human-readable strings). The most interesting block is
probably the end of the file – a pointer to itself (the offset at
0x35387D4 contains the value
four zeroes and then precisely
0x4000 of high entropy data – this may be encrypted data, or a decryption blob.
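As a quick sanity check on this last block, we can add up its parts and see whether they reach the end of the file. The Python sketch below assumes the pointer is a 32-bit value and that “four zeroes” means four zero bytes. Both are assumptions, and the names are illustrative.

```python
FILE_LENGTH = 0x0353C7DC  # file size quoted in the text
TAIL_START = 0x035387D4   # end of the last table in the header


def tail_layout(tail_start=TAIL_START):
    """Return (start, end) ranges for the pointer, the zeroes and the blob."""
    pointer = (tail_start, tail_start + 4)  # assumption: 32-bit pointer
    zeroes = (pointer[1], pointer[1] + 4)   # assumption: four zero bytes
    blob = (zeroes[1], zeroes[1] + 0x4000)  # high-entropy data
    return pointer, zeroes, blob


pointer, zeroes, blob = tail_layout()
# Under these assumptions the blob ends exactly at the end of the file.
assert blob[1] == FILE_LENGTH
```

The sizes add up with nothing left over, which supports reading the tail as 8 bytes of bookkeeping followed by exactly 0x4000 bytes of data.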
I haven’t included screenshots of everything here, but if your eyes are glazing over at all of these numbers right now,
that’s perfectly okay: the best way to follow all of this is simply to open the metadata file in a hex editor and explore
these file offsets for yourself. There is no special magic in how I determined these table boundaries: it is all determined
by eye, by looking carefully and methodically for obvious patterns in the data to indicate groups of related data together
in one place, and sudden changes in the data to indicate the boundaries between different kinds of data.
Let us now take a breath, step back and summarize what we’ve learned so far:
- There are
0x40-byte blocks of unknown encryption every
0x353C0 bytes, most likely starting from the beginning of the file
- There are some unknown pieces of data in the file header
- A normal metadata header for this version of IL2CPP is
The header here appears to be
0x158 bytes long. The total amount of
unknown data in the header is
0x40 bytes. This leaves a question mark over another 8 bytes.
- There are three blocks of data that are unaccounted for. One contains various metadata
tables and may be accounted for when we decrypt the first
0x40 bytes of the header.
The second contains the string table. The third contains unknown data with a precise size of
Whether or not this information will actually be useful down the line is another question. As it turns out, some of it is and some of it isn’t.
The key takeaway here is to just take a little bit of time to perform a superficial analysis of the data by eye and see what patterns can be spotted. Often,
this insight is enough to determine a strategy to decrypt a file on its own, but in this case we’re going to need to step up our game.
Fun fact: In 2017, small indie game company Blizzard Entertainment encrypted one of its
game’s main DLLs by using the standard Blowfish algorithm with the maximum 448-bit key size, and appending the key to the end of the DLL file.
Variations on this kind of technique are a timeless classic – be aware of it!