Reverse Engineering Adventures: Honkai Impact 3rd (IDA Decompiler Techniques) (Part 2)

This is a continuation of the Reverse Engineering Adventures: Honkai Impact 3rd mini-series – read part 1 first! In this article, we’ll look at comparative data deobfuscation and how to work with the IDA decompiler.

>>Recap

When we left off our previous exploits, we had peeled off the first layer of encryption from global-metadata.dat and located the call site of the decryption function. This turned out to correspond to il2cpp::vm::MetadataLoader::LoadMetadataFile from the IL2CPP source code, with an added line of code to invoke the decryption.

We can’t load the metadata file into Il2CppInspector yet though, because the header does not conform to the expected format. Extra – potentially still encrypted – data is present, and the header length is 0x158 rather than 0x110 bytes, which means that the locations of some or all of the header fields have been changed. Additionally, while most of the rest of the file looks normal, there are no string literals – which are normally present in global-metadata.dat – and there is a large block of presumably encrypted data right after the header.

global-metadata.dat starts with a header which is a struct called Il2CppGlobalMetadataHeader that contains a list of offsets and lengths for all of the tables in the file. The standard header corresponding to Honkai Impact’s Unity version can be found in the IL2CPP source code at libil2cpp/il2cpp-metadata.h and looks like this (comments taken directly from the source code):

typedef struct Il2CppGlobalMetadataHeader
{
    int32_t sanity;
    int32_t version;
    int32_t stringLiteralOffset; // string data for managed code
    int32_t stringLiteralCount;
    int32_t stringLiteralDataOffset;
    int32_t stringLiteralDataCount;
    int32_t stringOffset; // string data for metadata
    int32_t stringCount;
    int32_t eventsOffset; // Il2CppEventDefinition
    int32_t eventsCount;
    int32_t propertiesOffset; // Il2CppPropertyDefinition
    int32_t propertiesCount;
    int32_t methodsOffset; // Il2CppMethodDefinition
    int32_t methodsCount;
    int32_t parameterDefaultValuesOffset; // Il2CppParameterDefaultValue
    int32_t parameterDefaultValuesCount;
    int32_t fieldDefaultValuesOffset; // Il2CppFieldDefaultValue
    int32_t fieldDefaultValuesCount;
    int32_t fieldAndParameterDefaultValueDataOffset; // uint8_t
    int32_t fieldAndParameterDefaultValueDataCount;
    int32_t fieldMarshaledSizesOffset; // Il2CppFieldMarshaledSize
    int32_t fieldMarshaledSizesCount;
    int32_t parametersOffset; // Il2CppParameterDefinition
    int32_t parametersCount;
    int32_t fieldsOffset; // Il2CppFieldDefinition
    int32_t fieldsCount;
    int32_t genericParametersOffset; // Il2CppGenericParameter
    int32_t genericParametersCount;
    int32_t genericParameterConstraintsOffset; // TypeIndex
    int32_t genericParameterConstraintsCount;
    int32_t genericContainersOffset; // Il2CppGenericContainer
    int32_t genericContainersCount;
    int32_t nestedTypesOffset; // TypeDefinitionIndex
    int32_t nestedTypesCount;
    int32_t interfacesOffset; // TypeIndex
    int32_t interfacesCount;
    int32_t vtableMethodsOffset; // EncodedMethodIndex
    int32_t vtableMethodsCount;
    int32_t interfaceOffsetsOffset; // Il2CppInterfaceOffsetPair
    int32_t interfaceOffsetsCount;
    int32_t typeDefinitionsOffset; // Il2CppTypeDefinition
    int32_t typeDefinitionsCount;
    int32_t rgctxEntriesOffset; // Il2CppRGCTXDefinition
    int32_t rgctxEntriesCount;
    int32_t imagesOffset; // Il2CppImageDefinition
    int32_t imagesCount;
    int32_t assembliesOffset; // Il2CppAssemblyDefinition
    int32_t assembliesCount;
    int32_t metadataUsageListsOffset; // Il2CppMetadataUsageList
    int32_t metadataUsageListsCount;
    int32_t metadataUsagePairsOffset; // Il2CppMetadataUsagePair
    int32_t metadataUsagePairsCount;
    int32_t fieldRefsOffset; // Il2CppFieldRef
    int32_t fieldRefsCount;
    int32_t referencedAssembliesOffset; // int32_t
    int32_t referencedAssembliesCount;
    int32_t attributesInfoOffset; // Il2CppCustomAttributeTypeRange
    int32_t attributesInfoCount;
    int32_t attributeTypesOffset; // TypeIndex
    int32_t attributeTypesCount;
    int32_t unresolvedVirtualCallParameterTypesOffset; // TypeIndex
    int32_t unresolvedVirtualCallParameterTypesCount;
    int32_t unresolvedVirtualCallParameterRangesOffset; // Il2CppRange
    int32_t unresolvedVirtualCallParameterRangesCount;
    int32_t windowsRuntimeTypeNamesOffset; // Il2CppWindowsRuntimeTypeNamePair
    int32_t windowsRuntimeTypeNamesSize;
    int32_t exportedTypeDefinitionsOffset; // TypeDefinitionIndex
    int32_t exportedTypeDefinitionsCount;
} Il2CppGlobalMetadataHeader;
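
To make the role of these fields concrete, here is a minimal sketch of how an Offset/Count pair locates a table. It assumes the Count fields hold lengths in bytes rather than entry counts, which matches the "offsets and lengths" description above but should be verified against the IL2CPP source for your Unity version. ExampleEntry, table_start and entry_count are hypothetical names invented for illustration; they are not part of IL2CPP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 12-byte table entry, for illustration only. The real entry
   structs are defined in libil2cpp/il2cpp-metadata.h. */
typedef struct ExampleEntry {
    int32_t a;
    int32_t b;
    int32_t c;
} ExampleEntry;

/* Return a pointer to the first byte of a table, given the start of the
   loaded metadata file and the table's Offset field from the header. */
static const void *table_start(const void *metadata, int32_t offset)
{
    return (const uint8_t *)metadata + offset;
}

/* Convert a table's length in bytes (assumed meaning of the Count field)
   into a number of entries of the given size. */
static size_t entry_count(int32_t length_in_bytes, size_t entry_size)
{
    return (size_t)length_in_bytes / entry_size;
}
```

With a header pointer named header (also hypothetical), the events table would start at table_start(metadata, header->eventsOffset). This is why a header whose fields have been shuffled makes every table unreachable until the layout is recovered.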

The precise meanings of all these tables don’t matter for our purposes, but to enable the metadata file to be loaded by Il2CppInspector, we either need to construct a new Il2CppGlobalMetadataHeader struct whose layout matches that of the file, or rewrite the file’s header to match the original header layout. Each method has pros and cons, but in this case it is much easier to just edit the struct and leave the file alone, and you should generally prefer non-destructive techniques where possible. We don’t know what the extra 0x48 bytes of data are yet and we might need them later.
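
As a quick sanity check on those sizes, the arithmetic can be sketched as follows. The standard header is 2 + 33 × 2 = 68 fields of 4 bytes each, which gives 0x110 bytes. The function names are hypothetical, and treating the extra data as int32_t-sized slots is only an assumption at this point, since we don't yet know what it contains.

```c
#include <stdint.h>

enum {
    STANDARD_HEADER_SIZE = 0x110,  /* size of the standard header struct */
    OBSERVED_HEADER_SIZE = 0x158   /* header length seen in the game's file */
};

/* Number of int32_t fields in the standard header. */
static int standard_words(void)
{
    return STANDARD_HEADER_SIZE / (int)sizeof(int32_t);
}

/* Number of extra bytes in the modified header. */
static int extra_bytes(void)
{
    return OBSERVED_HEADER_SIZE - STANDARD_HEADER_SIZE;
}

/* The same surplus expressed as int32_t-sized slots (an assumption:
   the extra data may not actually be 32-bit fields). */
static int extra_words(void)
{
    return extra_bytes() / (int)sizeof(int32_t);
}
```

Under that assumption, the surplus is 0x48 bytes, or 18 slots, which would be enough room for nine additional offset/length pairs if that is what they turn out to be.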

How do we determine the correct ordering? There are two main ways, and they both suck:

  • Compare the tables in our metadata file with the one we created for the empty project, working through each table in the obfuscated file, looking for clustered patterns of similar data in the empty project metadata file, correlating the file location against the table list in the empty project metadata header to see which table it is, and adding it to the struct; the number of cross-comparisons can be cut down by also referring to the IL2CPP metadata struct definitions (see below)
  • Reverse engineer every IL2CPP function in the game assembly that uses a previously unread part of the metadata file to determine what file offsets it uses, and correlate it with the publicly available IL2CPP source code to see which table it is (if necessary)

Yikes. Luckily, having the IL2CPP library source code available plus the ability to generate arbitrary metadata files on demand with Unity makes our task much easier, but either approach will be time-consuming and error-prone.
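
Before comparing tables by eye, the search space for the first approach can be narrowed with a simple heuristic: scan the raw header words and keep only adjacent pairs that could plausibly be an (offset, length) pointing inside the file. This is a triage sketch of my own rather than anything taken from IL2CPP or Il2CppInspector, and plausible_table and find_candidate_pairs are hypothetical names. It assumes the header fields are still plain 32-bit integers, which may not hold for any part that is still encrypted.

```c
#include <stddef.h>
#include <stdint.h>

/* Heuristic: could (offset, length) describe a table that lies after the
   header and entirely inside the file? */
static int plausible_table(int32_t offset, int32_t length,
                           int32_t header_size, int32_t file_size)
{
    if (offset < header_size || length < 0)
        return 0;
    /* Written this way to avoid signed overflow in offset + length. */
    return offset <= file_size && length <= file_size - offset;
}

/* Slide over the header one word at a time (the pair alignment is unknown)
   and store the index of each plausible pair in out.
   Returns the number of indices stored, at most out_cap. */
static size_t find_candidate_pairs(const int32_t *words, size_t word_count,
                                   int32_t header_size, int32_t file_size,
                                   size_t *out, size_t out_cap)
{
    size_t found = 0;
    for (size_t i = 0; i + 1 < word_count && found < out_cap; i++) {
        if (plausible_table(words[i], words[i + 1], header_size, file_size))
            out[found++] = i;
    }
    return found;
}
```

The scan will produce false positives, for example a length that happens to look like a valid offset, so its output is only a list of places to look first, not a reconstructed layout.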

In this article I’m going to focus on the second approach, but first, let’s find one table using the first technique by way of example.