Implementing a native function detour in C#


A few weeks ago, I published the Detours.Win32Metadata NuGet package, which contains Win32 metadata for the Detours library. Combined with CsWin32, it lets you easily generate PInvoke signatures for the functions exported by the Detours library. Add NativeAOT compilation, and we are ready to implement a native function hook in C# and activate it in a remote process. In this post, we will create an example hook for the WriteFile WinAPI function. The full source code of the project resides in the detours-native-aot folder in my blog samples repository.

Implementing the hook DLL

We start by installing the required packages: Detours.Win32Metadata and Microsoft.Windows.CsWin32. Next, in the NativeMethods.txt file, we specify which PInvoke signatures we will use. In our sample hook, the list is not that long:

// Win32 functions
CreateFile
GetCurrentThread
GetModuleHandle
WriteFile

// Detours functions
DetourRestoreAfterWith
DetourUpdateThread
DetourAttach
DetourDetach
DetourTransactionBegin
DetourTransactionCommit
DetourFinishHelperProcess

// constants and enums
WIN32_ERROR

Notice that I don’t need any special files or additional steps to generate PInvoke signatures for the Detours library. I just put them in the same file as the other Win32 API functions. I love the simplicity of this solution, and I have been a big fan of the CsWin32 project since its launch 🙂 With the PInvokes generated, we are ready to define our WriteFile hook:

static class Hooks
{
    public static unsafe delegate* unmanaged[Stdcall]<HANDLE, byte*, uint, uint*, NativeOverlapped*, BOOL> OrigWriteFile = null;

    [UnmanagedCallersOnly(CallConvs = [typeof(CallConvStdcall)])]
    public static unsafe BOOL HookedWriteFile(HANDLE hFile, byte* lpBuffer, uint nNumberOfBytesToWrite, uint* lpNumberOfBytesWritten, NativeOverlapped* lpOverlapped)
    {
        Trace.WriteLine("HookedWriteFile");
        return OrigWriteFile(hFile, lpBuffer, nNumberOfBytesToWrite, lpNumberOfBytesWritten, lpOverlapped);
    }
}

HookedWriteFile is the hook function, that is, the function that will detour (or, in our case, wrap) the original WinAPI WriteFile function. I needed to mark it with the UnmanagedCallersOnly attribute because native code will call it directly (no marshaling). Therefore, its signature must match the native one. When creating a detour, we also need to save the address of the original function somewhere, so that we can call it from our hook. The OrigWriteFile field serves this purpose. It is a function pointer, so we can point it to a native address and call it like any other method (no need for delegates or marshaling). As you can see, our hook does not do much: it only prints a string to the debug output. You may put any logic you want here, as long as it can be compiled with NativeAOT (I will get to it in just a moment). It is time to implement the functions that will activate and deactivate our hook in a remote process:

public static class Exports
{
    [UnmanagedCallersOnly(CallConvs = [typeof(CallConvStdcall)], EntryPoint = "InitHooks")]
    internal static void InitHooks()
    {
        unsafe
        {
            var kernel32Handle = GetModuleHandle("kernel32.dll");
            if (kernel32Handle == 0 || !NativeLibrary.TryGetExport(kernel32Handle, "WriteFile", out var funcAddress))
            {
                Trace.WriteLine($"Error resolving test function address (hmodule: 0x{kernel32Handle:x})");
                return;
            }

            var origFuncPtr = (void*)funcAddress;
            delegate* unmanaged[Stdcall]<HANDLE, byte*, uint, uint*, NativeOverlapped*, BOOL> hookedFunc = &Hooks.HookedWriteFile;

            PInvokeDetours.DetourRestoreAfterWith();

            ThrowIfError(PInvokeDetours.DetourTransactionBegin());
            ThrowIfError(PInvokeDetours.DetourUpdateThread(PInvokeWin32.GetCurrentThread()));
            ThrowIfError(PInvokeDetours.DetourAttach(&origFuncPtr, hookedFunc));
            ThrowIfError(PInvokeDetours.DetourTransactionCommit());

            Hooks.OrigWriteFile = (delegate* unmanaged[Stdcall]<HANDLE, byte*, uint, uint*, NativeOverlapped*, BOOL>)origFuncPtr;
        }

        static unsafe nint GetModuleHandle(string moduleName)
        {
            var moduleNamePtr = Marshal.StringToHGlobalUni(moduleName);
            try
            {
                return PInvokeWin32.GetModuleHandle(new PCWSTR((char*)moduleNamePtr));
            }
            finally
            {
                Marshal.FreeHGlobal(moduleNamePtr);
            }
        }
    }

    [UnmanagedCallersOnly(CallConvs = [typeof(CallConvStdcall)], EntryPoint = "RemoveHooks")]
    internal static void RemoveHooks()
    {
        unsafe
        {
            // DetourDetach needs the trampoline address that InitHooks stored in OrigWriteFile
            var origFuncPtr = (void*)Hooks.OrigWriteFile;

            delegate* unmanaged[Stdcall]<HANDLE, byte*, uint, uint*, NativeOverlapped*, BOOL> hookedFunc = &Hooks.HookedWriteFile;

            ThrowIfError(PInvokeDetours.DetourTransactionBegin());
            ThrowIfError(PInvokeDetours.DetourUpdateThread(PInvokeWin32.GetCurrentThread()));
            ThrowIfError(PInvokeDetours.DetourDetach(&origFuncPtr, hookedFunc));
            ThrowIfError(PInvokeDetours.DetourTransactionCommit());

            // clear the pointer only after the detour is removed, so that the hook
            // does not call through a null pointer while it is still installed
            Hooks.OrigWriteFile = null;
        }
    }

    static void ThrowIfError(int err)
    {
        if (err != (int)WIN32_ERROR.NO_ERROR)
        {
            throw new System.ComponentModel.Win32Exception(err);
        }
    }
}

There is a lot going on here, so let me explain the important parts. The test loader that we will implement in the next section loads the hook DLL into the target process. In C++, we could use DllMain to set up the detours, but this approach won’t work for C#. Thus, the loader must explicitly call the InitHooks or RemoveHooks function to activate or deactivate the WriteFile hook. In InitHooks, we obtain the address of the original WriteFile function and assign it to a function pointer. DetourAttach, once the transaction is committed, installs the detour at this address and updates the pointer’s value with the address of the newly created trampoline to the original WriteFile function. We then save this pointer’s value in the Hooks.OrigWriteFile static field, so that we can use it in the hook.
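Both the hook and InitHooks rely on unmanaged function pointers, so here is a minimal standalone sketch of that mechanism, without Detours. The FunctionPointerDemo class and the AddNative method are hypothetical names used only for illustration; the project must allow unsafe code.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

static unsafe class FunctionPointerDemo
{
    // A method callable only through a native-style call (no marshaling stubs),
    // just like HookedWriteFile. AddNative is a made-up example method.
    [UnmanagedCallersOnly(CallConvs = new[] { typeof(CallConvStdcall) })]
    static int AddNative(int a, int b) => a + b;

    public static int CallThroughPointer(int a, int b)
    {
        // Take the address of the method as an unmanaged function pointer...
        delegate* unmanaged[Stdcall]<int, int, int> fn = &AddNative;

        // ...and call it like a regular method. Hooks.OrigWriteFile is used
        // the same way, except that it points at the Detours trampoline.
        return fn(a, b);
    }
}
```

Calling FunctionPointerDemo.CallThroughPointer(2, 3) returns 5. The pointer type, not a delegate, carries the calling convention and the signature, which is why the hook signature has to mirror the native WriteFile prototype exactly.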

To make the InitHooks and RemoveHooks methods callable, we need to export them. The easiest way to do so is to use the EntryPoint parameter of the UnmanagedCallersOnly attribute. The string that we assign to this parameter will be the name of the native export after we publish the DLL. If you need finer control over the generated exports (for example, you would like to assign specific ordinal numbers to the exported functions), you have to disable the automatic export generation by setting IlcExportUnmanagedEntrypoints to false in the project file and use IlcArg items to define the exports. Here is an example project file where we explicitly configure the exports, additionally adding DetourFinishHelperProcess to make it work with the DetourCreateProcessWithDlls function:

<Project Sdk="Microsoft.NET.Sdk">

    <PropertyGroup>
        <TargetFramework>net8.0-windows</TargetFramework>
        <RootNamespace>testdll</RootNamespace>
        <ImplicitUsings>enable</ImplicitUsings>
        <Nullable>enable</Nullable>
        <PublishAot>true</PublishAot>
        <AllowUnsafeBlocks>true</AllowUnsafeBlocks>
        <IlcExportUnmanagedEntrypoints>false</IlcExportUnmanagedEntrypoints>
    </PropertyGroup>

    <ItemGroup>
        <IlcArg Include="--export-dynamic-symbol:DetourFinishHelperProcess,@1" />
        <IlcArg Include="--export-dynamic-symbol:InitHooks,@2" />
        <IlcArg Include="--export-dynamic-symbol:RemoveHooks,@3" />
    </ItemGroup>

    <ItemGroup>
        <PackageReference Include="Detours.Win32Metadata" Version="4.0.1.12" />
        <PackageReference Include="Microsoft.Windows.CsWin32" Version="0.3.106">
            <PrivateAssets>all</PrivateAssets>
            <IncludeAssets>runtime; build; native; contentfiles; analyzers; buildtransitive</IncludeAssets>
        </PackageReference>
    </ItemGroup>
</Project>

The last step is to publish the DLL:

dotnet publish -c Debug -r win-x64

The above command should produce a natively compiled DLL. win-x86 is not available as a publish target for NativeAOT in .NET 8, but support for it is coming in .NET 9. The loader we will implement in the next section will support both 32-bit and 64-bit hooking DLLs.

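If we also want to link the Detours library statically instead of using an external DLL, we need to add the following content to the csproj file (note that DirectPInvoke only turns the PInvokes into direct calls; the linker additionally needs a static detours.lib, passed through a NativeLibrary item, which is not shown here):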

<ItemGroup>
    <DirectPInvoke Include="detours" />
</ItemGroup>
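Before injecting the published DLL into another process, it may be handy to check in the current process that an export can be resolved and called. The following is only a sketch: the ExportSmokeTest class is a hypothetical helper, and it uses kernel32!GetCurrentProcessId as a harmless stand-in for a parameterless export.

```csharp
using System;
using System.Runtime.InteropServices;

static unsafe class ExportSmokeTest
{
    // Loads a native library into the current process, resolves a parameterless
    // export that returns a 32-bit value, and calls it through a function pointer.
    public static uint CallUInt32Export(string libraryPath, string exportName)
    {
        // The module is intentionally left loaded: a DLL with installed hooks
        // must stay in memory for as long as the hooks are active.
        nint module = NativeLibrary.Load(libraryPath);

        // GetExport throws EntryPointNotFoundException when the export is missing.
        nint address = NativeLibrary.GetExport(module, exportName);

        var fn = (delegate* unmanaged[Stdcall]<uint>)address;
        return fn();
    }
}
```

For the hook DLL, the same pattern would apply with the pointer cast to delegate* unmanaged[Stdcall]<void> and "InitHooks" as the export name. Keep in mind that this would hook WriteFile in the calling process itself.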

Implementing a test loader

Our first task is to inject a DLL into a remote process. We can easily do that by using the CreateRemoteThread function, but first we need to know the address of kernel32!LoadLibraryW. We will start by getting the kernel32 base address (handle) in the remote process:

static unsafe (HMODULE, string moduleFullPath) GetModuleInfo(HANDLE processHandle, bool isWow64, string moduleName)
{
    const uint MaxModulesNumber = 256;

    var moduleHandles = stackalloc HMODULE[(int)MaxModulesNumber];
    uint cb = MaxModulesNumber * (uint)Marshal.SizeOf<HMODULE>();
    uint cbNeeded = 0;

    PInvoke.EnumProcessModulesEx(processHandle, moduleHandles, cb, &cbNeeded,
        isWow64 ? ENUM_PROCESS_MODULES_EX_FLAGS.LIST_MODULES_32BIT : ENUM_PROCESS_MODULES_EX_FLAGS.LIST_MODULES_64BIT);

    if (cb >= cbNeeded)
    {
        moduleName = Path.DirectorySeparatorChar + moduleName.ToUpper();
        var nameBuffer = stackalloc char[(int)PInvoke.MAX_PATH];
        foreach (var iterModuleHandle in new Span<HMODULE>(moduleHandles, (int)(cbNeeded / Marshal.SizeOf<HMODULE>())))
        {
            if (PInvoke.GetModuleFileNameEx(processHandle, iterModuleHandle, nameBuffer,
                    PInvoke.MAX_PATH) is var iterModuleNameLength && iterModuleNameLength > moduleName.Length)
            {
                var iterModuleNameSpan = new Span<char>(nameBuffer, (int)iterModuleNameLength);
                if (IsTheRightModule(iterModuleNameSpan))
                {
                    return (iterModuleHandle, new string(iterModuleNameSpan));
                }
            }
        }
    }

    return ((HMODULE)nint.Zero, "");

    bool IsTheRightModule(ReadOnlySpan<char> m)
    {
        var moduleNameSpan = moduleName.AsSpan();
        for (int i = 0; i < moduleNameSpan.Length; i++)
        {
            if (char.ToUpper(m[i + m.Length - moduleNameSpan.Length]) != moduleNameSpan[i])
            {
                return false;
            }
        }
        return true;
    }
}

32-bit processes running under WOW64 contain both 32-bit and 64-bit system DLLs. The above method will pick the system DLL version with the same bitness as the target process. Unfortunately, the module path returned by GetModuleFileNameEx might be incorrect when it is queried from a 64-bit process (for example, C:\Windows\System32 instead of C:\Windows\SysWOW64), so we will set it manually (I did not find a WinAPI function that would return the correct path in this case, so please leave a comment if you know one):

string systemDirectory = isWow64 ? Environment.GetFolderPath(Environment.SpecialFolder.SystemX86) : Environment.SystemDirectory;
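The isWow64 flag comes from an IsWow64 helper that is not listed in this post. A minimal sketch of such a helper could look as follows, assuming a plain DllImport declaration of IsWow64Process instead of the CsWin32-generated one and a raw nint handle instead of HANDLE; the Wow64Helper class name is hypothetical.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

static class Wow64Helper
{
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool IsWow64Process(nint hProcess,
        [MarshalAs(UnmanagedType.Bool)] out bool wow64Process);

    // Returns true when the process behind the handle is a 32-bit process
    // running on 64-bit Windows. The handle must have been opened with the
    // PROCESS_QUERY_INFORMATION or PROCESS_QUERY_LIMITED_INFORMATION right.
    public static bool IsWow64(nint processHandle)
    {
        if (!IsWow64Process(processHandle, out bool isWow64))
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
        return isWow64;
    }
}
```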

And we will use the PEReader class to find the offset of the LoadLibraryW function:

static unsafe uint GetModuleExportOffset(string modulePath, string procedureName)
{
    using var pereader = new PEReader(File.OpenRead(modulePath));

    var exportsDirEntry = pereader.PEHeaders.PEHeader!.ExportTableDirectory;
    var exportsDir = (IMAGE_EXPORT_DIRECTORY*)pereader.GetSectionData(exportsDirEntry.RelativeVirtualAddress).Pointer;

    var functionNamesRvas = new Span<uint>(pereader.GetSectionData((int)exportsDir->AddressOfNames).Pointer,
                                            (int)exportsDir->NumberOfNames);
    var functionNamesOrdinals = new Span<ushort>(pereader.GetSectionData((int)exportsDir->AddressOfNameOrdinals).Pointer,
                                                    (int)exportsDir->NumberOfNames);
    var addressOfFunctions = pereader.GetSectionData((int)exportsDir->AddressOfFunctions).Pointer;

    for (int i = 0; i < functionNamesRvas.Length; i++)
    {
        var name = Marshal.PtrToStringAnsi((nint)pereader.GetSectionData((int)functionNamesRvas[i]).Pointer);
        var index = functionNamesOrdinals[i];

        if (name == procedureName)
        {
            return *(uint*)(addressOfFunctions + index * sizeof(uint));
        }
    }

    return 0;
}
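One way to sanity-check the value returned by GetModuleExportOffset is to compute the same offset from a module loaded in the current process. This is a different, simpler technique that works only when the test process has the same bitness as the inspected module and the export is not a forwarder. The ExportOffsetCheck class below is a hypothetical helper, not part of the sample project.

```csharp
using System;
using System.Runtime.InteropServices;

static class ExportOffsetCheck
{
    // On Windows, a module handle is the base address of the loaded image,
    // so the export's offset is its address minus the module handle.
    public static long GetLocalExportOffset(string moduleName, string exportName)
    {
        nint moduleBase = NativeLibrary.Load(moduleName);
        nint exportAddress = NativeLibrary.GetExport(moduleBase, exportName);
        return (long)exportAddress - (long)moduleBase;
    }
}
```

In a 64-bit test process, ExportOffsetCheck.GetLocalExportOffset("kernel32.dll", "LoadLibraryW") should match the value that GetModuleExportOffset reads from the kernel32.dll file in Environment.SystemDirectory.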

We will also define helper functions to work with memory in the remote process and to create a remote thread:

static void CallFunctionInRemoteProcess(HANDLE processHandle, nint fnAddress, nint arg0 = 0)
{
    unsafe
    {
        if ((HANDLE)CreateRemoteThread(processHandle, null, 0, fnAddress, arg0, 0, null) is var remoteThreadHandle &&
            remoteThreadHandle == (HANDLE)0)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }

        try
        {
            if (PInvoke.WaitForSingleObject(remoteThreadHandle, 5000) is var err && err == WAIT_EVENT.WAIT_TIMEOUT)
            {
                throw new Win32Exception((int)WIN32_ERROR.ERROR_TIMEOUT);
            }
            else if (err == WAIT_EVENT.WAIT_FAILED)
            {
                throw new Win32Exception(Marshal.GetLastWin32Error());
            }
        }
        finally
        {
            PInvoke.CloseHandle(remoteThreadHandle);
        }
    }
}

static nint AllocAndWriteData(HANDLE remoteProcessHandle, Span<byte> data)
{
    unsafe
    {
        var allocAddr = PInvoke.VirtualAllocEx(remoteProcessHandle, null, (nuint)data.Length,
            VIRTUAL_ALLOCATION_TYPE.MEM_RESERVE | VIRTUAL_ALLOCATION_TYPE.MEM_COMMIT, PAGE_PROTECTION_FLAGS.PAGE_READWRITE);
        if (allocAddr != null)
        {
            // VirtualAllocEx initializes memory to 0
            fixed (void* dataPtr = data)
            {
                if (!PInvoke.WriteProcessMemory(remoteProcessHandle, allocAddr, dataPtr, (nuint)data.Length, null))
                {
                    throw new Win32Exception(Marshal.GetLastWin32Error());
                }
            }
            return (nint)allocAddr;
        }
        else
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }
    }
}

static void FreeMemory(HANDLE remoteProcessHandle, nint allocAddr)
{
    unsafe
    {
        PInvoke.VirtualFreeEx(remoteProcessHandle, (void*)allocAddr, 0, VIRTUAL_FREE_TYPE.MEM_RELEASE);
    }
}
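LoadLibraryW expects a null-terminated UTF-16 string, so the DLL path has to be encoded accordingly before it is passed to AllocAndWriteData. A small sketch of that encoding step follows; RemoteStringEncoding is a hypothetical helper name.

```csharp
using System;
using System.Text;

static class RemoteStringEncoding
{
    // Encoding.Unicode is UTF-16 little endian, which is what the W variants
    // of WinAPI functions expect. The explicit "\0" adds the two-byte terminator.
    public static byte[] ToNullTerminatedUtf16(string value)
    {
        return Encoding.Unicode.GetBytes(value + "\0");
    }
}
```

The memory returned by VirtualAllocEx is already zeroed, so the explicit terminator is not strictly required, but it keeps the buffer self-describing and does not depend on the allocation details.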

And we are ready to implement the tests:

const uint processId = 12345; // set it to a valid process ID

const PROCESS_ACCESS_RIGHTS AccessRightsForCreatingRemoteThread = PROCESS_ACCESS_RIGHTS.PROCESS_CREATE_THREAD |
        PROCESS_ACCESS_RIGHTS.PROCESS_QUERY_INFORMATION | PROCESS_ACCESS_RIGHTS.PROCESS_VM_OPERATION |
        PROCESS_ACCESS_RIGHTS.PROCESS_VM_WRITE | PROCESS_ACCESS_RIGHTS.PROCESS_VM_READ;

[Test]
public static void SetHook()
{
    // find a process to hook
    var processHandle = PInvoke.OpenProcess(AccessRightsForCreatingRemoteThread, false, processId);

    bool isWow64 = IsWow64(processHandle);

    string systemDirectory = isWow64 ? Environment.GetFolderPath(Environment.SpecialFolder.SystemX86) : Environment.SystemDirectory;

    // load a hook dll
    var hookDllName = isWow64 ? "hook_x86.dll" : "hook_x64.dll";
    var hookDllPath = Path.Combine(AppContext.BaseDirectory, hookDllName);
    var allocAddr = AllocAndWriteData(processHandle, Encoding.Unicode.GetBytes(hookDllPath + "\0").AsSpan());
    try
    {
        Assert.That(GetModuleInfo(processHandle, isWow64, "kernel32.dll") is var (kernel32Handle, _) && kernel32Handle != 0);

        var kernel32Path = Path.Combine(systemDirectory, "kernel32.dll");
        var fnLoadLibraryW = kernel32Handle + (nint)GetModuleExportOffset(kernel32Path, "LoadLibraryW");
        CallFunctionInRemoteProcess(processHandle, fnLoadLibraryW, allocAddr);
    }
    finally
    {
        FreeMemory(processHandle, allocAddr);
    }

    // set hooks
    Assert.That(GetModuleInfo(processHandle, isWow64, hookDllName) is var (hookDllHandle, _) && hookDllHandle != 0);
    var fnInitHooks = hookDllHandle + (nint)GetModuleExportOffset(hookDllPath, "InitHooks");
    CallFunctionInRemoteProcess(processHandle, fnInitHooks);
}

[Test]
public static void UnsetHook()
{
    var processHandle = PInvoke.OpenProcess(AccessRightsForCreatingRemoteThread, false, processId);

    bool isWow64 = IsWow64(processHandle);
    
    var hookDllName = isWow64 ? "hook_x86.dll" : "hook_x64.dll";
    var hookDllPath = Path.Combine(AppContext.BaseDirectory, hookDllName);

    // unset hooks
    Assert.That(GetModuleInfo(processHandle, isWow64, hookDllName) is var (hookDllHandle, _) && hookDllHandle != 0);
    var fnRemoveHooks = hookDllHandle + (nint)GetModuleExportOffset(hookDllPath, "RemoveHooks");
    CallFunctionInRemoteProcess(processHandle, fnRemoveHooks);
}

Now, run any process that writes to a file, set its process ID in the test, and run SetHook. You should start seeing HookedWriteFile messages in the system global debug output (for example, you may use DebugView to watch it). Of course, setting the process ID manually is neither convenient nor automatic, but it’s only a POC, so please close your eyes to that 🙂
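If you would rather not hardcode the process ID, one option is to look the target up by name. This is only a sketch: ProcessLookup and FindProcessId are hypothetical names, and the processId constant in the tests would have to become a regular variable.

```csharp
using System;
using System.Diagnostics;

static class ProcessLookup
{
    // Returns the ID of the first process with the given name
    // (the name is passed without the .exe extension).
    public static uint FindProcessId(string processName)
    {
        Process[] processes = Process.GetProcessesByName(processName);
        try
        {
            if (processes.Length == 0)
            {
                throw new InvalidOperationException($"Process '{processName}' is not running.");
            }
            return (uint)processes[0].Id;
        }
        finally
        {
            // Process objects hold native handles, so release them explicitly
            foreach (Process process in processes)
            {
                process.Dispose();
            }
        }
    }
}
```

When several processes share the same name, this picks an arbitrary one, so a real test would probably start the target process itself and use the ID it gets back.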