Thursday, December 12, 2024

Windows Tooling Updates: OleView.NET

Posted by James Forshaw, Google Project Zero

This is a short blog post about some recent improvements I've been making to the
OleView.NET tool, which have been released as part of version 1.16. The tool is designed to discover the attack surface of Windows COM and find security vulnerabilities such as privilege escalation and remote code execution. The updates were recently presented at the Microsoft BlueHat conference in Redmond under the name "DCOM Research for Everyone!". This blog expands on the topics discussed there to give a bit more background and detail that couldn't fit within the 45-minute timeslot. This post assumes a knowledge of COM as I'm only going to describe a limited number of terms.

Using the OleView.NET Tooling

Before we start the discussion it's important to understand how you can get hold of the OleView.NET tool and some basic usage. The simplest way to get the tooling is to install it from the PowerShell gallery with the Install-Module OleViewDotNet command. This installs both the PowerShell module and the GUI.

Next you need to parse the COM registration artifacts into an internal database. You can do this by running the Get-ComDatabase command. You'll notice that it can take a long time to complete, and it would be annoying to have to do this every time you want to start researching. For that reason you can use the Set-ComDatabase -Default command to write the database out to a default storage location. The next time you start PowerShell you can just run an inspection command, such as Get-ComClass, and the default database will be loaded automatically.
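Putting the installation and database setup together, a first session looks something like this:

PS> Install-Module OleViewDotNet
PS> Get-ComDatabase
PS> Set-ComDatabase -Default
PS> # In a later PowerShell session the default database loads automatically:
PS> Get-ComClass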

This default database is also shared with the GUI, which you can start by running the Show-ComDatabase command. For general research I find the GUI to be easier to use and you can click around and look at the COM registration information. For analysis, the ability to script through PowerShell is more important.

Researching COM Services

Performing security research in COM usually involves the following steps:

  • Enumerate potential COM classes of interest. These might be classes which are accessible outside of a sandbox, running at high privilege or designed to be remotely exposed.
  • Validate whether the COM classes are truly accessible from the attack position. COM has various security controls which determine what users can launch, activate and access an object. Understanding these security controls allows the list of COM classes of interest to be limited to only those that are actually part of the attack surface.
  • Enumerate exposed interfaces, determine what they do and call methods on them to test for security vulnerabilities.

The last step is the focus of the updates to the tooling, making it easier to determine what an exposed interface does and call methods to test the behavior. The goal is to minimize the amount of reverse engineering needed (although generally some is still required) as well as avoid needing to write code outside of the tooling to interact with the COM service under test.

To achieve this goal, OleView.NET will pull together any sources of interface information it has, then provide a mechanism to inspect and invoke methods on the interface through the UI or via PowerShell. The sources of information that it currently pulls together are:

  1. Known interfaces, either defined in the base .NET framework class libraries or inside OleView.NET.
  2. COM interface definitions present in the Global Assembly Cache.
  3. Registered type libraries.
  4. Windows Runtime interfaces.
  5. Extracted proxy class marshaling information.

One useful benefit of gathering this information is that the tool formats the interface as "source code" so you can manually inspect it.

Formatting Interface Definitions

The OleView.NET tool uses a database object to represent all the artifacts it has analyzed on your system. The latest released version defines some of these objects to be convertible to "source code". For example, the following can be converted if the tool can determine metadata to represent the artifact:

  • COM interfaces
  • COM proxies
  • COM Windows Runtime classes.
  • Type libraries, interfaces and classes.

How you get to this conversion depends on whether you're using PowerShell or the GUI. The simplest approach is PowerShell, using the ConvertTo-ComSourceCode command. For example, the following will convert an interface object into source code:

PS> Get-ComInterface -Name IMyInterface | ConvertTo-ComSourceCode -Parse

Note that we also need to pass the -Parse option to the command. Some metadata, such as type libraries and proxies, can be expensive to parse so the tool won't do that automatically. However, once it's been parsed in the current session the metadata is cached for further use; for example, if you formatted a single interface in a type library, all other interfaces in that library are now also parsed and can be formatted.

The output of this command is the converted "source code" as text. The format depends on the metadata source. For example, the following is the output from a Windows Runtime type:

[Guid("155eb23b-242a-45e0-a2e9-3171fc6a7fdd")]

interface IUserStatics

{

    /* Methods */

    UserWatcher CreateWatcher();

    IAsyncOperation<IReadOnlyList<User>> FindAllAsync();

    IAsyncOperation<IReadOnlyList<User>> FindAllAsync(UserType type);

    IAsyncOperation<IReadOnlyList<User>> FindAllAsync(UserType type, 

                                       UserAuthenticationStatus status);

    User GetFromId(string nonRoamableId);

}

As Windows Runtime types are defined using metadata similar to .NET, the output is a pseudo C# format. In contrast, the output for a type library or proxy looks more like the following:

[
    odl,
    uuid(00000512-0000-0010-8000-00AA006D2EA4),
    dual,
    oleautomation,
    nonextensible
]
interface _Collection : IDispatch {
    [id(1), propget]
    HRESULT Count([out, retval] int* c);
    [id(0xFFFFFFFC), restricted]
    HRESULT _NewEnum([out, retval] IUnknown** ppvObject);
    [id(2)]
    HRESULT Refresh();
};

This is the Microsoft Interface Definition Language (MIDL) format. The type library version should be pretty accurate and could even be recompiled by the MIDL compiler. For proxies some of the information is lost, so the generated MIDL isn't completely accurate, but as we'll see later there's limited reason to take the output and recompile it.

Another thing to note is that proxies lose name information when compiled from MIDL to their C marshaled representation. Therefore the tool just generates placeholder names; for example, method names are of the form "ProcN". If the proxy is for a type that has a known definition, such as a Windows Runtime type or a type library, then the tool will try to apply the names automatically. If not, you'll need to change them manually if you want them to be anything other than the default.

You can change the names from PowerShell by modifying the proxy object directly. For example the "IBitsTest1" interface looks like the following before doing anything:

[
  object,
  uuid(51A183DB-67E0-4472-8602-3DBC730B7EF5),
]
interface IBitsTest1 : IUnknown {
    HRESULT Proc3([out, string] wchar_t** p0);
}

You can modify "Proc3" with the following script:

PS> $proxy = Get-ComProxy -Iid 51A183DB-67E0-4472-8602-3DBC730B7EF5
PS> $proxy.Procedures[0].Name = "GetBitsDllPath"
PS> $proxy.Procedures[0].Parameters[0].Name = "DllPath"

Now the formatted output looks like the following:

[
  object,
  uuid(51A183DB-67E0-4472-8602-3DBC730B7EF5),
]
interface IBitsTest1 : IUnknown {
    HRESULT GetBitsDllPath([out, string] wchar_t** DllPath);
}

This renaming will also be important when we come back to calling proxied methods. Obviously it'd be annoying to run this script every time, so you can cache the names using the following command:

PS> Export-ComProxyName -Proxy $proxy -ToCache

This will write out a file describing the names to a local cache file. When the proxy is loaded again in another session this cache file will be automatically applied. The Export-ComProxyName and corresponding Import-ComProxyName commands allow you to read and write XML or JSON files representing the proxy names which you can modify in a text editor if that's easier.
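For example, something like the following could round-trip the names through an editable JSON file; note that the -Path parameter name here is my assumption, so check Get-Help Export-ComProxyName for the actual syntax:

PS> # Assumed parameter names; the commands support XML/JSON files as described above.
PS> Export-ComProxyName -Proxy $proxy -Path proxy_names.json
PS> Import-ComProxyName -Proxy $proxy -Path proxy_names.json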

One of the quickest wins is to enumerate the interfaces for a COM object, then pass the output of that through the ConvertTo-ComSourceCode command. For example:

PS> $obj = New-ComObject -Clsid 4575438f-a6c8-4976-b0fe-2f26b80d959e
PS> Get-ComInterface -Object $obj | ConvertTo-ComSourceCode -Parse

This creates a new COM object based on its CLSID, enumerates the interfaces it supports and then passes them through the conversion process to get out a "source code" representation of the interfaces.

To view the source code in the GUI you first need to open one of the database views from the Registry menu. In the resulting window, there will be a tree view of artifacts. You need to open the source code viewer window by right clicking the tree and selecting the Show Source Code option in the context menu. This will result in a view similar to the following:

Screenshot showing the tooling, with the resulting window as described in the paragraph above

You can also automatically enable the source code view from the View→Registry View Options menu. In that menu you can also enable automatically parsing the interface information, which is off by default.

You might notice in the screenshot that there's some text which is underlined. This indicates names which can be changed, which currently only applies to proxies. You can right click the name and choose Edit Name from the context menu to bring up a text entry dialog, then change the name to suit. If you want to persist the names between sessions, set the Save Proxy Names on Exit option in the registry view options. Then, when you exit, any modified proxies will be written to the cache.

If you want to edit a proxy from PowerShell in a similar GUI you can use the following command:

PS> Edit-ComSourceCode $proxy

This will show a dialog similar to the following where you can do edits to the proxy name information:

Screenshot showing where you can do edits to the proxy name

Generating Interfaces from a Proxy Definition

Now on to the more important side of these updates, the ability to invoke methods on the interfaces exposed by an object you want to research. The tool has always given you some ability to invoke methods as long as the object has a .NET interface to call through reflection. This could either be through a known interface type, such as a built-in one or the Windows Runtime interfaces or by converting a type library into a .NET assembly on demand.

What's new is the ability to generate an interface based on a proxy definition and then use that to invoke methods. Initially I tried to implement this by generating a .NET interface dynamically which would then use the existing .NET interop to call the proxy methods. This worked fine for simple proxies but quickly hit problems when doing anything more complex:

  • Some types are hard to represent in easy to use .NET types, such as pointers to structures. This is "handled" in the type library converter by just exporting them as IntPtr parameters which means the caller has to manually marshal the data. Get this wrong and the tool crashes.
  • Any structures need to be accurately laid out so the native marshaler can read and write to the correct field locations. Get this wrong and the tool crashes.
  • Did I mention that if you get this wrong the tool crashes?

Fortunately I already had a solution: my sandbox library already had the ability to dynamically generate a .NET class from parsed NDR data. In fact I was already using the library to parse the NDR data for proxies, so I realized I could repurpose the existing client builder for COM proxy clients. I needed to do some simple refactoring of the code to make it build from a COM proxy instance rather than an RPC server, but I quickly had a working RPC client. This RPC client doesn't directly interact with any native marshaling code, so it's unlikely to crash. Also, any complex structures are built in a way which makes them easy to modify from .NET, removing the problems around pointers. One issue with using the RPC client approach is that the same interface could be used for both in-process and out-of-process objects. Due to the way COM is designed a client usually doesn't need to care about where the object is, but in this case the object must be accessible via a proxy. This isn't that big an issue: there's no security boundary between in-process COM objects, so being able to call methods on them isn't that interesting.

The next problem was the RPC transport. COM calls have an additional input and output parameter, the ORPCTHIS and ORPCTHAT structures, that need to be added to the call. These parameters could have been added to the RPC client, but it seemed best to keep the clients agnostic of the transport. Instead, as my RPC code has a pluggable transport layer, I was able to implement a custom version on top of the existing ALPC and TCP transports which added the additional parameters to any call. That wasn't the end of it though: ALPC needs an additional pair of parameters, LocalThis and LocalThat, which are potentially different depending on the version of Windows. You also need to add support for additional services such as the OXID resolver and communication with the local DCOM activator. While I implemented all this, it wasn't as reliable as I'd like; however, it's still present in the source code if you want to play with it.

As an aside, I should point out that Clement Rouault, one of the original researchers into the ALPC RPC protocol and an inspiration for parts of my own implementation, recently released a very similar project for their Python tooling which implements the ALPC DCOM protocol.

I decided that I'd need a different approach. In the COM runtime, the RPC channel used by a proxy instance is represented by the IRpcChannelBuffer interface. An object implementing this interface is connected to the proxy during initialization, and it is then used to send and receive NDR formatted data between the client and the server. The implementation handles all the idiosyncrasies such as the additional parameters, OXID resolving and reference counting. If we could get hold of a proxy object's instance of the IRpcChannelBuffer object, we could use that instead of implementing our own protocol; the challenge was how to get it.

After a bit of research I found that we can use the documented NdrProxyInitialize function to get hold of the interface from its MIDL_STUB_MESSAGE structure by passing in the interface pointer of a proxy. While it wouldn't be as flexible as a fully custom implementation, this gave me an easy way to handle the transport without worrying about platform or protocol differences. It could also work with an existing COM object: just query for the appropriate interface, extract the buffer and make calls to the remote server.

Of course nothing is that simple. I discovered that while the IRpcChannelBuffer object is a COM object, it has a broken implementation of IMarshal. As .NET's COM interop tries to query for IMarshal when generating a Runtime Callable Wrapper, it will immediately crash the process. I had to manually dispatch the calls to the methods through native delegates, but at least it works.

Calling Interface Methods

Okay, so how do you use the tool to call arbitrary methods? For the GUI it works like it always has: when you create an instance of a COM object, usually by right clicking an entry in a view and selecting Create Instance, you'll get a new object information window similar to the following:

Screenshot showing what happens when you right-click on an entry in a view and select create instance

At the bottom of the window is a list of supported interfaces. In the right column is an indicator if there's a viewer for that interface. If it's set to Yes, then you can double click it to bring up an invocation window like the following:

Screenshot of the OleView .NET tooling showing the invoked method

From this window you can double click a method to bring up a new dialog where you can specify the arguments and invoke the method as shown below.

Screenshot of Invoke GetBitsDllPath showing that the operation completed successfully

Once invoked, it'll show the resulting output parameters and, if the return value is an integer, it'll assume it's a HRESULT error code. These windows are the same for "reflected" interfaces, such as type libraries and Windows Runtime interfaces, as well as proxy clients. The names of proxy methods won't be automatically updated if you change them while the interface window is open. You'll need to go back to the object information window and double click the interface again to get it to recreate the client.

For PowerShell you can specify an Iid argument when using the New-ComObject command or use the Get-ComObjectInterface command to query an existing COM object for a new interface. The tooling will pick the best option for calling the interface from the options available to it, including generating the RPC client dynamically.

PS> $obj = New-ComObject -Clsid 4991D34B-80A1-4291-83B6-3328366B9097
PS> $test = Get-ComObjectInterface $obj -Iid 51A183DB-67E0-4472-8602-3DBC730B7EF5
PS> $test.GetBitsDllPath()

DllPath                        retval
-------                        ------
c:\windows\system32\qmgr.dll        0

To make it easier to call interface methods from PowerShell, the exposed methods on the object are modified to wrap output parameters in a single return value. You can see this in the listing above: the DllPath parameter was originally an output-only parameter. Rather than deal with that in the script, a return structure was automatically created containing the DllPath as well as the HRESULT return value. If a parameter is both an input and an output then the method signature accepts the input value and the return value contains the output value.
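For example, continuing the call above, the wrapped result can be accessed like this (a small sketch based on the listing shown earlier):

PS> $result = $test.GetBitsDllPath()
PS> $result.DllPath
c:\windows\system32\qmgr.dll
PS> $result.retval
0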

If the definitions for your interface don't already exist you can import them into the tool to be used by the automatic interface selection. To do this you'll need to define the interfaces as .NET types and compile them into an assembly. Then in the GUI use the File→Import Interop Assembly menu option, or for PowerShell use the Add-ComObjectInterface command. Both of these options allow you to specify that the assembly should be automatically loaded the next time you start the tool. This makes a copy of the DLL in a central location so that it can still be accessed even if you delete the original later.

If all you have is an IDL file for a set of COM interfaces, you can import them into the tool indirectly with help from the Windows SDK. First compile the IDL file using the MIDL compiler to generate a type library, then use the TLBIMP tool to generate an interop assembly from the type library. Finally you can import it using the methods from the previous paragraph.
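As a rough sketch of that workflow, where my_interfaces.idl is a placeholder file name, the commands from a Visual Studio developer prompt would look something like:

midl my_interfaces.idl /tlb my_interfaces.tlb
tlbimp my_interfaces.tlb /out:MyInterfaces.Interop.dll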

There's plenty to discover in OleView.NET which I've not covered here. I'd encourage you to play around, or check out the source code on GitHub.

Thursday, November 21, 2024

Simple macOS kernel extension fuzzing in userspace with IDA and TinyInst

Posted by Ivan Fratric, Google Project Zero

Recently, one of the projects I was involved in had to do with video decoding on Apple platforms, specifically AV1 decoding. On Apple devices that support the AV1 video format (starting from the Apple A17 on iOS / M3 on macOS), decoding is done in hardware. Despite this, a large part of the AV1 format parsing happens in software, inside the kernel, more specifically inside the AppleAVD kernel extension (or at least, that used to be the case in macOS 14 / iOS 17). As fuzzing is one of the techniques we employ regularly, the question of how to effectively fuzz this code inevitably came up.

It should be noted that I wasn’t the first person to look into the problem of Apple kernel extension fuzzing, so before going into the details of my approach, other projects in this space should be mentioned.

In the Fairplay research project, @pwn0rz utilized a custom loader to load the kernel extension into userspace. A coworker tried to run this code on the current AppleAVD extension, but it didn't work for them (at least not out of the box), so we didn't end up using it. It should be noted here that my approach also loads the kernel code into userspace, albeit in a more lightweight way.

In the Cinema time! presentation at Hexacon 2022, Andrey Labunets and Nikita Tarakanov presented their approach for fuzzing AppleAVD, in which the decompiled code was first extracted using IDA and then rebuilt. I have used this approach in the past in some more constrained scenarios; however, the decompiled code from IDA is not perfect and manual fixing was often required (for example, when IDA would get the stack layout of a function wrong).

In the KextFuzz project, Tingting Yin and co-authors statically instrumented kernel extensions by replacing pointer authentication instructions with jumps to a coverage-collecting trampoline, which results in partial coverage.

Most recently, the Pishi project by Meysam Firouzi was released just before this research. The project statically instruments kernel extension code by using Ghidra to identify all basic blocks, and then replacing one instruction from each basic block with a branch to a dedicated trampoline. The trampoline records the coverage, executes the replaced instruction and jumps back to the address of the next instruction. This was reported to run on a real device.

Given the existence of these other projects, it is worth saying that my goal was not necessarily to create the "best" method for kernel extension fuzzing, but the one that was, for me, the simplest (if we don't count the underlying complexity of the off-the-shelf tools being used). In short, my approach, which will be discussed in detail in the following sections, was:

  1. Load AppleAVD extension or full kernelcache into IDA
  2. Rebase the module to an address that can be reliably allocated in userspace
  3. Export raw memory using an IDA Python script
  4. Load exported bytes using custom loader
  5. Use custom TinyInst module to hook and instrument the extension
  6. Use Jackalope for fuzzing

All the project code can be found here. Various components will be explained in more detail throughout the rest of the blog post.

Extracting kernel extension code

Normally, on macOS, kernel extensions are packaged inside "kernel collection" files that serve as containers for multiple extensions. At first OS boot (and whenever something changes with regard to kernel extensions), the kernel extensions needed by the machine are repackaged into what is called the "kernel cache" (the kernelcache file on the filesystem). Kernel extensions can be extracted from these caches and collections, but existing tooling can't really produce individual .dylib files that can be loaded into userspace and run without issues.

However, reverse engineering tooling, specifically IDA Pro which I used in this research, comes with a surprisingly good loader for the Apple kernel cache. I haven't checked how other reverse engineering tools compare, but if they are comparable and someone would like to contribute to the project, I would gladly accept export scripts for those tools.

So, instead of writing our own loader, we can simply piggyback on IDA’s. The idea is simple:

  • we let IDA load the kernel extension we want (or even the entire kernelcache)
  • we use IDA to rebase the code so it’s in memory range that is mappable in userspace (see image)
  • using a simple IDA Python script, we export for each memory segment its start address, end address, protection flags and raw bytes
  • optionally, we can also, using the same script, export all the symbol names and the corresponding addresses so we can later refer to symbols by name

The following image shows rebasing of the kernel extension. This functionality is accessible in IDA via the Edit->Segments->Rebase program… menu. When choosing the new base address, it is convenient to only change the high bits, which makes it easy to manually convert rebased addresses to original addresses and vice versa when needed. In the example below the image base was changed from 0xFFFFFE000714C470 to 0xAB0714C470.

screenshot with image base selected, with the value set at  0xAB0714C470, with both the fix up the program and rebase the whole image options selected

Figure 1: Rebasing the extension

The IDA script for exporting the data can be found here. You can run it using the following commands in IDA:

sys.path.append('/directory/containing/export/script')
import segexport
segexport.export('/path/to/output/file')
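For reference, here is a rough sketch of what such a segment-export script could look like. This is not the actual script linked above: the on-disk format here is made up for illustration, and the real script produces the format that the loader expects.

import pickle
import idautils, idc, ida_bytes

def export(path):
    segments = []
    for start in idautils.Segments():
        end = idc.get_segm_end(start)
        perm = idc.get_segm_attr(start, idc.SEGATTR_PERM)
        # get_bytes() can return None for uninitialized segments.
        data = ida_bytes.get_bytes(start, end - start) or b""
        segments.append((start, end, perm, data))
    # Also export symbol names so the loader can refer to functions by name.
    names = list(idautils.Names())
    with open(path, "wb") as f:
        pickle.dump({"segments": segments, "names": names}, f)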

Loading and running

Loading the exported data should now just be a matter of mapping memory at the correct addresses and copying the corresponding data from the exported file. You can see this in the load() function here.

However, since we are now loading and running kernel code in userspace, there will be functions that won't run well or that we would want to change. One example of this is the kernel allocator functions, which we'll want to replace with the system malloc.

One way of replacing these functions would be to rewrite the prolog of each function we want to replace with a jump to its replacement. However, since we will later be using TinyInst to extract code coverage, there is a simpler way. We will simply write a breakpoint instruction to each function we want to replace. Since TinyInst is (among other things) a debugger, it will catch each of these breakpoints and, from the TinyInst process, we can replace the instruction pointer with the address of the corresponding replacement function. More details on this can be found in the next section.
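As a rough illustration (not the actual loader code, which handles the details itself), overwriting the first instruction of a function with an AArch64 breakpoint might look like this:

#include <cstdint>
#include <cstring>

// Overwrite the first instruction of a function (mapped writable by the loader)
// with an AArch64 breakpoint (brk #0). TinyInst, acting as a debugger, catches
// the resulting exception and redirects execution to the replacement function.
static void patch_with_breakpoint(void *func) {
  uint32_t brk = 0xD4200000;  // encoding of "brk #0"
  std::memcpy(func, &brk, sizeof(brk));
  // Flush the instruction cache so the patched instruction takes effect.
  __builtin___clear_cache((char *)func, (char *)func + sizeof(brk));
}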

Besides replacing the memory allocation functions, logging functions etc., we will also need to replace all functions that interact with the hardware that we can’t access from userspace (or, in our case, that isn’t even present on the machine). In the case of AV1 parsing code in the AppleAVD kernel extension, a function called AppleAVD::sendCommandToCommandGate gets called, which I assume is meant to communicate with the decoding hardware. Thus, as a part of the harness, this function was replaced with a function that always returns 0 (success).

The final AV1 harness code can be found here. It can be compiled as

clang++ loader.cpp avdharness.cpp -oloader

and might need some additional entitlements to run, which can be applied with

codesign --entitlements entitlements.txt -f -s - ./loader

Note that, in the harness code, although I tried to rely on symbol names instead of hardcoded offsets wherever possible, it still contains some struct offsets. This version of the harness was based on the macOS 14.5 kernel, which was the most recent OS version at the time the loader was written.

Writing a custom TinyInst module

This section explains the custom TinyInst module that accompanies the loader (and is required for the correct functioning of the loader). This code doesn’t contain anything specific for a particular kernel extension and thus can be reused as is. If you are not interested in how it works or writing custom TinyInst modules, then you can skip this section.

Firstly, since we will want to extract code coverage for the purposes of fuzzing, we will base our custom module on LiteCov, the “default” TinyInst module for code coverage:

class AVDInst : public LiteCov {
  …
};

Secondly, we need a way for our custom loader to communicate with the TinyInst module:

  • It needs to tell TinyInst which functions in the kext should be replaced with which replacement functions.
  • It needs to tell TinyInst where the kext was loaded so that TinyInst can instrument it.

While TinyInst provides an API for function hooking that we could use here, there is also a more direct (albeit, also more low-level) way. From our loader, we will simply call a function at some hardcoded non-mapped address. This will, once again, cause an exception that TinyInst (being a debugger) will catch, read the parameters from registers, do the required action and “return” (by replacing the instruction pointer with the value inside the link register). The loader uses the hardcoded address 0x747265706C616365 to register a replacement and 0x747265706C616366 to tell TinyInst about the address range to instrument:

#define TINYINST_REGISTER_REPLACEMENT 0x747265706C616365
#define TINYINST_CUSTOM_INSTRUMENT 0x747265706C616366
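On the loader side, the "call" to these magic addresses might look something like the following sketch (the helper names are mine, not the actual harness code); it only works when running under the custom TinyInst module, which catches the fault and resumes execution at the link register:

#include <cstdint>

typedef void (*tinyinst_call_t)(uint64_t, uint64_t);

// Tell TinyInst to redirect calls from `original` to `replacement`.
// The arguments end up in X0/X1, which the module reads in its exception handler.
static void register_replacement(uint64_t original, uint64_t replacement) {
  ((tinyinst_call_t)TINYINST_REGISTER_REPLACEMENT)(original, replacement);
}

// Tell TinyInst which address range (the loaded kext) to instrument.
static void instrument_range(uint64_t min_address, uint64_t max_address) {
  ((tinyinst_call_t)TINYINST_CUSTOM_INSTRUMENT)(min_address, max_address);
}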

We can catch those in the exception handler of our custom module:

bool AVDInst::OnException(Exception *exception_record) {
  …
  if(exception_address == TINYINST_REGISTER_REPLACEMENT) {
    RegisterReplacementHook();
    return true;
  }
  if(exception_address == TINYINST_CUSTOM_INSTRUMENT) {
    InstrumentCustomRange();
    return true;
  }
  …
}

and then read the parameters and perform the required action:

void AVDInst::RegisterReplacementHook() {
  uint64_t original_address = GetRegister(X0);
  uint64_t replacement_address = GetRegister(X1);
  redirects[original_address] = replacement_address;
  SetRegister(ARCH_PC, GetRegister(LR));
}

void AVDInst::InstrumentCustomRange() {
  uint64_t min_address = GetRegister(X0);
  uint64_t max_address = GetRegister(X1);
  InstrumentAddressRange("__custom_range__", min_address, max_address);
  SetRegister(ARCH_PC, GetRegister(LR));
}

Where InstrumentAddressRange is a recently added TinyInst function that will instrument all code between addresses given in its parameters. “__custom_range__” is simply a name that we give to this region of memory so we can differentiate between multiple instrumented modules (if there are more than one).

Next, TinyInst needs to perform the actual function replacements. As explained above, this can be done in the exception handler of our module.

  auto iter = redirects.find(exception_address);
  if(iter != redirects.end()) {
    // printf("Redirecting...\n");
    SetRegister(ARCH_PC, iter->second);
    return true;
  }

This is mostly sufficient for running the kernel extension without instrumenting it. However, if we also want to instrument the extension (e.g. to collect coverage), the instrumentation process involves rewriting the extension code at another location and inserting, for example, additional instructions to record coverage. The consequence of this is that our breakpoint instructions (that we inserted for the purpose of redirects) will be rewritten at different addresses. We need to make TinyInst aware of this (as a side note, the TinyInst Hook API does this under the hood, but it wasn't used in this module). We can do this in the InstrumentInstruction function, which gets called for every instruction as it's being instrumented:

InstructionResult AVDInst::InstrumentInstruction(ModuleInfo *module,
                                                 Instruction& inst,
                                                 size_t bb_address,
                                                 size_t instruction_address)
{
  auto iter = redirects.find(instruction_address);
  if(iter != redirects.end()) {
    instrumented_redirects[assembler_->Breakpoint(module)] = iter->second;
    return INST_STOPBB;
  }
  return LiteCov::InstrumentInstruction(module, inst, bb_address, instruction_address);
}

The INST_STOPBB return value tells TinyInst to stop processing the current basic block. Since on breakpoints/redirects we redirect the execution to another function, no other instructions from the same basic block ever get executed, so they don't need to be instrumented.

After this, we now know the addresses of breakpoints (and the corresponding replacements) in both instrumented and non-instrumented code. The final exception handler of our custom module looks like this:

bool AVDInst::OnException(Exception *exception_record) {
  size_t exception_address;

  if(exception_record->type == BREAKPOINT)
  {
    exception_address = (size_t)exception_record->ip;
  } else if(exception_record->type == ACCESS_VIOLATION) {
    exception_address = (size_t)exception_record->access_address;
  } else {
    return LiteCov::OnException(exception_record);
  }

  if(exception_address == TINYINST_REGISTER_REPLACEMENT) {
    RegisterReplacementHook();
    return true;
  }

  if(exception_address == TINYINST_CUSTOM_INSTRUMENT) {
    InstrumentCustomRange();
    return true;
  }

  auto iter = redirects.find(exception_address);
  if(iter != redirects.end()) {
    // printf("Redirecting...\n");
    SetRegister(ARCH_PC, iter->second);
    return true;
  }

  iter = instrumented_redirects.find(exception_address);
  if(iter != instrumented_redirects.end()) {
    // printf("Redirecting...\n");
    SetRegister(ARCH_PC, iter->second);
    return true;
  }

  return LiteCov::OnException(exception_record);
}

The entire code, with all the housekeeping functions, can be found here.

Fuzzing and findings

Once our custom module is ready, we still need to make sure TinyInst and Jackalope will use this module instead of the default LiteCov module. See the appropriate patches for TinyInst and Jackalope.

Our harness should now run correctly under TinyInst, both without and with coverage instrumentation:

./Jackalope/build/TinyInst/Release/litecov -- ./loader avd_rebased.dat -f <sample>

Where avd_rebased.dat contains the kernel extension code exported from IDA. We can also add the -trace_basic_blocks flag to trace basic blocks as they are being executed (primarily useful for debugging). A fuzzing session with Jackalope can then be run like this:

./Jackalope/build/Release/fuzzer -in in -out out -t 1000 -delivery shmem -target_module loader -target_method __Z4fuzzPc -nargs 1 -iterations 5000 -persist -loop -cmp_coverage -mute_child -nthreads 6 -- ./loader avd_rebased.dat -m @@

This tells Jackalope to run in persistent mode (with the function "fuzz" being looped), with sample delivery over shared memory (the -delivery shmem fuzzer flag, with -m being implemented in the harness code).

Fuzzing is useful not only for finding bugs in the target, but in our case also for finding bugs in the harness, e.g. finding other kernel functions we need to replace in order for the target to work correctly.

After several iterations of fixups, the harness appeared to be working correctly. However, the fuzzer also caught some crashes that appeared to have been caused by genuine issues in the AV1 parsing code. I did a root cause analysis and reported the issues to Apple. The reports can be seen in the corresponding entries in the Project Zero issue tracker.

Unfortunately, at the time of reporting these issues I still didn't have access to a machine with AV1 decoding capabilities. Thus, instead of full end-to-end PoCs, the issues were reported in the form of a full root cause analysis and a binary stream that causes a crash when used as a parameter to a particular decoding function. Eventually, we did get a MacBook with an M3 chip that supports AV1 hardware decoding and tried to reproduce the reported issues. Unsurprisingly, all three issues reproduced exactly the same on the real hardware as in the userspace harness.

Conclusion

The goal of this project was to create userspace kernel extension fuzzing tooling that was as simple as possible, and one of the benefits of this simplicity is that it can easily be adapted to other pieces of kernel code. The process is versatile enough that it allowed us to fuzz AV1 parsing code that normally requires hardware we didn't even have at the time. While the three issues found during this research are not critical, they demonstrate the correctness of the approach and the potential for finding other issues.