Sunday, December 15, 2024

The Qualcomm DSP Driver - Unexpectedly Excavating an Exploit

Posted by Seth Jenkins, Google Project Zero

This blog post provides a technical analysis of exploit artifacts that originated from Amnesty International and were provided to us by Google's Threat Analysis Group (TAG). Amnesty’s report on these exploits is available here. Thanks to both Amnesty International and Google's Threat Analysis Group for providing the artifacts and collaborating on the subsequent technical analysis!

Introduction

Earlier this year, Google's TAG received some kernel panic logs generated by an In-the-Wild (ITW) exploit. Those logs kicked off a bug hunt that led to the discovery of 6 vulnerabilities in one Qualcomm driver over the course of 2.5 months, including one issue that TAG reported as ITW. This blog post covers the details of the original artifacts, each of the bugs discovered, and the hypothesized ITW exploit strategy gleaned from the logs.

Artifacts

Usually, when Project Zero/TAG have successfully reverse-engineered an ITW exploit, we have had access to the exploit sample itself, which makes determining the exploited vulnerability primarily a matter of time and effort. However, in this particular case, we received several kernel panic logs but unfortunately not the exploit sample. This meant we could not directly reproduce crashes or reverse engineer what bug was being exploited.

Determining which vulnerability an exploit uses from crash logs alone, without the exploit itself, can range in difficulty from plausible to impossible. I decided to give it a try and see what I could learn. Out of the 6 panics we received, 4 panics in particular contained potentially useful information:

Log 1:

[   47.223480] adsprpc: fastrpc_init_process: untrusted app trying to attach to privileged DSP PD

[   47.254494] adsprpc: mapping not found to unmap fd 0xffffffff, va 0xffffffffffffffff, len 0xffffffff

[   47.254512] adsprpc: falcon: fastrpc_internal_mmap: ERROR: adding user allocated pages is not supported

[   47.261488] adsprpc: mapping not found to unmap fd 0xa, va 0x0, len 0x0

...

[   50.865579] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000

[   50.865586] Mem abort info:

[   50.865590]   ESR = 0x96000006

[   50.865593]   Exception class = DABT (current EL), IL = 32 bits

[   50.865597]   SET = 0, FnV = 0

[   50.865600]   EA = 0, S1PTW = 0

[   50.865603] Data abort info:

[   50.865606]   ISV = 0, ISS = 0x00000006

[   50.865609]   CM = 0, WnR = 0

[   50.865614] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000f66703d3

[   50.865617] [0000000000000000] pgd=0000000213147003, pud=0000000213147003, pmd=0000000000000000

[   50.865624] Internal error: Oops: 96000006 [#1] PREEMPT SMP

...

[   50.865649] Process falcon (pid: 8909, stack limit = 0x000000000e91af69)

[   50.865654] CPU: 5 PID: 8909 Comm: falcon Tainted: G S      W  O      4.19.157-perf-g8779875ad741 #1

[   50.865657] Hardware name: Qualcomm Technologies, Inc. xiaomi apollo (DT)

[   50.865661] pstate: 00400005 (nzcv daif +PAN -UAO)

[   50.865669] pc : __list_del_entry_valid+0x34/0xd0

[   50.865672] lr : dma_buf_detach+0x34/0xa0

[   50.865675] sp : ffffff802c7bb990

...

[   50.865735] Call trace:

[   50.865739]  __list_del_entry_valid+0x34/0xd0

[   50.865742]  dma_buf_detach+0x34/0xa0

[   50.865746]  fastrpc_mmap_free+0x3e8/0x4d0

[   50.865749]  fastrpc_file_free+0x1a8/0x2e0

[   50.865753]  fastrpc_device_release+0x50/0x68

[   50.865757]  __fput+0xb8/0x1b0

[   50.865762]  ____fput+0xc/0x18

[   50.865764]  task_work_run+0x8c/0xb0

[   50.865767]  do_exit+0x3fc/0xa10

[   50.865770]  do_group_exit+0x8c/0xa0

[   50.865773]  get_signal+0x7c8/0x958

[   50.865778]  do_notify_resume+0x148/0x23e8

[   50.865781]  work_pending+0x8/0x10

[   50.865785] Code: f9400669 91040042 eb02013f 54000260 (f9400122)

[   50.865789] ---[ end trace 42c589b65f43d4ee ]---

[   50.865802] Kernel panic - not syncing: Fatal exception

We see right away from the first panic that the exploit appears to be targeting a driver called adsprpc. We also see from the stacktrace that the crash is happening when freeing a fastrpc_mmap struct - so it seems likely this is a heap exploit of some sort, and that a fastrpc_mmap struct is potentially involved.
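The ESR values printed in these panics can be unpacked by hand, which helps when comparing the logs against each other. The following is a minimal sketch, assuming the standard AArch64 ESR layout for data aborts; the struct and function names are made up for illustration and are not taken from the kernel.

```c
#include <stdint.h>

/* Fields of an AArch64 ESR value for a data abort (illustrative names). */
struct esr_fields {
    unsigned ec;    /* exception class, bits [31:26] */
    unsigned il;    /* instruction length bit, bit [25] */
    unsigned wnr;   /* write-not-read, bit [6] of the ISS */
    unsigned dfsc;  /* data fault status code, bits [5:0] of the ISS */
};

/* Split a raw ESR value into the fields the panic log prints. */
static struct esr_fields decode_esr(uint32_t esr)
{
    struct esr_fields f;

    f.ec   = (esr >> 26) & 0x3f;
    f.il   = (esr >> 25) & 0x1;
    f.wnr  = (esr >> 6) & 0x1;
    f.dfsc = esr & 0x3f;
    return f;
}

/* Names for the translation-fault codes that show up in these logs. */
static const char *dfsc_name(unsigned dfsc)
{
    switch (dfsc) {
    case 0x04: return "level 0 translation fault";
    case 0x05: return "level 1 translation fault";
    case 0x06: return "level 2 translation fault";
    case 0x07: return "level 3 translation fault";
    default:   return "other";
    }
}
```

Applied to Log 1, decode_esr(0x96000006) yields EC 0x25 (a data abort taken from the current EL), WnR 0 (a read), and a level 2 translation fault, which lines up with the pmd=0 entry in the page table dump.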

Log 2:

[   37.450199] adsprpc: fastrpc_init_process: untrusted app trying to attach to privileged DSP PD

[   37.482741] adsprpc: mapping not found to unmap fd 0xffffffff, va 0xffffffffffffffff, len 0xffffffff

[   37.482759] adsprpc: falcon: fastrpc_internal_mmap: ERROR: adding user allocated pages is not supported

[   37.486210] adsprpc: mapping not found to unmap fd 0xa, va 0x0, len 0x0

...

[   40.917577] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702834303, err:-44

...

[   41.970037] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702834303, err:-44

...

[   51.052781] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702834303, err:-44

...

[   73.964765] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702834303, err:-44

...

[   83.030394] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702834303, err:-44

...

[   86.358103] Unable to handle kernel paging request at virtual address 0035fb968c5d536d

[   86.358118] Mem abort info:

[   86.358122]   ESR = 0x96000044

[   86.358127]   Exception class = DABT (current EL), IL = 32 bits

[   86.358131]   SET = 0, FnV = 0

[   86.358135]   EA = 0, S1PTW = 0

[   86.358139] Data abort info:

[   86.358143]   ISV = 0, ISS = 0x00000044

[   86.358147]   CM = 0, WnR = 1

[   86.358151] [0035fb968c5d536d] address between user and kernel address ranges

[   86.358159] Internal error: Oops: 96000044 [#1] PREEMPT SMP

...

[   86.358221] Process falcon (pid: 7053, stack limit = 0x00000000a7dfa97f)

[   86.358230] CPU: 0 PID: 7053 Comm: falcon Tainted: G S         O      4.19.157-perf-g8779875ad741 #1

[   86.358235] Hardware name: Qualcomm Technologies, Inc. xiaomi apollo (DT)

[   86.358241] pstate: 60400005 (nZCv daif +PAN -UAO)

[   86.358259] pc : fastrpc_file_free+0x1c4/0x2e0

[   86.358264] lr : fastrpc_file_free+0x198/0x2e0

[   86.358268] sp : ffffff80264d3a50

...

[   86.358352] Call trace:

[   86.358359]  fastrpc_file_free+0x1c4/0x2e0

[   86.358364]  fastrpc_device_release+0x50/0x68

[   86.358374]  __fput+0xb8/0x1b0

[   86.358380]  ____fput+0xc/0x18

[   86.358387]  task_work_run+0x8c/0xb0

[   86.358394]  do_exit+0x3fc/0xa10

[   86.358399]  do_group_exit+0x8c/0xa0

[   86.358405]  get_signal+0x7c8/0x958

[   86.358412]  do_notify_resume+0x148/0x23e8

[   86.358418]  work_pending+0x8/0x10

[   86.358424] Code: b4ffff68 f9400009 f9000109 b4fffee9 (f9000528)

[   86.358430] ---[ end trace 9b01c55ca2d0bfea ]---

[   86.358452] Kernel panic - not syncing: Fatal exception

Here is another crash in the adsprpc driver, this time associated with a fastrpc_file struct. A fastrpc_file is attached to a struct file, which is itself the backing object referenced by a file descriptor. We also see that the exploit appears to have gotten farther in the exploit process this time and was making multiple calls to fastrpc_mmap_free. Notably the channel id is set to a very large value: 1702834303. Channel ids can’t usually be set this high. While the maximum value varies from version to version, valid channel ids are in the range from 0 to about 6, so it’s clear that the channel id (cid) has somehow been corrupted in memory. It is also notable that the channel id value is a Unix epoch timestamp - something Donncha of Amnesty noticed in the course of investigation. 1702834303 represents the date Sunday, December 17, 2023 5:31:43 PM (UTC), which is quite close to when the exploit was thrown…why could this be?
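The timestamp observation is easy to check. Below is a minimal sketch using only the standard C library; epoch_to_utc is a hypothetical helper name, and the day and month names assume the default "C" locale.

```c
#include <stddef.h>
#include <time.h>

/*
 * Format a Unix epoch value as a UTC date string.
 * Returns the number of bytes written, or 0 on failure.
 */
static size_t epoch_to_utc(long long epoch, char *buf, size_t len)
{
    time_t t = (time_t)epoch;
    struct tm *tm = gmtime(&t);  /* not thread-safe; fine for a one-off check */

    if (tm == NULL)
        return 0;
    return strftime(buf, len, "%A, %B %d, %Y %H:%M:%S UTC", tm);
}
```

Passing 1702834303 produces "Sunday, December 17, 2023 17:31:43 UTC". The corrupted channel id in Log 4 (1702832054) decodes to 16:54:14 UTC on the same day, about 37 minutes earlier.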

Log 3:

[ 2244.639158] adsprpc: ERROR: fastrpc_internal_mmap: user application falcon trying to map without initialization

...

[ 2244.641272] adsprpc: falcon: fastrpc_init_process: ERROR: donated memory allocated in userspace

[ 2244.683779] adsprpc: mapping not found to unmap fd 0xffffffff, va 0xffffffffffffffff, len 0xffffffff

[ 2244.683794] adsprpc: falcon: fastrpc_internal_mmap: ERROR: adding user allocated pages is not supported

[ 2244.689633] adsprpc: mapping not found to unmap fd 0x9, va 0x0, len 0x0

[ 2247.159424] Unable to handle kernel paging request at virtual address 006f7778a9cf5b88

[ 2247.159442] Mem abort info:

[ 2247.159446]   ESR = 0x96000004

[ 2247.159453]   Exception class = DABT (current EL), IL = 32 bits

[ 2247.159458]   SET = 0, FnV = 0

[ 2247.159462]   EA = 0, S1PTW = 0

[ 2247.159468] Data abort info:

[ 2247.159472]   ISV = 0, ISS = 0x00000004

[ 2247.159476]   CM = 0, WnR = 0

[ 2247.159481] [006f7778a9cf5b88] address between user and kernel address ranges

[ 2247.159489] Internal error: Oops: 96000004 [#1] PREEMPT SMP

...

[ 2247.159572] Process falcon (pid: 17512, stack limit = 0x00000000c911fea5)

[ 2247.159582] CPU: 0 PID: 17512 Comm: falcon Tainted: G S      W  O      4.19.157-perf-g8779875ad741 #1

[ 2247.159587] Hardware name: Qualcomm Technologies, Inc. xiaomi apollo (DT)

[ 2247.159595] pstate: 60400005 (nZCv daif +PAN -UAO)

[ 2247.159614] pc : __kmalloc+0x1c4/0x398

[ 2247.159619] lr : __kmalloc+0x60/0x398

[ 2247.159623] sp : ffffff8029173b80

...

[ 2247.159719] Call trace:

[ 2247.159727]  __kmalloc+0x1c4/0x398

[ 2247.159740]  inotify_handle_event+0xc8/0x1c8

[ 2247.159746]  fsnotify+0x270/0x378

[ 2247.159753]  __fsnotify_parent+0xdc/0x138

[ 2247.159763]  notify_change2+0x314/0x348

[ 2247.159771]  do_sys_ftruncate+0x190/0x1c0

[ 2247.159776]  __arm64_sys_ftruncate+0x1c/0x28

[ 2247.159786]  el0_svc_common+0x98/0x160

[ 2247.159792]  el0_svc_handler+0x68/0x80

[ 2247.159800]  el0_svc+0x8/0xc

[ 2247.159808] Code: b4000a77 b940230a f940bf0b 8b0a02ea (f940014c)

[ 2247.159815] ---[ end trace 3729c600fbf1ba28 ]---

[ 2247.159842] Kernel panic - not syncing: Fatal exception

Again we see adsprpc driver logs, but this time they culminate in a crash within the context of the exploit process in a different part of the code entirely: the inotify subsystem. We will revisit the importance of this crash later.

Log 4:

[   67.167510] adsprpc: fastrpc_init_process: untrusted app trying to attach to privileged DSP PD

[   67.202061] adsprpc: mapping not found to unmap fd 0xffffffff, va 0xffffffffffffffff, len 0xffffffff

[   67.202084] adsprpc: falcon: fastrpc_internal_mmap: ERROR: adding user allocated pages is not supported

[   67.207916] adsprpc: mapping not found to unmap fd 0xa, va 0x0, len 0x0

[   69.577152] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702832054, err:-44

[   70.621863] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702832054, err:-44

...

[   79.689300] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702832054, err:-44

...

[   97.574406] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000009

[   97.574435] Mem abort info:

[   97.574445]   ESR = 0x96000006

[   97.574457]   Exception class = DABT (current EL), IL = 32 bits

[   97.574467]   SET = 0, FnV = 0

[   97.574476]   EA = 0, S1PTW = 0

[   97.574485] Data abort info:

[   97.574495]   ISV = 0, ISS = 0x00000006

[   97.574504]   CM = 0, WnR = 0

[   97.574522] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000ad40fd5a

[   97.574532] [0000000000000009] pgd=00000001cadcb003, pud=00000001cadcb003, pmd=0000000000000000

[   97.574554] Internal error: Oops: 96000006 [#1] PREEMPT SMP

...

[   97.574680] Process falcon (pid: 10050, stack limit = 0x00000000eac9e565)

[   97.574700] CPU: 0 PID: 10050 Comm: falcon Tainted: G S         O      4.19.157-perf-g8779875ad741 #1

[   97.574711] Hardware name: Qualcomm Technologies, Inc. xiaomi apollo (DT)

[   97.574726] pstate: 00400005 (nzcv daif +PAN -UAO)

[   97.574756] pc : pipe_read+0xac/0x308

[   97.574769] lr : pipe_read+0x4c/0x308

[   97.574778] sp : ffffff802e43bc80

...

[   97.574982] Call trace:

[   97.574996]  pipe_read+0xac/0x308

[   97.575014]  __vfs_read+0xf8/0x140

[   97.575027]  vfs_read+0xb8/0x150

[   97.575039]  ksys_read+0x6c/0xd0

[   97.575053]  __arm64_sys_read+0x18/0x20

[   97.575071]  el0_svc_common+0x98/0x160

[   97.575083]  el0_svc_handler+0x68/0x80

[   97.575097]  el0_svc+0x8/0xc

[   97.575114] Code: aa1c03fb aa1c03e1 b840ce68 f8410f69 (f9400529)

[   97.575127] ---[ end trace 72c08623f6dedcd7 ]---

[   97.575174] Kernel panic - not syncing: Fatal exception

We see here a fourth type of crash, this time in the pipe subsystem. Pipe buffers are often used as a spray object for heap exploitation, and it wouldn’t be terribly surprising if that were the case here.

There are several valuable pieces of information to glean from the logs - the most meaningful being the usage of this adsprpc driver. We also see log lines from several adsprpc functions:

  • fastrpc_init_process
  • fastrpc_internal_munmap_fd
  • fastrpc_internal_mmap
  • fastrpc_mmap_free

It is likely the bug used by the attacker resides somewhere in the relationships between these functions, but exactly where is not clear from the logs alone. The functions that are executed by the exploit only serve to complicate the investigation, as the exploit seems to constantly hit very early bailouts that shouldn’t cause any change in kernel state whatsoever.

Upon additional investigation (by Jann Horn in particular!) it became clear that this driver is accessible from a spectrum of unprivileged contexts. untrusted_app does not have the ability to directly open the driver device file, but at least on some devices it can obtain limited access by receiving a file descriptor to the device file from the dspservice process through the IDspService HAL interface, which is reachable through hwbinder.

As we’ve seen before, third-party Android drivers are appealingly buggy attack surfaces, regularly containing a reservoir of potential vulnerabilities for attackers. While it wasn’t immediately clear what vulnerability the attackers had exploited, it was clear that performing a more thorough audit of this driver was warranted.

The adsprpc Driver

The Application Digital Signal Processor Remote Procedure Call driver (or adsprpc for short) is primarily used for offloading multimedia processing to a more efficient DSP co-processor core. The driver is primarily accessed through the /dev/adsprpc-smd character device file, although historically it could be reached via a variety of device files including cdsprpc-smd and mdsprpc-smd. The driver’s architecture is thoroughly described in the kernel documentation. Through this driver, co-processor routines are exposed to the application processor userland (including untrusted_app processes) via an RPC interface, providing an efficient abstraction by which multimedia processing can be offloaded to the specialized hardware in the SoC. By using DMA buffers that are mapped directly onto the DSP core, adsprpc aims to minimize the amount of data copied across the processor boundary. This feature set is necessarily complex, and that makes it a ripe target for in-depth security research.

The Bughunt Begins

Having given up on identifying the exploited bug directly from the ITW logs, I turned to a broader code review of the driver. This turned out to be a very productive research decision. Jann found the first bug quite quickly, and over the course of the next several months, I found 5 more. I’ve described each of the bugs below.

CVE-2024-38402: refcount leak leading to UAF in fastrpc_get_process_gids

The first discovered vulnerability in the driver is a refcount leak of the group_info struct associated with the task that leads to a UAF. In the function fastrpc_get_process_gids, get_current_groups is called which increments a refcount on the group_info struct, but that refcount is never dropped. Furthermore, this refcount is a non-saturating refcount which makes it possible to overflow if you execute fastrpc_get_process_gids approximately 2^32 times. This is a difficult bug to exploit in practice, taking at least 14 hours to trigger, but it’s nevertheless memory corruption with all the associated consequences. An example crash from this issue is below:

[77306.174599] [7:           adbd: 5455] BUG: KFENCE: invalid read in groups_to_user+0x34/0x1a4 

 

[77306.174606] [7:           adbd: 5455] Invalid read at 0xffffff89572a0000: 

[77306.174607] [7:           adbd: 5455]  groups_to_user+0x34/0x1a4 

[77306.174609] [7:           adbd: 5455]  invoke_syscall+0x58/0x13c 

[77306.174612] [7:           adbd: 5455]  el0_svc_common+0xb4/0xf0 

[77306.174614] [7:           adbd: 5455]  do_el0_svc+0x24/0x90 

[77306.174615] [7:           adbd: 5455]  el0_svc+0x20/0x7c 

[77306.174617] [7:           adbd: 5455]  el0t_64_sync_handler+0x84/0xe4 

[77306.174618] [7:           adbd: 5455]  el0t_64_sync+0x1b8/0x1bc 

[77306.174620] [7:           adbd: 5455]   

[77306.174621] [7:           adbd: 5455] CPU: 7 PID: 5455 Comm: adbd Tainted: G S      W  OE     5.15.123-android13-8-28577312-abS911BXXU3CXD3 #1 

[77306.174623] [7:           adbd: 5455] Hardware name: Samsung DM1Q PROJECT (board-id,13) (DT) 

[77306.174624] [7:           adbd: 5455] pstate: 22400005 (nzCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--) 

[77306.174626] [7:           adbd: 5455] pc : groups_to_user+0x34/0x1a4 

[77306.174628] [7:           adbd: 5455] lr : __arm64_sys_getgroups+0x4c/0x6c 

[77306.174629] [7:           adbd: 5455] sp : ffffffc02a073e00 

[77306.174630] [7:           adbd: 5455] x29: ffffffc02a073e00 x28: ffffff8868030040 x27: 0000000000000000 

[77306.174633] [7:           adbd: 5455] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 

[77306.174635] [7:           adbd: 5455] x23: 0000000060001000 x22: 00000077219a4f2c x21: ffffff8868030040 

[77306.174637] [7:           adbd: 5455] x20: ffffffc0081bd46c x19: 000000006b6b6b6b x18: ffffffc016089000 

[77306.174638] [7:           adbd: 5455] x17: 000000000000fffe x16: b4000074e392f638 x15: ffffff895729fff8 

[77306.174640] [7:           adbd: 5455] x14: 00000000524e68f8 x13: ffffff8868030040 x12: ffffffc00ad82000 

[77306.174641] [7:           adbd: 5455] x11: 0000007fffffffff x10: ffffffc00aa93000 x9 : 0000000014939a3e 

[77306.174643] [7:           adbd: 5455] x8 : 000000006b6b6b6b x7 : 0000000000000000 x6 : 0000000000000000 

[77306.174645] [7:           adbd: 5455] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000 

[77306.174646] [7:           adbd: 5455] x2 : 0000000000000040 x1 : ffffff8904db9700 x0 : b400007491448d40 

[77306.174648] [7:           adbd: 5455] Call trace: 

[77306.174648] [7:           adbd: 5455]  groups_to_user+0x34/0x1a4 

[77306.174650] [7:           adbd: 5455]  invoke_syscall+0x58/0x13c 

[77306.174651] [7:           adbd: 5455]  el0_svc_common+0xb4/0xf0 

[77306.174653] [7:           adbd: 5455]  do_el0_svc+0x24/0x90 

[77306.174654] [7:           adbd: 5455]  el0_svc+0x20/0x7c 

[77306.174655] [7:           adbd: 5455]  el0t_64_sync_handler+0x84/0xe4 

[77306.174656] [7:           adbd: 5455]  el0t_64_sync+0x1b8/0x1bc 
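The overflow can be modelled in a few lines. This is a toy sketch, not the kernel's group_info code: the struct, the helper names, and the starting count of 1 are all assumptions made for illustration. It shows a counter that wraps silently, so that enough leaked references bring it back to zero and a later balanced get/put pair releases an object that is still in use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for an object with a non-saturating 32-bit refcount. */
struct toy_group_info {
    uint32_t usage;
    bool freed;
};

static void toy_get(struct toy_group_info *gi)
{
    gi->usage++;               /* unsigned arithmetic wraps silently at 2^32 */
}

static void toy_put(struct toy_group_info *gi)
{
    if (--gi->usage == 0)
        gi->freed = true;      /* last reference dropped: object is released */
}

/* Models n calls that each take a reference and never drop it. */
static void toy_leak(struct toy_group_info *gi, uint64_t n)
{
    gi->usage += (uint32_t)n;  /* same result as n wrapping increments */
}

/* Calls per second needed to wrap a 32-bit counter within the given hours. */
static double wrap_rate_per_second(double hours)
{
    return 4294967296.0 / (hours * 3600.0);
}
```

Starting from a count of 1, leaking 2^32 - 1 references wraps the counter to 0; the next ordinary get/put pair then marks the object freed even though the original holder still points at it. For scale, wrapping a 32-bit counter in 14 hours requires roughly 85,000 leaking calls per second.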

CVE-2024-21455: is_compat flag leads to access of userland provided addresses as kernel pointers

In order to support 32-bit userland processes, 64-bit kernels contain a “compatibility layer” that ioctls can support. This layer is responsible for marshaling over 32-bit structs into their 64-bit equivalents which involves upcasting 32-bit userland pointers into 64-bit pointers. The adsprpc driver handles this case in the adsprpc_compat.c file. It allocates kernel memory, copies and converts the 32-bit struct into that kernel memory as a 64-bit struct, and then calls the 64-bit ioctl interface. The 64-bit ioctl interface thus needs to handle calls coming from both the 32-bit kernel compatibility layer and from 64-bit userland. In order to provide this support, the 32-bit compatibility layer indicates to the broader adsprpc driver that the 32-bit compatibility layer is in use by setting a flag is_compat in the file-descriptor-bound fl struct.

long compat_fastrpc_device_ioctl(struct file *filp, unsigned int cmd,

                                unsigned long arg)

{

        int err = 0;

        struct fastrpc_file *fl = (struct fastrpc_file *)filp->private_data;

        if (!filp->f_op || !filp->f_op->unlocked_ioctl)

                return -ENOTTY;

        fl->is_compat = true;

...

}

Later on, that is_compat flag is used in calls to K_COPY_FROM_USER to make decisions about whether to use memmove (32-bit compatibility layer or other kernel invocation) or copy_from_user.

#define K_COPY_FROM_USER(err, kernel, dst, src, size) \

        do {\

                if (!(kernel))\

                        err = copy_from_user((dst),\

                        (void const __user *)(src),\

                        (size));\

                else\

                        memmove((dst), (src), (size));\

        } while (0)

...

int fastrpc_internal_invoke2(struct fastrpc_file *fl,

                                struct fastrpc_ioctl_invoke2 *inv2)

{

        switch (inv2->req) {

        case FASTRPC_INVOKE2_ASYNC:

                ...

                K_COPY_FROM_USER(err, fl->is_compat, &p.inv3, (void*)inv2->invparam, sizeof(struct fastrpc_ioctl_invoke_async_no_perf));

        ...

}

However, this flag is stored on the fl struct rather than per call, so any other ioctl call on the same file descriptor will see that it is set. Furthermore, once the flag is set, it is never cleared. Consider the following scenario:

  1. A malicious 64-bit process A opens the adsprpc-smd file, creating a new adsprpc file descriptor
  2. Process A forks and creates a new 32-bit process B (A and B share the adsprpc fd/fl)
  3. Process B invokes the 32-bit ioctl interface (thus setting the is_compat flag) and exits
  4. Process A invokes the 64-bit ioctl interface

In this circumstance, the driver incorrectly thinks that the request is coming from the 32-bit compatibility layer (since is_compat is set) and that it should access the struct as if it contained kernel pointers while it is in fact a request coming from 64-bit userland containing untrusted userland-provided pointers (which could in a malicious case be kernel pointers!). The kernel will subsequently use an unsafe memmove to access these pointers, leading to userland controlled reads of kernel addresses.

[49468.514358] Unable to handle kernel paging request at virtual address ffffffff41414141

[49468.514397] PC Code: d65f03c0 d503201f (a9401c26) a9412428

[49468.514407] LR Code: 340003a8 957474dd (f9401be8) f90023e8

[49468.514413] Mem abort info:

[49468.514418] ESR = 0x96000005

[49468.514426] EC = 0x25: DABT (current EL), IL = 32 bits

[49468.514433] SET = 0, FnV = 0

[49468.514440] EA = 0, S1PTW = 0

[49468.514445] FSC = 0x05: level 1 translation fault

[49468.514452] Data abort info:

[49468.514456] ISV = 0, ISS = 0x00000005

[49468.514463] CM = 0, WnR = 0

[49468.514469] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000aa728000

[49468.514479] [ffffffff41414141] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000

[49468.514502] Internal error: Oops: 96000005 [#1] PREEMPT SMP

[49468.514780] sec_arm64_ap_context:sec_arm64_ap_context_on_die() context saved (CPU:6)

[49468.514788] Modules linked in: [...]

[49468.516301] CPU: 6 PID: 17448 Comm: poc_compat_pare Tainted: G S      W  OE     5.15.123-android13-8-28577312-abS911BXXU3CXD3 #1

[49468.516312] Hardware name: Samsung DM1Q PROJECT (board-id,13) (DT)

[49468.516317] pstate: 22400005 (nzCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)

[49468.516326] pc : __memcpy+0x90/0x250

[49468.516339] lr : fastrpc_internal_invoke2+0x308/0x408 [frpc_adsprpc]

[49468.516471] sp : ffffffc02fe33ba0

[49468.516475] x29: ffffffc02fe33cb0 x28: ffffff894341bb80 x27: 0000000000000000

[49468.516489] x26: 0000000000000000 x25: 0000000000000000 x24: ffffff891fe6c5c0

[49468.516499] x23: 00000000ffffffe7 x22: ffffff8939eff010 x21: 00000000c0185212

[49468.516510] x20: 0000007fe5890e40 x19: ffffff8939eff000 x18: ffffffc017b37010

[49468.516520] x17: 0000000000000000 x16: 0000000000000000 x15: 0000007fe5890e40

[49468.516530] x14: ffffff80011f6480 x13: 0000000000000000 x12: ffffff8034b27ce8

[49468.516540] x11: 0000000000000010 x10: ffffffc002443654 x9 : 0000000000000008

[49468.516549] x8 : 0000000000000001 x7 : 0000000000000001 x6 : ffffffc02fe33d30

[49468.516559] x5 : ffffffc02fe33bd8 x4 : ffffffff41414171 x3 : 0000000000000008

[49468.516568] x2 : 0000000000000030 x1 : ffffffff41414141 x0 : ffffffc02fe33ba8

[49468.516577] Call trace:

[49468.516582] __memcpy+0x90/0x250

[49468.516593] fastrpc_device_ioctl+0x1b0/0x92c [frpc_adsprpc]

[49468.516716] __arm64_sys_ioctl+0x120/0x170

[49468.516734] invoke_syscall+0x58/0x13c

[49468.516745] el0_svc_common+0xb4/0xf0

[49468.516753] do_el0_svc+0x24/0x90

[49468.516760] el0_svc+0x20/0x7c

[49468.516771] el0t_64_sync_handler+0x84/0xe4

[49468.516778] el0t_64_sync+0x1b8/0x1bc

[49468.516790] Code: 382e6808 381ff0aa d65f03c0 d503201f (a9401c26)

[49468.516799] ---[ end trace 9c87e0f40bf8f469 ]---

This could plausibly be elevated directly into an arbitrary kernel read primitive.
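The sticky-flag problem in the scenario above can be reproduced in a small user-space model. Everything here is a toy: toy_file, the enum, and the function names are hypothetical stand-ins, and toy_ioctl_percall is only one possible alternative shape, not a description of the actual fix.

```c
#include <stdbool.h>

/* Toy stand-in for the per-open-file state shared by every holder of the fd. */
struct toy_file {
    bool is_compat;
};

enum copy_path { COPY_FROM_USER, KERNEL_MEMMOVE };

/* Mirrors the decision K_COPY_FROM_USER makes from its "kernel" argument. */
static enum copy_path choose_copy(bool kernel)
{
    return kernel ? KERNEL_MEMMOVE : COPY_FROM_USER;
}

/* 64-bit entry point: trusts whatever the shared flag currently says. */
static enum copy_path toy_ioctl(struct toy_file *fl)
{
    return choose_copy(fl->is_compat);
}

/* 32-bit entry point: sets the flag on the shared object and never clears it. */
static enum copy_path toy_compat_ioctl(struct toy_file *fl)
{
    fl->is_compat = true;
    return toy_ioctl(fl);
}

/* Hypothetical alternative: the caller's origin travels with the call itself. */
static enum copy_path toy_ioctl_percall(struct toy_file *fl, bool from_compat)
{
    (void)fl;  /* no shared state is consulted */
    return choose_copy(from_compat);
}
```

After a single toy_compat_ioctl call, every later toy_ioctl call on the same toy_file takes the memmove path, which is the state process A exploits in step 4. When the origin is passed per call, a 64-bit caller always lands on the copy_from_user path regardless of what earlier callers did.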

Understanding the fastrpc_mmap struct

The adsprpc driver maintains some internal bookkeeping on what DMA buffers are mapped onto the co-processor via fastrpc_mmap structs. These structs are allocated and initialized in fastrpc_mmap_create and contain several characteristics that substantially complicate management of these objects. They can sit on either a linked list local to an fl (the driver state associated with a struct file) or a global linked list, depending on the flags used on creation. They have two separate refcounts, refs and ctx_refs, and they can be referenced from multiple places at once, including a context (an object that tracks data associated with a single RPC call), the global or local map lists, and of course transient stack-based references when being created or destroyed. A reference on an existing map is taken when fastrpc_mmap_create is called with a set of parameters that are fulfilled by an existing mapping.

fastrpc_mmap objects can be created via several different codepaths. This includes during context initialization, by accessing two different dedicated ioctl handlers, during DSP initialization and process creation, and even by a request from the DSP itself. fastrpc_mmap objects can be freed through a reciprocal set of codepaths such as context or struct file teardown, and via three different dedicated unmapping ioctls. We will examine several of these creation and destruction codepaths as I discuss three discovered bugs involving use-after-free of a fastrpc_mmap struct.

CVE-2024-33060: UAF race of global maps in fastrpc_mmap_create (and epilogue functions)

It is important to have a good understanding of the locking utilized for protecting these map lists from races in order to understand the first bug found:

int fastrpc_internal_mem_map(struct fastrpc_file *fl,

                                struct fastrpc_ioctl_mem_map *ud)

{

        int err = 0;

        struct fastrpc_mmap *map = NULL;

        mutex_lock(&fl->internal_map_mutex);

        ...

        mutex_lock(&fl->map_mutex);

        VERIFY(err, !(err = fastrpc_mmap_create(fl, ud->m.fd, NULL, ud->m.attrs,

                        ud->m.vaddrin, ud->m.length,

                         ud->m.flags, &map)));

        mutex_unlock(&fl->map_mutex);

        if (err)

                goto bail;

        ...

        //[1] map may already be globally visible

        VERIFY(err, !(err = fastrpc_mem_map_to_dsp(fl, ud->m.fd, ud->m.offset,

                        ud->m.flags, map->va, map->phys, map->size, &map->raddr)));

        if (err)

                goto bail;

        ud->m.vaddrout = map->raddr;

bail:

        if (err) {

                if (map) {

                        mutex_lock(&fl->map_mutex);

                        fastrpc_mmap_free(map, 0);

                        mutex_unlock(&fl->map_mutex);

                }

        }

        mutex_unlock(&fl->internal_map_mutex);

        return err;

}

Two mutexes are used here: fl->map_mutex and fl->internal_map_mutex. fl->internal_map_mutex is held for the entire function, including the error bailout, while fl->map_mutex is held only around the calls to fastrpc_mmap_create and fastrpc_mmap_free. Both of these mutexes are bound to the fl struct, which is itself bound to the struct file associated with a file descriptor. These mutexes prevent concurrency for e.g. multiple fastrpc_internal_mem_map calls on the same file descriptor, but do not prevent concurrency with other fl structs (e.g. from a second open’d adsprpc-smd file descriptor) being utilized in the same ioctls. As these ioctls are often used to administer fl struct local maps, global concurrency is often okay. However, fastrpc_mmap_create and fastrpc_internal_mem_map can create global maps too. And for global maps that are added to global structures, a global lock is only briefly taken in fastrpc_internal_mem_map -> fastrpc_mmap_create -> fastrpc_mmap_add:

static void fastrpc_mmap_add(struct fastrpc_mmap *map)

{

        if (map->flags == ADSP_MMAP_HEAP_ADDR ||

                                map->flags == ADSP_MMAP_REMOTE_HEAP_ADDR) {

                struct fastrpc_apps *me = &gfa; //gfa is a global struct

                unsigned long irq_flags = 0;

                spin_lock_irqsave(&me->hlock, irq_flags); //Taken here

                hlist_add_head(&map->hn, &me->maps);

                spin_unlock_irqrestore(&me->hlock, irq_flags); //Dropped here

        } else {

                struct fastrpc_file *fl = map->fl;

                hlist_add_head(&map->hn, &fl->maps);

        }

}

This means that the global locking is not enough to prevent two concurrent invocations of the fastrpc mapping ioctl calls, even if a global map is being created. This itself is not a bug, but even after insertion of a global map onto the global map list, fastrpc_internal_mem_map continues to access the map even though it should really be considering its reference to the global map “consumed”. Crucially, if a global map is created and added to the global list, it is immediately visible to concurrent callers - a thread calling fastrpc_internal_munmap with a different adsprpc fd / fl struct could destroy the global map while fastrpc_internal_mem_map continues to use it (see [1] in the above code sample from fastrpc_internal_mem_map). This is an example of continuing to use a transient reference after transferring that reference to a data structure where the lifetimes of the residing objects are userland-managed. Another example would be using a struct file object after installing the only reference in a file descriptor table (via fd_install), at which point userland can drop the reference using the close(2) syscall.
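The pattern described above, and one common way to avoid it, can be sketched with a toy refcounted object. This is a single-threaded simulation with hypothetical names; the racing unmap is invoked inline at the point where another thread could run, and the "pinned" variant illustrates a general technique rather than the change Qualcomm actually shipped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy map: "freed" is a flag so the simulation never touches freed memory. */
struct toy_map {
    int refs;
    bool freed;
    uint64_t raddr;
};

/* Single-slot stand-in for the global map list. */
struct toy_list {
    struct toy_map *head;
};

static void toy_map_put(struct toy_map *m)
{
    if (--m->refs == 0)
        m->freed = true;
}

/* Publishing hands the creator's only reference over to the list. */
static void toy_publish(struct toy_list *l, struct toy_map *m)
{
    l->head = m;
}

/* What another thread can do the moment the map is visible. */
static void toy_concurrent_unmap(struct toy_list *l)
{
    struct toy_map *m = l->head;

    l->head = NULL;
    if (m)
        toy_map_put(m);
}

/* Buggy ordering: returns true if the map died while the creator still uses it. */
static bool toy_create_unpinned(struct toy_list *l, struct toy_map *m)
{
    toy_publish(l, m);
    toy_concurrent_unmap(l);   /* simulated racing unmap */
    return m->freed;           /* the creator's pointer is now dangling */
}

/* Safer ordering: take a private reference before publishing. */
static bool toy_create_pinned(struct toy_list *l, struct toy_map *m)
{
    bool alive;

    m->refs++;                 /* private reference for the remaining work */
    toy_publish(l, m);
    toy_concurrent_unmap(l);   /* simulated racing unmap */
    alive = !m->freed;         /* map survives because of the private ref */
    m->raddr = 0x1000;         /* placeholder for the post-publish work */
    toy_map_put(m);            /* drop the private reference when done */
    return alive;
}
```

In the unpinned variant the list's reference is the only one, so the simulated unmap releases the map while the creator still holds a pointer to it. In the pinned variant the map stays alive until the creator drops its own reference.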

Triggering this bug generates the following kernel panic with PAGE_POISON enabled:

[ 2890.558370] [0:            poc:22189] Unable to handle kernel paging request at virtual address 006b6b6b6b6b6b83

[ 2890.558411] [0:            poc:22189] PC Code: 95ca6fb3 aa1703e0 2a1f03e1 97ffdbcc 2a1f03f6 14000008 f9400ae8 (f8418d09) f90002e9 b4000049 f9000537 f9000117 f90006e8 aa1403e0 95ca66a2 aa1303e0 95ca66a0 d5384108 f942f108 f94007e9

[ 2890.558618] [0:            poc:22189] LR Code: 94000075 2a0003f6 aa1403e0 95ca66ed f94003f7 340006f6 b4000937 aa1403e0 95ca6feb (b94026e8) 7100211f 54000060 7100111f 54000721 b00000f8 91038318 91008315 aa1503e0 95ca97d3 f9400308

[ 2890.558633] [0:            poc:22189] Mem abort info:

[ 2890.558641] [0:            poc:22189]   ESR = 0x96000004

[ 2890.558650] [0:            poc:22189]   EC = 0x25: DABT (current EL), IL = 32 bits

[ 2890.558661] [0:            poc:22189]   SET = 0, FnV = 0

[ 2890.558670] [0:            poc:22189]   EA = 0, S1PTW = 0

[ 2890.558678] [0:            poc:22189]   FSC = 0x04: level 0 translation fault

[ 2890.558688] [0:            poc:22189] Data abort info:

[ 2890.558696] [0:            poc:22189]   ISV = 0, ISS = 0x00000004

[ 2890.558704] [0:            poc:22189]   CM = 0, WnR = 0

[ 2890.558713] [0:            poc:22189] [006b6b6b6b6b6b83] address between user and kernel address ranges

[ 2890.558727] [0:            poc:22189] Internal error: Oops: 96000004 [#1] PREEMPT SMP

[ 2890.559162] [0:            poc:22189] sec_arm64_ap_context:sec_arm64_ap_context_on_die() context saved (CPU:0)

...

[ 2890.560996] [0:            poc:22189] CPU: 0 PID: 22189 Comm: poc Tainted: G S      W  OE     5.15.123-android13-8-28577312-abS911BXXU3CXD3 #1

[ 2890.561007] [0:            poc:22189] Hardware name: Samsung DM1Q PROJECT (board-id,13) (DT)

[ 2890.561014] [0:            poc:22189] pstate: 22400005 (nzCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)

[ 2890.561024] [0:            poc:22189] pc : fastrpc_internal_munmap+0x1ac/0x264 [frpc_adsprpc]

[ 2890.561202] [0:            poc:22189] lr : fastrpc_internal_munmap+0xb4/0x264 [frpc_adsprpc]

[ 2890.561376] [0:            poc:22189] sp : ffffffc025ee3cc0

[ 2890.561382] [0:            poc:22189] x29: ffffffc025ee3cd0 x28: ffffff88bf4fbb80 x27: 0000000000000000

[ 2890.561397] [0:            poc:22189] x26: 0000000000000000 x25: 0000000000000000 x24: ffffff8922ae4301

[ 2890.561411] [0:            poc:22189] x23: ffffff803bb30900 x22: 0000000080000448 x21: ffffff8928fb5800

[ 2890.561424] [0:            poc:22189] x20: ffffff8928fb5910 x19: ffffff8928fb5940 x18: ffffffc00b492010

[ 2890.561437] [0:            poc:22189] x17: 00000000000003e7 x16: 0000000000007e00 x15: 0000000000000600

[ 2890.561450] [0:            poc:22189] x14: ffffff891cc57e00 x13: dee89d8ccc1e57a7 x12: 088000400811164c

[ 2890.561463] [0:            poc:22189] x11: ffffff891cc51a00 x10: ffffff88bf4fbb80 x9 : 0000000000000000

[ 2890.561476] [0:            poc:22189] x8 : 6b6b6b6b6b6b6b6b x7 : bbbbbbbbbbbbbbbb x6 : 00000000000000c0

[ 2890.561489] [0:            poc:22189] x5 : 0000000000150009 x4 : ffffff891cc57400 x3 : 000000000015000a

[ 2890.561502] [0:            poc:22189] x2 : ffffff88bf4fbb80 x1 : 0000000000000000 x0 : 0000000000000000

[ 2890.561516] [0:            poc:22189] Call trace:

[ 2890.561523] [0:            poc:22189]  fastrpc_internal_munmap+0x1ac/0x264 [frpc_adsprpc]

[ 2890.561696] [0:            poc:22189]  fastrpc_device_ioctl+0x7e8/0x92c [frpc_adsprpc]

[ 2890.561867] [0:            poc:22189]  __arm64_sys_ioctl+0x120/0x170

[ 2890.561886] [0:            poc:22189]  invoke_syscall+0x58/0x13c

[ 2890.561899] [0:            poc:22189]  el0_svc_common+0xb4/0xf0

[ 2890.561908] [0:            poc:22189]  do_el0_svc+0x24/0x90

[ 2890.561917] [0:            poc:22189]  el0_svc+0x20/0x7c

[ 2890.561929] [0:            poc:22189]  el0t_64_sync_handler+0x84/0xe4

[ 2890.561937] [0:            poc:22189]  el0t_64_sync+0x1b8/0x1bc

[ 2890.561951] [0:            poc:22189] Code: 97ffdbcc 2a1f03f6 14000008 f9400ae8 (f8418d09)

[ 2890.561967] [0:            poc:22189] ---[ end trace af6bd4fc06724258 ]---

[ 2890.561978] [0:            poc:22189] Kernel panic - not syncing: Oops: Fatal exception
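The "Mem abort info" lines in the panic above can be re-derived from the raw ESR value, and the fault address lines up with the 0x6b fill pattern visible in x8. The following is a minimal sketch, not kernel code: the names esr_fields, decode_esr, fault_addr and check_against_log are hypothetical. The 0x18 offset assumes the faulting instruction (f8418d09) is a load from x8 + 0x18, and clearing the top byte of the address is an assumption made to match how the log prints it.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an AArch64 ESR_ELx value for a data abort. */
struct esr_fields {
    unsigned ec;    /* exception class, bits [31:26] */
    unsigned il;    /* instruction length bit, bit 25 */
    uint32_t iss;   /* instruction specific syndrome, bits [24:0] */
    unsigned fsc;   /* fault status code, ISS bits [5:0] */
    unsigned wnr;   /* write-not-read, ISS bit 6 */
};

struct esr_fields decode_esr(uint32_t esr)
{
    struct esr_fields f;

    f.ec  = (esr >> 26) & 0x3f;
    f.il  = (esr >> 25) & 0x1;
    f.iss = esr & 0x1ffffff;
    f.fsc = f.iss & 0x3f;
    f.wnr = (f.iss >> 6) & 0x1;
    return f;
}

/*
 * Address touched by a load at base + offset. The top byte is cleared to
 * match the way the log prints the address (assumption of this sketch).
 */
uint64_t fault_addr(uint64_t base, uint64_t offset)
{
    return (base + offset) & 0x00ffffffffffffffULL;
}

/* Compare against the values printed in the panic above. */
void check_against_log(void)
{
    struct esr_fields f = decode_esr(0x96000004u);

    assert(f.ec == 0x25);   /* DABT (current EL) */
    assert(f.fsc == 0x04);  /* level 0 translation fault */
    assert(f.wnr == 0);     /* the faulting access was a read */
    /* x8 held 0x6b6b6b6b6b6b6b6b when the load faulted */
    assert(fault_addr(0x6b6b6b6b6b6b6b6bULL, 0x18) == 0x006b6b6b6b6b6b83ULL);
}
```

Calling check_against_log() passes silently when the decoded fields agree with the log, which is consistent with the crash being a read through a pointer loaded from poisoned (already freed) memory.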

PZ Issue 42451713 (fixed with CVE-2024-33060): Incorrect searching algorithm in fastrpc_mmap_find leads to kernel address space info leak

The fastrpc_mmap_create function calls fastrpc_mmap_find with attacker-controlled arguments in order to identify existing maps that already fulfill the map creation request. If an existing map fulfills the request, it simply takes a refcount on that map and returns (this is an important aspect of CVE-2024-49848 too, as we’ll see in the next section!). In the case of global maps, the search performs the following:

hlist_for_each_entry_safe(map, n, &me->maps, hn) { 

    if (va >= map->va &&  //Is the userland provided va and len in the range of the map?

    va + len <= map->va + map->len && 

    map->fd == fd) { //And is the fd the same?

        if (refs) { 

            if (map->refs + 1 == INT_MAX) { 

                 spin_unlock_irqrestore(&me->hlock, irq_flags); 

                 return -ETOOMANYREFS; 

            } 

           map->refs++; 

        } 

        match = map; 

        break; 

    } 

}

While this code makes sense for fl-local maps, where map->va is set to a userland-provided value, in the case of global maps map->va is set to a kernel struct page pointer that serves as an opaque handle for the allocated memory. Consequently, via fastrpc_internal_mem_map, an attacker can cause a userland-provided value to be compared against a kernel struct page pointer. Furthermore, the ioctl return value can differ based on whether the comparison returns true or false, allowing an attacker to brute force the page pointer addresses associated with a fastrpc_map object.
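To illustrate why a yes/no answer from this comparison is enough to recover the hidden value, here is a toy userland model. It is only a sketch under stated assumptions: toy_global_map, toy_oracle and toy_leak_va are hypothetical names, the oracle stands in for the success or failure of the ioctl, and the attacker is assumed to know the map length and a window of addresses that contains the map. It does not talk to the real driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a global map; va plays the role of the secret page pointer. */
struct toy_global_map {
    uint64_t va;
    uint64_t len;
    int fd;
};

/* Same three-part test as the loop above, reduced to a yes/no oracle. */
bool toy_oracle(const struct toy_global_map *map, int fd,
                uint64_t va, uint64_t len)
{
    return va >= map->va && va + len <= map->va + map->len && map->fd == fd;
}

/*
 * Recover map->va from oracle answers alone, given a search window [lo, hi]
 * that contains the map and the map length.
 */
uint64_t toy_leak_va(const struct toy_global_map *map, int fd,
                     uint64_t lo, uint64_t hi, uint64_t map_len)
{
    uint64_t probe, low, high;

    assert(map_len > 0 && lo <= hi);

    /* Coarse scan: a zero-length probe every map_len bytes must land inside. */
    for (probe = lo; probe <= hi; probe += map_len)
        if (toy_oracle(map, fd, probe, 0))
            break;
    assert(probe <= hi);      /* otherwise the window did not contain the map */
    if (probe == lo)
        return lo;            /* map starts at (or before) the window start */

    /* Fine search: lowest address in (probe - map_len, probe] that matches. */
    low = probe - map_len;    /* known miss */
    high = probe;             /* known hit */
    while (high - low > 1) {
        uint64_t mid = low + (high - low) / 2;

        if (toy_oracle(map, fd, mid, 0))
            high = mid;
        else
            low = mid;
    }
    return high;
}
```

In this model the number of oracle queries grows linearly with the window size divided by the map length, plus a logarithmic term for the final binary search, so a larger map makes the coarse scan cheaper.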

CVE-2024-49848: FASTRPC_ATTR_KEEP_MAP logic bug allows fastrpc_internal_munmap_fd to racily free in-use mappings leading to UAF

One critically important aspect of the fastrpc_internal_mem_unmap and fastrpc_internal_munmap functions is their reliance on fastrpc_mmap_remove when trying to find a map to delete. This function contains a list of checks to attempt to ensure that it cannot free a map presently in use:

//Entered with fl mapping mutexes held

static int fastrpc_mmap_remove(struct fastrpc_file *fl, int fd, uintptr_t va,

                               size_t len, struct fastrpc_mmap **ppmap)

{

        struct fastrpc_mmap *match = NULL, *map;

        struct hlist_node *n;

        struct fastrpc_apps *me = &gfa;

        unsigned long irq_flags = 0;

...

        hlist_for_each_entry_safe(map, n, &fl->maps, hn) {

                if ((fd < 0 || map->fd == fd) && map->raddr == va &&

                        map->raddr + map->len == va + len &&

                        map->refs == 1 && //verifies only 1 reference (from map creation)

                        /* Remove if only one reference map and no context map */

                        !map->ctx_refs && //And that no context holds a reference (important because context creation can create maps as well)

                        /* Skip unmap if it is fastrpc shell memory */

                        !map->is_filemap) {

                        match = map;

                        hlist_del_init(&map->hn);

                        break;

                }

        }

        if (match) {

                *ppmap = match;

                return 0;

        }

        return -ETOOMANYREFS;

}

This function tries to ensure that no references to a map exist other than the initial reference set upon creation of the map. This works only as long as no reference to the map exists concurrently with this function without being backed by an explicit refcount (map->refs > 1 || map->ctx_refs > 0). This is, in fact, precisely the invariant violated by CVE-2024-33060! In that case, we have a transient stack-based reference (from map creation) that doesn’t take an explicit reference (map->refs == 1) and that is concurrent with this function (since the transient reference was held beyond the global mutexing lock).

This invariant check is clearly a bit fragile to begin with, but there are at least two other paths to potential map destruction that don’t utilize this fastrpc_mmap_remove path and the guarantees it tries to provide. One of these paths is fastrpc_internal_munmap_fd:

/*

 *        fastrpc_internal_munmap_fd can only be used for buffers

 *        mapped with persist attributes. This can only be called

 *        once for any persist buffer

 */

int fastrpc_internal_munmap_fd(struct fastrpc_file *fl,

                                struct fastrpc_ioctl_munmap_fd *ud)

{

        int err = 0;

        struct fastrpc_mmap *map = NULL;

        ...

        mutex_lock(&fl->internal_map_mutex);

        mutex_lock(&fl->map_mutex);

        err = fastrpc_mmap_find(fl, ud->fd, NULL, ud->va, ud->len, 0, 0, &map);

        if (err) {

                ...

                mutex_unlock(&fl->map_mutex);

                goto bail;

        }

        if (map && (map->attr & FASTRPC_ATTR_KEEP_MAP)) {

                map->attr = map->attr & (~FASTRPC_ATTR_KEEP_MAP);

                fastrpc_mmap_free(map, 0);

        }

        mutex_unlock(&fl->map_mutex);

bail:

        mutex_unlock(&fl->internal_map_mutex);

        return err;

}

We can see that this function finds a map and calls fastrpc_mmap_free on that map if the flag FASTRPC_ATTR_KEEP_MAP is set. It also unsets this flag, so it’s impossible to call this function on the same map more than once.  Looking inside of fastrpc_mmap_create we see a corresponding line of code that adds an additional reference in the case where this flag is set:

static int fastrpc_mmap_create(struct fastrpc_file *fl, int fd, struct dma_buf *buf,

        unsigned int attr, uintptr_t va, size_t len, int mflags,

        struct fastrpc_mmap **ppmap)

{

        ...

        map = kzalloc(sizeof(*map), GFP_KERNEL);

        ...

        INIT_HLIST_NODE(&map->hn);

        map->flags = mflags;

        map->refs = 1;

        map->fl = fl;

        map->fd = fd;

        map->attr = attr;

        ...

        map->ctx_refs = 0;

        ktime_get_real_ts64(&map->map_start_time);

        if (mflags == ADSP_MMAP_HEAP_ADDR ||

                                mflags == ADSP_MMAP_REMOTE_HEAP_ADDR) {

                ...

        } else if (mflags == FASTRPC_MAP_FD_NOMAP) {

                ...

        } else {

                if (map->attr && (map->attr & FASTRPC_ATTR_KEEP_MAP)) {

                        ADSPRPC_INFO("buffer mapped with persist attr 0x%x\n",

                                (unsigned int)map->attr);

                        map->refs = 2; //References increases to 2

                }

                ...

                map->va = va;

        }

        map->len = len;

        ...

        fastrpc_mmap_add(map);

        *ppmap = map;

bail:

        ...

        return err;

}

However, we can see that map->refs is only bumped in the default case, where mflags isn’t equal to one of ADSP_MMAP_HEAP_ADDR, ADSP_MMAP_REMOTE_HEAP_ADDR or FASTRPC_MAP_FD_NOMAP. We can also see that FASTRPC_ATTR_KEEP_MAP can be set regardless of the mflags value - so it is still possible to create a FASTRPC_ATTR_KEEP_MAP map with map->refs == 1! Such a map is visible to a fastrpc_internal_munmap_fd call, which does not guarantee that the invariant provided by fastrpc_mmap_remove holds when fastrpc_mmap_free gets called. This can be a problem, for example, when a context takes a reference to a FASTRPC_MAP_FD_NOMAP mapping with the FASTRPC_ATTR_KEEP_MAP attribute set in get_args:

static int get_args(uint32_t kernel, struct smq_invoke_ctx *ctx)

{

        remote_arg64_t *rpra, *lrpra;

        remote_arg_t *lpra = ctx->lpra;

        ...

        int mflags = 0;

        ...

        for (i = 0; i < bufs; ++i) {

                uintptr_t buf = (uintptr_t)lpra[i].buf.pv;

                size_t len = lpra[i].buf.len;

                mutex_lock(&ctx->fl->map_mutex);

                if (ctx->fds && (ctx->fds[i] != -1))

                        err = fastrpc_mmap_create(ctx->fl, ctx->fds[i], NULL,

                                        ctx->attrs[i], buf, len,

                                        mflags, &ctx->maps[i]);//Can take a reference to an existing mapping

                if (ctx->maps[i])

                        ctx->maps[i]->ctx_refs++;

                mutex_unlock(&ctx->fl->map_mutex);

...

        }

map->ctx_refs is set to greater than zero, but this does not stop fastrpc_internal_munmap_fd’s call to fastrpc_mmap_free from calling kfree on the map:

static void fastrpc_mmap_free(struct fastrpc_mmap *map, uint32_t flags)

{

        ...

        if (map->flags == ADSP_MMAP_HEAP_ADDR ||

                                map->flags == ADSP_MMAP_REMOTE_HEAP_ADDR) {

                ...

        } else {

                map->refs--;

                if (!map->refs && !map->ctx_refs)

                        hlist_del_init(&map->hn);

                if (map->refs > 0 && !flags) //This is the only relevant bailout to avoid freeing map - ctx_refs value does not influence free decision

                        return;

        }

        ...

bail:

        if (!map->is_persistent)

                kfree(map); //Map is destroyed here!

}

This means it’s theoretically possible to create a UAF mapping if a fastrpc context holds the only reference to a FASTRPC_MAP_FD_NOMAP mapping with the FASTRPC_ATTR_KEEP_MAP attribute set. This cocktail of flags and attributes is possible with fastrpc_internal_mem_map, however it’s not immediately apparent how to cause a context to hold the only reference. The only intended circumstance where a context holds the sole reference to a map is when context creation and initialization leads directly to map creation - but in that map creation path, it’s not possible to specify the flags and attributes necessary to create the needed edge case. We need to have fastrpc_internal_mem_map create the map, have a context take a reference, and then somehow drop the initial reference that map creation provides. We cannot do this with fastrpc_internal_mem_unmap (because of the guarantees provided by fastrpc_mmap_remove) but we CAN racily do this by taking the fastrpc_internal_mem_map bailout after the map is created and a context has taken a reference!

int fastrpc_internal_mem_map(struct fastrpc_file *fl,

                                struct fastrpc_ioctl_mem_map *ud)

{

        int err = 0;

        struct fastrpc_mmap *map = NULL;

        mutex_lock(&fl->internal_map_mutex);

        ...

        mutex_lock(&fl->map_mutex);

        VERIFY(err, !(err = fastrpc_mmap_create(fl, ud->m.fd, NULL, ud->m.attrs,

                        ud->m.vaddrin, ud->m.length,

                         ud->m.flags, &map))); //Create the map here

        mutex_unlock(&fl->map_mutex);

        ... //Have a context take a reference to the created map here

        VERIFY(err, !(err = fastrpc_mem_map_to_dsp(fl, ud->m.fd, ud->m.offset,

                ud->m.flags, map->va, map->phys, map->size, &map->raddr))); //Fail this

        if (err)

                goto bail; //bailout

        ud->m.vaddrout = map->raddr;

bail:

        if (err) {

                ...

                if (map) {

                        mutex_lock(&fl->map_mutex);

                        fastrpc_mmap_free(map, 0); //Drop reference leaving the context as the only reference holder

                        mutex_unlock(&fl->map_mutex);

                }

        }

        mutex_unlock(&fl->internal_map_mutex);

        return err;

}

Consider two concurrent processes (A and B) implementing the following sequence of events:

[A]: Completely fills the dsp address space with valid mappings using fastrpc_internal_mem_map

[A]: Creates a FASTRPC_MAP_FD_NOMAP map with attribute FASTRPC_ATTR_KEEP_MAP using fastrpc_internal_mem_map and enters into fastrpc_mem_map_to_dsp (holding internal_map_mutex, dropped map_mutex)

map->refs == 1, map->ctx_refs == 0

[B]: Invokes a call using FASTRPC_IOCTL_INVOKE2 and creates a context, get_args grabs the map mutex, finds and grabs a reference to map, drops the map mutex (not holding any mutexes)

map->refs == 2, map->ctx_refs == 1

[A]: fastrpc_mem_map_to_dsp fails as the dsp address space is completely full, fastrpc_internal_mem_map bails out and calls fastrpc_mmap_free, dropping the internal_map_mutex (not holding any mutexes)

map->refs == 1, map->ctx_refs == 1

[A]: Calls fastrpc_internal_munmap_fd, which grabs internal_map_mutex and map_mutex, finds map with fastrpc_mmap_find, and calls fastrpc_mmap_free because the FASTRPC_ATTR_KEEP_MAP attribute is set

map->refs == 0, map->ctx_refs == 1, mapping is kfree'd
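The interleaving above can be replayed single-threaded in a toy model to check the refcount arithmetic. This is a sketch only: toy_map, toy_mmap_free, toy_munmap_fd and toy_replay are hypothetical names, TOY_ATTR_KEEP_MAP is a placeholder bit rather than the driver's real value, and the locking and the actual race window are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_ATTR_KEEP_MAP 0x1u  /* placeholder bit, not the driver's value */

struct toy_map {
    int refs;
    int ctx_refs;
    unsigned attr;
    bool freed;
};

/* Mirrors the non-heap branch of fastrpc_mmap_free: only refs gates the free. */
void toy_mmap_free(struct toy_map *m)
{
    assert(!m->freed);
    m->refs--;
    if (m->refs > 0)
        return;
    m->freed = true;            /* stands in for kfree(map) */
}

/* Mirrors fastrpc_internal_munmap_fd: one extra put when KEEP_MAP is set. */
void toy_munmap_fd(struct toy_map *m)
{
    if (m->attr & TOY_ATTR_KEEP_MAP) {
        m->attr &= ~TOY_ATTR_KEEP_MAP;
        toy_mmap_free(m);
    }
}

/* Replays the steps listed above and returns the final state of the map. */
struct toy_map toy_replay(void)
{
    /* [A] FASTRPC_MAP_FD_NOMAP + KEEP_MAP creation: refs stays at 1 */
    struct toy_map m = { 1, 0, TOY_ATTR_KEEP_MAP, false };

    /* [B] get_args finds the map: one ref plus one context ref */
    m.refs++;
    m.ctx_refs++;

    /* [A] fastrpc_mem_map_to_dsp fails, the bailout drops the creation ref */
    toy_mmap_free(&m);

    /* [A] fastrpc_internal_munmap_fd drops the remaining ref */
    toy_munmap_fd(&m);

    return m;
}
```

In this model toy_replay() returns a map with freed set while ctx_refs is still 1, which is the dangling context reference the sequence describes.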

At the end of this sequence, an existing context still holds a reference to the freed map. An example crash from this bug is:

[42694.423088] [0:            poc: 7171] Unable to handle kernel paging request at virtual address 006b6b6b6b6b6c27

[42694.423153] [0:            poc: 7171] PC Code: b4000115 7100111f 540000c0 7100211f 54000080 (b940beb4) 71001e9f 54001ba2 7100211f 54000060 7100111f 54000c21 d0000120 91040000 95658fb7 b9406a68 aa0003e1 71000508 b9006a68 54000181

[42694.423167] [0:            poc: 7171] LR Code: 2a1f03e1 97ffeafe (91002294) eb1402bf

[42694.423229] [0:            poc: 7171] Mem abort info:

[42694.423236] [0:            poc: 7171]   ESR = 0x96000004

[42694.423243] [0:            poc: 7171]   EC = 0x25: DABT (current EL), IL = 32 bits

[42694.423259] [0:            poc: 7171]   SET = 0, FnV = 0

[42694.423265] [0:            poc: 7171]   EA = 0, S1PTW = 0

[42694.423270] [0:            poc: 7171]   FSC = 0x04: level 0 translation fault

[42694.423277] [0:            poc: 7171] Data abort info:

[42694.423281] [0:            poc: 7171]   ISV = 0, ISS = 0x00000004

[42694.423288] [0:            poc: 7171]   CM = 0, WnR = 0

[42694.423294] [0:            poc: 7171] [006b6b6b6b6b6c27] address between user and kernel address ranges

[42694.423304] [0:            poc: 7171] Internal error: Oops: 96000004 [#1] PREEMPT SMP

...

[42694.424942] [0:            poc: 7171] Hardware name: Samsung DM1Q PROJECT (board-id,13) (DT)

[42694.424947] [0:            poc: 7171] pstate: 22400005 (nzCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)

[42694.424954] [0:            poc: 7171] pc : fastrpc_mmap_free+0x58/0x734 [frpc_adsprpc]

[42694.425067] [0:            poc: 7171] lr : context_free+0x130/0x2cc [frpc_adsprpc]

[42694.425173] [0:            poc: 7171] sp : ffffffc036c4b8d0

[42694.425178] [0:            poc: 7171] x29: ffffffc036c4b920 x28: 0000000000000000 x27: ffffffc036c4bba8

[42694.425190] [0:            poc: 7171] x26: 0000000000000003 x25: 00000049216800d0 x24: ffffffc003d62390

[42694.425198] [0:            poc: 7171] x23: ffffffc003d4ff88 x22: 0000000000000002 x21: 6b6b6b6b6b6b6b6b

[42694.425206] [0:            poc: 7171] x20: 00000000ffffffff x19: ffffff88ef024d00 x18: ffffffc01e79d028

[42694.425215] [0:            poc: 7171] x17: ffffffffffffffff x16: 0000000000000004 x15: 0000000000000004

[42694.425222] [0:            poc: 7171] x14: ffffff8957f20000 x13: 0000000000001c4a x12: 0000000000000003

[42694.425231] [0:            poc: 7171] x11: 0000000100449c4a x10: ffffff8846550040 x9 : 0000000000000000

[42694.425239] [0:            poc: 7171] x8 : 000000006b6b6b6b x7 : 2820637072707364 x6 : 61203a726f727245

[42694.425247] [0:            poc: 7171] x5 : ffffff88002ada57 x4 : 6363346478302041 x3 : 0000000000000000

[42694.425255] [0:            poc: 7171] x2 : ffffff8846550040 x1 : 0000000000000000 x0 : ffffff88ef024d00

[42694.425263] [0:            poc: 7171] Call trace:

[42694.425267] [0:            poc: 7171]  fastrpc_mmap_free+0x58/0x734 [frpc_adsprpc]

[42694.425373] [0:            poc: 7171]  context_free+0x130/0x2cc [frpc_adsprpc]

[42694.425479] [0:            poc: 7171]  fastrpc_internal_invoke+0xb88/0x1ef4 [frpc_adsprpc]

[42694.425584] [0:            poc: 7171]  fastrpc_internal_invoke2+0x320/0x408 [frpc_adsprpc]

[42694.425689] [0:            poc: 7171]  fastrpc_device_ioctl+0x1b0/0x92c [frpc_adsprpc]

[42694.425795] [0:            poc: 7171]  __arm64_sys_ioctl+0x120/0x170

[42694.425805] [0:            poc: 7171]  invoke_syscall+0x58/0x13c

[42694.425811] [0:            poc: 7171]  el0_svc_common+0xb4/0xf0

[42694.425817] [0:            poc: 7171]  do_el0_svc+0x24/0x90

[42694.425823] [0:            poc: 7171]  el0_svc+0x20/0x7c

[42694.425828] [0:            poc: 7171]  el0t_64_sync_handler+0x84/0xe4

[42694.425833] [0:            poc: 7171]  el0t_64_sync+0x1b8/0x1bc

[42694.425842] [0:            poc: 7171] Code: 7100111f 540000c0 7100211f 54000080 (b940beb4)

[42694.425853] [0:            poc: 7171] ---[ end trace 7349f07610aa0ad6 ]---

[42694.425862] [0:            poc: 7171] Kernel panic - not syncing: Oops: Fatal exception

CVE-2024-43047 (ITW): Map collision leads to UAF on 4.x kernels, and some 5.x kernel configurations

At this point, it became apparent that two of these issues bear more than a passing resemblance to the exploit logs. The kernel panics (particularly the first one) strongly suggest that a fastrpc_mmap struct is involved in the initial memory corruption primitive, and the exploit makes calls to several ioctls that are responsible for the creation and administration of these structs. Additionally we previously saw memory corruption in the chain of structs/members map->fl->cid. The logs make a lot of sense in the context of a UAF of this fastrpc_mmap struct, meaning that CVE-2024-33060 and CVE-2024-49848 stand out as particularly plausible candidates to have been the ITW issue used by the attacker. However, if the exploit were triggering CVE-2024-33060, I would have expected it to generate some kernel log lines that we don’t see in the ITW artifacts. CVE-2024-49848 is disqualified on the basis of working upon strictly newer kernels than the ITW exploit was used against.

Complicating the identification of the ITW bug, the exploit repeatedly hits several early bailouts in the ioctl handlers - these bailouts have no discernible side effect and it’s unclear why the exploit exercises this code at all. It is difficult to know what log lines from the exploit are due to this behavior and what log lines are related to the exploit exercising the bug used. But the kernel panics themselves do not lie. They are indisputable records of memory corruption, and (particularly in cases where the crash came from the exploit program itself) their associated stack traces are strong indicators of proximity to the bug or to the exploit strategy.

A context treats references to a map differently depending on whether the map is used as a buffer or as a handle. The context has a dynamically allocated array for map pointer references in ctx->maps, and both buffers and handles are referenced in this array.

static int get_args(uint32_t kernel, struct smq_invoke_ctx *ctx)

{

        ...

        uint32_t sc = ctx->sc;

        int inbufs = REMOTE_SCALARS_INBUFS(sc);

        int outbufs = REMOTE_SCALARS_OUTBUFS(sc);

        int handles, bufs = inbufs + outbufs;

        ...

        for (i = 0; i < bufs; ++i) { //buffer references

                ...

                if (ctx->fds && (ctx->fds[i] != -1))

                        err = fastrpc_mmap_create(ctx->fl, ctx->fds[i], NULL,

                                        ctx->attrs[i], buf, len,

                                        mflags, &ctx->maps[i]);

                if (ctx->maps[i])

                        ctx->maps[i]->ctx_refs++;

                ...

        }

        ...

        handles = REMOTE_SCALARS_INHANDLES(sc) + REMOTE_SCALARS_OUTHANDLES(sc);

        ...

        for (i = bufs; i < bufs + handles; i++) { //handle references

                ...

                if (!dsp_cap_ptr->dsp_attributes[DMA_HANDLE_REVERSE_RPC_CAP] &&

                                        ctx->fds && (ctx->fds[i] != -1))

                        err = fastrpc_mmap_create(ctx->fl, ctx->fds[i], NULL,

                                        FASTRPC_ATTR_NOVA, 0, 0, dmaflags,

                                        &ctx->maps[i]);

                if (!err && ctx->maps[i])

                        ctx->maps[i]->ctx_refs++;

                if (err) {

                        for (j = bufs; j < i; j++) {

                                if (ctx->maps[j] && ctx->maps[j]->ctx_refs)

                                        ctx->maps[j]->ctx_refs--;

                                fastrpc_mmap_free(ctx->maps[j], 0);

                        }

                        ...

                }

                ...

        }

        mutex_unlock(&ctx->fl->map_mutex);

However only buffers are freed using the pointer reference in ctx->maps. In fact, once handle maps are initialized without error, their reference in ctx->maps is never used again. Instead, the DSP is given a list of values associated with map file descriptors, and later passes these file descriptors back to the adsprpc driver in the AP once it is done using them. The AP then finds a map associable with the file descriptor returned by the DSP and drops a reference on it, potentially freeing it:

static int put_args(uint32_t kernel, struct smq_invoke_ctx *ctx,

                    remote_arg_t *upra)

{

        ...

        for (i = 0; i < M_FDLIST; i++) {

                if (!fdlist[i])

                        break;

                if (!fastrpc_mmap_find(ctx->fl, (int)fdlist[i], NULL, 0, 0,

                                        0, 0, &mmap)) {

                        if (mmap && mmap->ctx_refs)

                                mmap->ctx_refs--;

                        fastrpc_mmap_free(mmap, 0);

                }

        ...

}

This is wrong because there is no guarantee that the map found and de-refcounted in put_args is the same map that get_args previously took a reference on; it could in fact be a map still referenced by another context as a buffer. I discovered that this can happen if there are map collisions (cases where two created maps would fulfill a fastrpc_mmap_find request). Since fastrpc_mmap_find will find any mapping that encompasses the searched virtual-address range, you can create a mapping B that collides with future searches for mapping A by comprising a superset of mapping A's virtual address range, e.g. by setting mapB->va == mapA->va && mapB->len > mapA->len.
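The collision can be illustrated with a toy list search that uses the same range test as fastrpc_mmap_find. This is a sketch under stated assumptions: toy_map, toy_add, toy_find and toy_collision are hypothetical names, and the model assumes that newly created maps are inserted at the head of the list, so the most recent map is examined first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_map {
    char id;
    int fd;
    uint64_t va;
    uint64_t len;
    struct toy_map *next;
};

/* Head insertion: the most recently added map is searched first (assumption). */
void toy_add(struct toy_map **head, struct toy_map *m)
{
    m->next = *head;
    *head = m;
}

/* First map whose range contains [va, va + len] and whose fd matches. */
struct toy_map *toy_find(struct toy_map *head, int fd, uint64_t va, uint64_t len)
{
    struct toy_map *m;

    for (m = head; m != NULL; m = m->next)
        if (va >= m->va && va + len <= m->va + m->len && m->fd == fd)
            return m;
    return NULL;
}

/* Returns the id of the map found when searching for mapping A's exact range. */
char toy_collision(void)
{
    struct toy_map a = { 'A', 3, 0, 0x1000, NULL };    /* small mapping */
    struct toy_map b = { 'B', 3, 0, 0x100000, NULL };  /* same fd and va, larger */
    struct toy_map *head = NULL;
    struct toy_map *hit;

    toy_add(&head, &a);
    toy_add(&head, &b);

    hit = toy_find(head, a.fd, a.va, a.len);
    assert(hit != NULL);
    return hit->id;
}
```

In this model toy_collision() returns 'B': a lookup keyed on mapping A's fd and range resolves to the larger mapping B, so any reference dropped on the result lands on the wrong map.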

An example of this map collision that leads to memory corruption is:

  1. Create a small mapping A with va == 0 using fastrpc_internal_mmap
  2. Create context 1, get_args gets a reference to mapping A as a handle.
  3. Create a BIG second mapping B with va == 0 using fastrpc_internal_mmap with the same fd as mapping A.
  4. Create context 2, grab a reference to mapping B as a buffer (vs a handle) so that we use the ctx->maps pointer.
  5. Complete context 1, causing put_args to be called. We find and drop a refcount on mapping B since it collides with mapping A. Mapping A’s refcount is permanently leaked.
  6. We then unmap mapping B using FASTRPC_IOCTL_MUNMAP

Now there is a still valid context (context 2 in the above example) that still has a reference to mapping B even though mapping B was freed. Here’s an example kernel panic when triggering this bug:

[93168.108618]  [3:            poc: 8227] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018

[93168.108656]  [3:            poc: 8227] Mem abort info:

[93168.108675]  [3:            poc: 8227]   ESR = 0x96000006

[93168.108696]  [3:            poc: 8227]   Exception class = DABT (current EL), IL = 32 bits

[93168.108716]  [3:            poc: 8227]   SET = 0, FnV = 0

[93168.108735]  [3:            poc: 8227]   EA = 0, S1PTW = 0

[93168.108754]  [3:            poc: 8227] Data abort info:

[93168.108773]  [3:            poc: 8227]   ISV = 0, ISS = 0x00000006

[93168.108792]  [3:            poc: 8227]   CM = 0, WnR = 0

[93168.108816]  [3:            poc: 8227] user pgtable: 4k pages, 39-bit VAs, pgdp = 00000000a5933260

[93168.108837]  [3:            poc: 8227] [0000000000000018] pgd=08000001926ef003, pud=08000001926ef003, pmd=0000000000000000

[93168.108905]  [3:            poc: 8227] Internal error: Oops: 96000006 [#1] PREEMPT SMP

[93168.108928]  [3:            poc: 8227] Modules linked in:

[93168.108951]  [3:            poc: 8227] Process poc (pid: 8227, stack limit = 0x0000000041a3591e)

[93168.108979]  [3:            poc: 8227] CPU: 3 PID: 8227 Comm: poc FTT: 0 0 Tainted: G S      W         4.19.113-27095354 #1

[93168.109000]  [3:            poc: 8227] Hardware name: Samsung X1Q PROJECT - SM-G981V_REV0.3_PV3 (board-id,20) (DT)

[93168.109024]  [3:            poc: 8227] pstate: 80400005 (Nzcv daif +PAN -UAO)

[93168.109047]  [3:            poc: 8227] pc : dma_buf_unmap_attachment+0x20/0x58

[93168.109069]  [3:            poc: 8227] lr : fastrpc_mmap_free+0x3dc/0x4e8

[93168.109088]  [3:            poc: 8227] sp : ffffff80302239c0

[93168.109107]  [3:            poc: 8227] x29: ffffff80302239c0 x28: 0000000000000002 

[93168.109131]  [3:            poc: 8227] x27: 0000000000010100 x26: ffffff8030223bf0 

[93168.109154]  [3:            poc: 8227] x25: ffffffc0c518f410 x24: ffffffc09152ce00 

[93168.109177]  [3:            poc: 8227] x23: ffffff800b48d0b0 x22: 0000000000010000 

[93168.109199]  [3:            poc: 8227] x21: 00000004f79e0000 x20: 0000000000000003 

[93168.109222]  [3:            poc: 8227] x19: ffffffc0145d2e00 x18: 0000000000000000 

[93168.109244]  [3:            poc: 8227] x17: 0000000000000000 x16: a70d810816bbf5af 

[93168.109267]  [3:            poc: 8227] x15: 0000000000000008 x14: 0000000080000000 

[93168.109289]  [3:            poc: 8227] x13: 0000000034155555 x12: 001d067cf6237800 

[93168.109312]  [3:            poc: 8227] x11: 0000000000000007 x10: 0000000000000003 

[93168.109334]  [3:            poc: 8227] x9 : 0000000000000000 x8 : 0000000000000000 

[93168.109357]  [3:            poc: 8227] x7 : 0001010000000004 x6 : ffffff8030223c08 

[93168.109379]  [3:            poc: 8227] x5 : ffffff8030223c08 x4 : 0000000000000004 

[93168.109401]  [3:            poc: 8227] x3 : 0000000037c648b0 x2 : 0000000000000000 

[93168.109423]  [3:            poc: 8227] x1 : ffffffc111df9800 x0 : ffffffc111df9680 

[93168.114754]  [3:            poc: 8227] Call trace:

[93168.114777]  [3:            poc: 8227]  dma_buf_unmap_attachment+0x20/0x58

[93168.114799]  [3:            poc: 8227]  fastrpc_mmap_free+0x3dc/0x4e8

[93168.114821]  [3:            poc: 8227]  fastrpc_internal_invoke+0x1654/0x24b0

[93168.114843]  [3:            poc: 8227]  fastrpc_device_ioctl+0xcd4/0x1df8

[93168.114865]  [3:            poc: 8227]  do_vfs_ioctl+0x6f0/0xac0

[93168.114887]  [3:            poc: 8227]  __arm64_sys_ioctl+0x74/0xa8

[93168.114909]  [3:            poc: 8227]  el0_svc_common+0xd8/0x188

[93168.114931]  [3:            poc: 8227]  el0_svc_handler+0x6c/0x90

[93168.114952]  [3:            poc: 8227]  el0_svc+0x8/0x280

[93168.114977]  [3:            poc: 8227] Code: b4000121 f9400008 b40000e8 f9401108 (f9400d08) 

[93168.115002]  [3:            poc: 8227] ---[ end trace 2a2f45652740934f ]---

[93168.197067]  [3:            poc: 8227] Kernel panic - not syncing: Fatal exception

Another researcher, Conghui Wang, appears to have discovered a different path to reach the same bug that does not involve map collisions. Since the code running on the DSP can be from an unsigned ELF binary uploaded by the attacker, it’s possible for an attacker to simply have the DSP send back bogus fds in an RPC response, causing the AP kernel to free mappings that are still referenced.

This bug retains a similarly high level of resemblance to the original In-The-Wild exploit as bugs CVE-2024-33060 and CVE-2024-49848 did. It appears to be a UAF on the same object that was likely exploited given the ITW kernel logs (struct fastrpc_mmap), and unlike the other discovered bugs cannot be disqualified on the basis of non-intersecting version ranges or log messages. While we cannot prove beyond a doubt that this is the same bug used by the ITW attacker, after careful consideration TAG and Project Zero determined that this issue met the bar for being considered as exploited ITW. We are confident that this driver is under active exploitation by real-world attackers and that all the bugs resolved as part of this research had an outsized impact in preventing in-the-wild exploitation.

Excavating an exploit

After discovering these issues, I looked into potential ways to exploit a fastrpc_mmap struct vulnerability. Finding a compatible object to heap spray that would yield a meaningfully improved primitive (especially without an ASLR leak) is not a trivial task for this bug. While looking across the Linux kernel for useful objects to spray, my attention was once again drawn back to the ITW exploit logs, in particular the logs we saw previously involving an invalid channel id:

[   40.917577] adsprpc: ERROR:fastrpc_mmap_free, Invalid channel id: 1702834303, err:-44

The channel id value comes from map->fl->cid, so it seems exceedingly likely that whatever heap spray object the attacker used had a timestamp value at this offset. Perhaps it would be possible to discover what object the attacker was heap spraying! After I searched fruitlessly for time-related members of variable-sized structs, my attention was drawn to kernel panic log 3:

[ 2247.159424] Unable to handle kernel paging request at virtual address 006f7778a9cf5b88

...

[ 2247.159719] Call trace:

[ 2247.159727]  __kmalloc+0x1c4/0x398

[ 2247.159740]  inotify_handle_event+0xc8/0x1c8

[ 2247.159746]  fsnotify+0x270/0x378

Here we see a crash inside a heap allocation that, crucially, occurred before the point at which the invalid channel id messages typically appeared in the log. Looking at inotify_handle_event more closely, we see the following variable-sized allocation:

int inotify_handle_event(struct fsnotify_group *group,

                         struct inode *inode,

                         u32 mask, const void *data, int data_type,

                         const unsigned char *file_name, u32 cookie,

                         struct fsnotify_iter_info *iter_info)

{

        ...

        struct inotify_event_info *event;

        ...

        int alloc_len = sizeof(struct inotify_event_info);

        ...

        if (file_name) {

                len = strlen(file_name);

                alloc_len += len + 1;

        }

        ...

        event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
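The allocation size in the excerpt above is the fixed struct size plus the length of the file name plus one byte for the terminator, so whoever names the watched file influences how many bytes are requested from kmalloc. The following is a minimal sketch of that arithmetic. The base size is passed in as a parameter because sizeof(struct inotify_event_info) depends on the kernel build, and both function names are made up for illustration. The only outside fact used is that a single Linux path component is at most 255 bytes long (NAME_MAX).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Longest single path component on Linux (NAME_MAX). */
#define TOY_NAME_MAX 255

/*
 * Mirrors the arithmetic in the excerpt: the fixed struct size, plus
 * the name and its NUL terminator when a name is present.
 * base_size stands in for sizeof(struct inotify_event_info).
 */
size_t event_alloc_len(size_t base_size, const char *file_name)
{
    size_t alloc_len = base_size;

    assert(base_size > 0);
    if (file_name)
        alloc_len += strlen(file_name) + 1;
    return alloc_len;
}

/*
 * Hypothetical inverse: the file name length that makes the request
 * exactly `target` bytes, or -1 if no name of 1..TOY_NAME_MAX bytes
 * can produce that size.
 */
long name_len_for_target(size_t base_size, size_t target)
{
    size_t len;

    if (target < base_size + 2)
        return -1; /* would need an empty or negative-length name */
    len = target - base_size - 1;
    if (len > TOY_NAME_MAX)
        return -1; /* longer than a single path component allows */
    return (long)len;
}
```

With a placeholder base size of 48 bytes (an assumption, not the real struct size), a 79-byte file name would produce a 128-byte request, while a 1024-byte request would be out of reach.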

I then compared the struct offsets for map->fl->cid and superimposed them onto a struct inotify_event_info object. We see that map->fl->cid aligns perfectly with event->fse.inode->i_mtime - the modification time for the file inode associated with the inotify event! Given that the cid is set to a timestamp value by the exploit, this makes perfect sense. It’s highly likely that part of the ITW exploit process involved spraying inotify_event_info objects in order to reclaim a UAF’d fastrpc_mmap struct (perhaps as part of an ASLR leak), and that the unrealistic channel id value is the modification time for whatever file the exploit used for the heap spray. Based on the pipe crash, it seems plausible it then reallocates the fastrpc_mmap struct with pipe buffers to gain full control of the underlying object. This, plus an info leak, would be more than enough to achieve code execution or arbitrary read/write.
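The superimposition described above can be illustrated with a toy model. Everything in the sketch below is hypothetical: the struct names, member names, and padding are invented so that the two layouts line up, and they do not reflect the real offsets of fastrpc_mmap, fastrpc_file, inotify_event_info, or struct inode. It also assumes a little-endian machine (as arm64 Android devices are), so that a 32-bit read at the start of a 64-bit seconds field returns its low 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-ins for the objects the stale pointer is expected to reach. */
struct toy_file {
    uint64_t pad[4];      /* placeholder for unrelated members */
    int32_t cid;          /* channel id that the driver logs */
};

struct toy_map {
    uint64_t pad[2];
    struct toy_file *fl;  /* followed by the driver on free */
};

/* Toy stand-ins for the sprayed object and what it points at. */
struct toy_inode {
    uint64_t pad[4];
    int64_t mtime_sec;    /* modification time in seconds */
};

struct toy_event {
    uint64_t pad[2];
    struct toy_inode *inode;
};

/* The toy layouts are built so that the interesting members coincide. */
_Static_assert(sizeof(struct toy_map) == sizeof(struct toy_event),
               "both objects must fit the same heap slot");
_Static_assert(offsetof(struct toy_map, fl) ==
               offsetof(struct toy_event, inode),
               "pointer members must overlap");
_Static_assert(offsetof(struct toy_file, cid) ==
               offsetof(struct toy_inode, mtime_sec),
               "cid must overlap the start of mtime_sec");

/* Returns the "channel id" observed when a freed slot holds an event. */
int32_t cid_seen_through_stale_map(int64_t mtime_sec)
{
    struct toy_inode inode = { .mtime_sec = mtime_sec };
    struct toy_event event = { .inode = &inode };
    unsigned char slot[sizeof(struct toy_event)];
    struct toy_map stale;
    int32_t cid;

    memcpy(slot, &event, sizeof(event));  /* spray reclaims the slot */
    memcpy(&stale, slot, sizeof(stale));  /* slot reread as a map */
    assert(stale.fl != NULL);

    /* Follow the confused pointer and read 4 bytes where cid would be. */
    memcpy(&cid,
           (const unsigned char *)stale.fl + offsetof(struct toy_file, cid),
           sizeof(cid));
    return cid;
}
```

In this model, a file modification time of 1702834303 seconds surfaces unchanged as the channel id, which mirrors the value seen in the ITW log line.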

Conclusion

It took less than 3 months of research to discover 6 separate bugs in the adsprpc driver, two of which (CVE-2024-49848 and CVE-2024-21455) were not fixed by Qualcomm within the industry-standard 90-day deadline. Furthermore, at the time of writing, CVE-2024-49848 remains unfixed 145 days after it was reported. Past research has shown that chipset drivers for Android are a promising target for attackers, and this ITW exploit is a meaningful real-world example of the risk that the current security posture of third-party vendor drivers poses to end users. A system’s cybersecurity is only as strong as its weakest link, and chipset/GPU drivers represent one of the weakest links for privilege separation on Android in 2024. Two crucial next steps toward making privilege escalation on Android devices more difficult are improving the consistency and quality of the code and making the patch dissemination process for third-party vendor drivers more efficient.