Thursday, September 28, 2017

Over The Air - Vol. 2, Pt. 1: Exploiting The Wi-Fi Stack on Apple Devices

Posted by Gal Beniamini, Project Zero

Earlier this year we performed research into Broadcom’s Wi-Fi stack. Due to the ubiquity of Broadcom’s stack, we chose to conduct our prior research through the lens of one affected family of products -- the Android ecosystem. To paint a more complete picture of the state of Wi-Fi security in the mobile ecosystem, we’ve chosen to revisit the topic - this time through the lens of Apple devices. In this research we’ll perform a deeper dive into each of the affected components, discover new attack surfaces, and finally construct a full over-the-air exploit chain against iPhones, allowing complete control over the target device.

Since there’s much ground to cover, we’ve chosen to split the research into a three-part blog series. The first blog post will focus on exploring the Wi-Fi stack itself and developing the necessary research tools to explore it on the iPhone. In the second blog post, we’ll perform research into the Wi-Fi firmware, discover multiple vulnerabilities, and develop an exploit allowing attackers to execute arbitrary code on the Wi-Fi chip itself, requiring no user-interaction. Lastly, in the final blog post we’ll explore the iPhone’s host isolation mechanisms, research the ways in which the Wi-Fi chip interacts with the host, and develop a fully-fledged exploit allowing attackers to gain complete control over the iOS kernel over-the-air, requiring no user interaction.
As we’ve mentioned before, Broadcom’s chips are present in a wide variety of devices - ranging from mobile phones to laptops (such as Chromebooks) and even Wi-Fi routers. While we’ve chosen to focus our attention on the Apple ecosystem this time around, it’s worth mentioning that the Wi-Fi firmware vulnerabilities presented in this research affect other devices as well. Additionally, as this research deals with a different attack surface in the Wi-Fi firmware, the breadth of affected devices might be wider than that of our prior research.

More concretely, the Wi-Fi vulnerabilities presented in this research affect many devices in the Android ecosystem. For example, two of the vulnerabilities (#1, #2) affect most of Samsung’s flagship devices, including the Galaxy S8, Galaxy S7 Edge and Galaxy S7. Of the two, one vulnerability is also known to affect Google devices such as the Nexus 6P, and some models of Chromebooks. As for Apple’s ecosystem, while this research deals primarily with iPhones, other devices including Apple TV and iWatch are similarly affected by our findings. The exact breadth of other affected devices has not been investigated further, but is assumed to be wider.

We’d also like to note that until hardware host isolation mechanisms are implemented across the Android ecosystem, every exploitable Wi-Fi firmware vulnerability directly results in complete host takeover. In our previous research we identified the lack of host isolation mechanisms on two of the most prominent SoC platforms; Qualcomm’s Snapdragon 810 and Samsung’s Exynos 8890. We are not aware of any advances in this regard, as of yet.

For the purpose of this research, we’ll demonstrate remote code execution on the iPhone 7 (the most recent iDevice at the time of this research), running iOS 10.2 (14C92). The vulnerabilities presented in this research are present in iOS up to (and including) version 10.3.3 (apart from #1, which was fixed in 10.3.3). Researchers wishing to port the provided research tools and exploits to other versions of iOS or to other iDevices would be required to adjust the referenced symbols.

Over the course of the blog post, we’ll begin fleshing out a memory research platform for iOS. Throughout this blog post series, we’ll rely on the framework extensively, to both analyse and explore components on the system, including the XNU kernel, hardware components, and the Wi-Fi chipset itself.

The vulnerabilities affecting Apple devices have been addressed in iOS 11. Similarly, those affecting Android have been addressed in the September bulletin. Note that within the Android ecosystem, OEMs bear the responsibility for providing their own Wi-Fi firmware images (partially due to their high level of customisation). Therefore the corresponding fixes should appear in the vendors’ own bulletins, rather than Android’s security bulletin.

Creating a Research Platform


Before we can begin exploring, we’ll need to lay down the groundwork first. Ideally, we’d like to create our own debugger -- allowing us to both inspect and instrument the Wi-Fi firmware, thereby making exploration (and subsequent exploit development) much easier.

During our previous research into Broadcom’s Wi-Fi chip within the context of the Android ecosystem, this task turned out to be much more straight-forward than expected. Instead of having to create an entire research environment from scratch, we relied on several properties provided by the Android ecosystem to speed up the development phase.

For starters, many Android devices allow developers to intentionally bypass their security model, using “rooted” builds (such as userdebug). Flashing such a build onto a device allows us to freely explore and interact with many components on the system. As the security model is only bypassed explicitly, the odds of side-effects resulting from our research affecting the system’s behaviour are rather slim.

Additionally, Broadcom provides their own debugging tools to the Android ecosystem, consisting of a command-line utility and a dedicated set of ioctls within Broadcom’s device driver, bcmdhd. These tools allow sufficiently privileged users to interact with the Wi-Fi chip in a variety of ways, including the ability to access the chip’s RAM directly -- an essential primitive when constructing a debugger. Basing our own toolset on this platform allowed us to create a rather comfortable research environment.

Furthermore, Android utilises the Linux Kernel, which is licensed under GPLv2. Therefore, the kernel’s source code, including that of the device drivers, is freely available. Reading through Broadcom’s device driver (bcmdhd) turned out to be an invaluable resource -- sparing us some unnecessary reverse-engineering while also allowing us to easily assess the ways in which the chip and host interact with one another.

Lastly, some of the data sheets pertaining to the Wi-Fi SoCs used on Android devices were made publicly available by Cypress following their acquisition of Broadcom’s IoT business. While most of the information in the data sheets is irrelevant to our research, we were able to gather a handful of useful clues regarding the architecture of the SoC itself.
Unfortunately, it appears we have no such luck this time around!

First, Apple does not provide a “developer-mode” iPhone, nor is there a mechanism to selectively bypass the security model. This means that in order to meaningfully explore the system, researchers are forced to subvert the device’s security model (i.e., by jailbreaking). Consequently, exploring different components within the device is made much more difficult.

Additionally, unlike the Android ecosystem, Apple has chosen to develop their entire host-side stack “from scratch”. Most importantly, the iOS drivers used to interact with Broadcom’s chip are written by Apple, and are not based on Broadcom’s FullMAC drivers (bcmdhd or brcmfmac). Other host-side utilities, such as Broadcom’s debugging toolchain, are thus also not included.

That said, Apple did develop their own mechanisms for accessing and debugging the chip. These capabilities are exposed via a set of privileged ioctls embedded in the IO80211Family driver. While the interface itself is undocumented, reverse-engineering the corresponding components in both the IO80211Family and AppleBCMWLANCore drivers reveals a rather powerful command channel, and one which could possibly be used for the purposes of our research. Unfortunately, access to this interface requires additional entitlements, thus preventing us from leveraging it (unless we escalate our privileges).

Lastly, there’s no overlap between the revisions of Wi-Fi chips used on Apple’s devices and those used in the Android ecosystem. As we’ll see later on, this might be due to the fact that Apple-specific Wi-Fi chips contain Apple-specific features. Regardless, perhaps unsurprisingly, none of the corresponding data sheets for these SoCs have been made available.
So… it appears we’ll have to deal with a proprietary chip, on a proprietary device running a proprietary operating system. We have our work cut out for us! That said, it’s not all doom and gloom; instead of relying on all of the above, we’ll just need to create our own independent research platform.

Acquiring the ROM?


Let’s start by analysing the SoC’s firmware and loading it up into a disassembler. As we’ve seen in the previous round of research, the Wi-Fi firmware consists of a small chunk of ROM containing most of the firmware’s data and code, and a larger blob of RAM housing all of the runtime data structures (such as the heap and stack), as well as patches to the ROM’s code.

Since the RAM blob is loaded into the Wi-Fi chip during its initialisation by the host, it should be accessible via the host’s root filesystem. Indeed, after downloading the iPhone’s firmware image, extracting the root filesystem and searching for indicative strings, we are greeted with the following result:
Great, so we’ve identified the firmware’s RAM. What’s more, it appears that the Wi-Fi chip embedded in the phone is a BCM4355C0, a model which I haven’t come across in Android devices in the past (also, it curiously does not appear under Broadcom’s website).

Regardless, having the RAM image is all well and good, but what about the ROM? After all, the majority of the code is stored in the chip’s ROM. Even if we were to settle for analysing the RAM alone, it’d be extremely difficult to reverse-engineer independently of the ROM as many of the functions in the former address data stored in the latter. Without knowing the ROM’s contents, or even its rudimentary layout, we’ll have to resort to guesswork.

However, this is where we run into a bit of a snag! To extract the ROM we’ll need to interact with the Wi-Fi chip itself... Whereas on Android we could simply use a “rooted” build to gain elevated privileges, and then access the Wi-Fi SoC via Broadcom’s debugging utilities, there are no comparable mechanisms on the iPhone. In that case, how will we interact with the chip and ultimately extract its ROM?

We could opt for a hardware-based research environment. Reviewing the data sheets for one of Broadcom’s Wi-Fi SoCs, BCM4339, reveals several interfaces through which the chip may be debugged, including UART and a JTAG interface.
That said, there are several disadvantages to this approach. First, we’d need to open up the device, locate the required interfaces, and make sure that we do not damage the phone in the process. Moreover, requiring a such a setup for each research device would cause us to incur significant start-up overhead. Perhaps most importantly, relying on a hardware-based approach would limit the amount of researchers who’d be willing to utilise our research platform -- both because hardware is a relatively specialised skill-set, and since people might (rightly) be wary of causing damage to their own devices.

So what about a completely software-based solution? After all, on Android devices we were able to access the chip’s memory solely using software. Perhaps a similar solution would apply to Apple devices?

To answer this question, let’s trace our way through the Android components involved in the control flow for accessing the Wi-Fi chip’s memory from the host. The flow begins with a user issuing a memory access command via Broadcom’s debugging utility (“membytes”). This, in turn, triggers an ioctl to Broadcom’s driver, requesting the memory access operation. After some processing within the driver, it performs the requested action by directly accessing the chip’s tightly-coupled memory (TCM) from the kernel’s Virtual Address-Space (VAS).

Two Registers Walk Into a BAR


As we’re mostly interested in the latter part, let’s disregard the Android-specific components for now and focus on the mechanism in bcmdhd allowing TCM access from the host.

Reviewing the driver’s code allows us to arrive at relevant code flow. First, the driver enables the PCIe-connected Wi-Fi chip. Then, it accesses the PCIe Configuration Space to program the Wi-Fi chip’s Base Address Registers (BARs). In keeping with the PCI standards, programming and mapping in the BARs into the host’s address space exposes functionality directly from the Wi-Fi SoC to the host, such as IO-Space or Memory Space access. Taking a closer look at Broadcom’s chips, they seem to provide two BARs in their configuration space; BAR0 and BAR1.

BAR0 is used to map-in registers corresponding to the different cores on the Wi-Fi SoC, including the ARM processor running the firmware’s logic, and more esoteric components such as the PCIe Gen 2 core on the Wi-Fi SoC. The cores themselves can be selected by accessing the PCIe configuration space once again, and programming the “BAR0 Window” register, directing it at the backplane address corresponding to the requested core.

BAR1, on the other hand, is used solely to map the Wi-Fi chip’s TCM into the host. Since Broadcom’s driver leverages the TCM access capability extensively, it maps-in BAR1 into the kernel’s virtual address space during the device’s initialisation, and doesn’t unmap it until the device shuts down. Once the TCM is mapped into the kernel, all subsequent memory accesses to the chip’s TCM are performed by simply modifying the mapped block within the kernel’s VAS. Any write operations made to the memory-mapped block are automatically reflected to the Wi-Fi chip’s RAM.
This is all well and good, but what about iOS? Since Apple develops their own drivers for interacting with Broadcom’s chips, what holds true in Broadcom’s drivers doesn’t necessarily apply to Apple’s drivers. After all, we could think of many different approaches to accessing the chip’s memory. For example, instead of mapping the entire TCM into the kernel’s memory, they might elect to only map-in certain regions of the TCM, to map it only on-demand, or even to rely on different chip-access mechanisms altogether.

To get to the bottom of this, we’ll need to start reverse-engineering Apple’s drivers. This can be done by extracting the kernelcache from the iPhone’s firmware and loading it into our favourite disassembler. After loading the kernel, we immediately come across two driver KEXTs related to Broadcom’s Wi-Fi chip; AppleBCMWLANCore and AppleBCMWLANBusInterfacePCIe.

Spending some time reverse-engineering the two drivers, it’s quickly evident what their corresponding roles are. AppleBCMWLANCore serves as a high-level driver, dealing mostly with configuring the Wi-Fi chip, handling incoming events, and chip-specific features such as offloading. In keeping with good design practices, the driver is unaware of the interface through which the chip is connected, allowing it to focus solely on the logic required to interact with the chip. In contrast, AppleBCMWLANBusInterfacePCIe, serves a complementary role; it is a low-level driver tasked with handling all the PCIe related communication protocols, dealing with MSI interrupts, and generally everything interface-related.

We’ll revisit the two drivers more in-depth later on, but for now it’s sufficient to say that we have a relatively good idea where to start looking for a potential TCM mapping -- after all, as we’ve seen, the TCM access is performed by mapping the PCIe BARs. Therefore, it would stand to reason that such an operation would be performed by AppleBCMWLANBusInterfacePCIe.

After reverse-engineering much of the driver, we come across a group of suspicious-looking functions that appear like candidates for TCM accessors. All the above functions serve the same purpose -- accessing a memory-mapped buffer, differing from one another only in the size of the word used (16, 32, or 64-bit). Anecdotally, the corresponding APIs for TCM access in the Android driver follow the same structure. What’s more, the above functions all reference the string “Memory”... We might be onto something!
Kernel Function 0xFFFFFFF006D1D9F0

Cross-referencing our way up the call-chain, it appears that all of the above functions are methods pertaining to instances of a single class, which incidentally bears the same name as that of the driver: AppleBCMWLANBusInterfacePCIe. Since several functions in the call-chain are virtual functions, we can locate the class’s VTable by searching for 64-bit words containing their addresses within the kernelcache.
To avoid unnecessary confusion between the object above and the driver, we’ll refer to the object for now on as the “PCIe object”, and we’ll refer to the driver by its full name; “AppleBCMWLANBusInterfacePCIe”.

Kernel Memory Analysis Framework


Now that we’ve identified mechanisms in the kernel possibly relating to the Wi-Fi chip’s TCM, our next course of action is to somehow access them. Had we been able to debug the iOS kernel, we could have simply placed a breakpoint on the aforementioned memory access functions, recorded the location of the shared buffer, and then used our debugger to freely access the buffer on our own. However, as it happens, iOS offers no such debugger. Indeed, having such a debugger would allow users to subvert the device’s security model...

Instead, we’ll have to create our kernel debugger!

Debuggers usually consist of two main pieces of functionality:
  1. The ability to modify the control flow of the program (e.g., by inserting breakpoints)
  2. The ability to inspect (and modify) the data being processed by the program

As it happens, modifying the kernel’s control flow on modern Apple devices (such as the iPhone 7) is far from trivial. These devices include a dedicated hardware component -- Apple’s Memory Cache Controller (AMCC), designed to prevent attackers from modifying the kernel’s code, even in the presence of full control over the kernel itself (i.e., EL1 code execution). While AMCC might make for an interesting research target in its own right, it’s not the main focus of our research at this time. Instead, we’ll have to make do with analysing and modifying the data processed by the kernel.

To gain access to the kernel, we’ll first need to exploit a privilege escalation vulnerability. Luckily, we can forgo all of the complexity involved in developing a functional kernel exploit, and instead rely on some excellent work by Ian Beer.

Earlier this year, Ian developed a fully-functional exploit allowing kernel code execution from any sandboxed process on the system. Upon successful execution, Ian’s exploit provides two primitives - memory-read and memory-write - allowing us to freely explore the kernel’s virtual address-space. Since the exploit was developed against iOS 10.2, we’ll need use the same version on our target iPhone to utilise it.

To allow for increased flexibility, we’ll aim to design our research platform to be modular; instead of tying the platform to a specific memory access mechanism, we’ll use Ian’s exploit as a “black-box”, only deferring memory accesses to the exploit’s primitives.

Moreover, it’s important that whatever system we build allows us to explore the device comfortably. Thinking about this for a moment, we can boil it down to a few basic requirements:
  1. The analysis should be done on a developer-friendly machine, not on the iPhone
  2. The platform should be scriptable and easily extensible
  3. The platform should be independent of the memory access mechanism used

To prevent any dependance on the memory access mechanism, we’ll implement a rudimentary command protocol, allowing clients to perform read or write operation, as well as offering an “execute” primitive for gadgets within the kernel’s VAS. Next, we’ll insert a small stub implementing this protocol into the exploit, allowing us to interface with the exploit as if it were a “black box”. As for the client, it can be executed on any machine, as long as it’s able to connect to the server stub and communicate using the above protocol.
A version of Ian Beer’s extra_recipe exploit with the aforementioned server stub can be found on our bug tracker, here.

Lastly, there’s the question of the research platform itself. For convenience sake, we’ve decided to develop the framework as a set of Python scripts, not unlike forensics frameworks such as Volatility. We’ll slowly grow the framework as we go along, adding scripts for each new data structure we come across.

Since the iOS kernel relies heavily on dynamic dispatch, the ability to explore the kernel in a shell-like interface allows us to easily resolve virtual call targets by inspecting the virtual pointers in the corresponding objects. We’ll use this ability extensively to assist our static analysis in place where the code is hard to untangle.

Over the course of our research we’ll develop several modules for the analysis framework, allowing interaction with objects within the XNU kernel, parts of IOKit, hardware components, and finally the Wi-Fi chip itself.

Setting Up a Test Network


Moving on, we’ll need to create a segregated test network, consisting of the target iPhone, a single MacBook (which we’ll use to interact with the iPhone), and a Wi-Fi router.

As our memory analysis framework transmits data over the network, both the iPhone and the MacBook must be able to communicate with one another. Additionally, as we’re using Xcode to deploy the exploit from the MacBook to the iPhone, it’d be advantageous if the test network allowed both devices to access the internet (so the developer profile could be verified).

Lastly, we require complete control over all aspects of our Wi-Fi router. This is since the next part of our research will deal extensively with the Wi-Fi layer. As such we’d like to reserve the ability to inject, modify and drop frames within our network -- primitives which may come in handy later on.

Putting the above requirements together, we arrive at the following basic topology:
In my own lab setup, the role of the Wi-Fi router is fulfilled by my ThinkPad laptop, running Ubuntu 16.04. I’ve connected two SoftMAC TL-WN722N dongles, one for each interface (internal and external). The internal network’s access-point is broadcast using hostapd, and the external interface connects to the internet using wpa_supplicant. Moreover, network-manager is disabled to prevent interference with our configuration.

Note that it’s imperative that the dongle used to broadcast the internal network’s access-point is a SoftMAC device (and not FullMAC) -- this will ensure that the MLME and MAC layers are processed by the host’s software (i.e., by the Linux Kernel and hostapd), allowing us to easily control the data transmitted over those layers.

The laptop is also minimally configured to perform IP forwarding and to serve as a NAT, in order to allow connections from the internal network out into the internet. In addition, I’ve set up both DNS and DHCP servers, to prevent the need for any manual configuration. I also recommend setting up DNS forwarding and blocking Apple’s software-update domains within your network (mesu.apple.com, appldnld.apple.com).

Depending on your work environment, it may be the case that many (or most) Wi-Fi channels are rather crowded, thereby reducing the signal quality substantially. While dropping frames doesn’t normally affect our ability to use the network (frames would simply be re-transmitted), it may certainly cause undesirable effects when attempting to run an over-the-air exploit (as re-transmissions may alter the firmware’s state substantially).

Anecdotally, scanning for nearby networks around my desk revealed around 60 Wi-Fi networks, causing quite a bit of noise (and frame loss). If you encounter the same issue, you can boost your RSSI by building a small cantenna and connecting it to your dongle:

Finding the TCM


Using our test network and memory analysis platform, let’s start exploring the kernel’s VAS!

We’ll begin the hunt by searching for the PCIe object within the kernel. After all, we know that finding the object will allow us to locate the suspect TCM mapping, bringing us closer to our goal of developing a Wi-Fi firmware debugger. Since we’re unable to place breakpoints, we’ll need to locate a “path” leading from a known memory location to that of the PCIe object.

So how will we identify the PCIe object once we come across it? Well, while the C++ standards do not explicitly specify how dynamic dispatch is implemented, most compilers tend to use the same ABI for this purpose -- the first word of every object containing virtual functions serves as a pointer to that object’s virtual table (commonly referred to as the “virtual pointer” or “vptr”). By leveraging this little tidbit, we can build our own object identification mechanism; simply read the first word of each object we come across, and check which virtual table it corresponds to. Since we’ve already located the VTable corresponding to the PCIe object we’re after, all we’d need to do is check each object against that address.

Now that we know how to identify the object, we can begin searching for it within the kernel. But where should we start? After all, the object could be anywhere in the kernel’s VAS. Perhaps we can gain some more information by taking a look at the the object’s constructor. For starters, doing so will allow us to find out which allocator is used to create the object; if we’re lucky, the object may be allocated from a special pool or stored in a static location.
Kernel Function 0xFFFFFFF006D34734

(OSObject’s “new” operator is a wrapper around kalloc - the XNU kernel allocator).

Looking at the code above, it appears that the PCIe object is not allocated from a special pool. Perhaps, instead, the object is addressable through data stored in the driver’s BSS or data segments? If so, then by following every “chain” of pointers originating in the above segments, we should be able to locate a chain terminating at our desired object.

To test out this hypothesis, let’s write a short python script to perform a depth-first search for the object, starting in the driver’s BSS and data segments. The script simply iterates over each 64-bit word and checks whether it appears to be a valid kernel virtual address. If so, it recursively continues the search by following the pointer and its neighbouring pointers (searching both forwards and backwards), stopping only when the maximal search depth is reached (or the object is located).
After running the DFS and following pointers up to 10 levels deep, we find no matching chain. It appears that none of the objects in the BSS or data segments contain a (sufficiently short) pointer chain leading to our target object.

So how should we proceed? Let’s take a moment to consider what we know about the object so far. First, the object is allocated using the XNU kernel allocator, kalloc. We also know the exact size of the allocation (3824 bytes). And, of course, we have a means of identifying the object once located. Perhaps we could inspect the allocator itself to locate the object...

On the one hand, it’s entirely possible that kalloc doesn’t keep track of in-use allocations. If so,  tracking down our object would be rather difficult. On the other hand, if kalloc does have a way of identifying past allocations, we can parse its data structures and follow the same logic to identify our object. To get to the bottom of this, let’s download the XNU source code corresponding to this version of iOS, and read through kalloc’s implementation.

After spending some time familiarising ourselves with kalloc’s implementation, we can sketch a high-level view of the allocator’s implementation. Since kalloc is a “zone allocator”, each allocated object is assigned a region from which it is drawn. Individual regions are represented by the zone_t structure, which holds all of the metadata pertaining to the zone.

The allocator’s operation can be roughly split into two phases: identifying the corresponding zone for each allocation, and carving the allocation from the zone. The identification process itself takes on three distinct flows, depending on the size of the requested allocation. Once the target zone is identified, the allocation process proceeds identically for all three flows.

So how are the allocations themselves performed? During zones’ lifetimes, they must keep track of the their internal metadata, including the zone’s size, the number of stored elements and many other bits and pieces. More importantly, however, the zone must track the state of the memory pages assigned to it. During the kernel’s lifetime, many objects are allocated and subsequently freed, causing the different zones’ pages to fill up or vacate. If each allocation triggered an iteration over all possible pages while searching for vacancies, kalloc would be quite inefficient. Instead, this is tackled by keeping track of several queues, each denoting the state of the memory pages assigned to the zone.

Among the queues stored in each zone are two queues of particular interest to us:
  • The “intermediate” queue - contains pages with both vacancies and allocated objects.
  • The “all used” queue -  contains pages with no vacancies (only filled with objects).

Putting it all together, we can identify allocated objects in kalloc by simply following the same mechanisms as those used by the allocator to locate the target zone. Once we find the matching zone, we’ll parse its queues to locate each allocation made within the zone, stopping only when we reach our target object.
Finally, we can package all of the above into a module in our analysis framework. The module allows us to either manually iterate over zones’ queues, or to locate objects by their virtual table (optionally accepting the allocation size to quickly locate the relevant zone).

Using our new kalloc module, we can search for the PCIe object using the VTable address we found earlier on. After doing so, we are finally greeted with a positive result -- the object is successfully located within the kernel’s VAS! Next, we’ll simply follow the same steps we identified in the memory accessors analysed earlier on, in order to extract the location of the suspected TCM mapping within the kernel.

Since the TCM mapping provides a view into the Wi-Fi chip’s RAM, we’d naturally expect it to begin with the same values as those we had identified in the RAM file extracted from the firmware. Let’s try and read out some of the values from the buffer and see whether it matches the RAM dump:
Great! So we’ve finally found the TCM. This brings us one step closer to acquiring the ROM, and to building a research environment for the Wi-Fi SoC.

Acquiring the ROM


The TCM mapping provides a view into the Wi-Fi chip’s RAM. While accessing the RAM is undoubtedly useful (as it allows us to gain visibility into the runtime structures used by the chip, such as the heap’s state), it does not allow us to directly access the chip’s ROM. So why did we go to all of this effort to begin with? Well, while thus far we have only used the mapped TCM buffer to read the Wi-Fi SoC’s RAM, recall that the same mapping also allows us to freely write to it -- any data written to the memory-mapped buffer is automatically reflected back to the Wi-Fi SoC’s RAM.

Therefore, we can leverage our newly acquired write access to the chip’s RAM in order to modify the chip’s behaviour. Perhaps most importantly, we can insert hooks into RAM-resident functions in the firmware, and direct their flow towards our own code chunks. As we’ve already built a patching infrastructure in the previous blog posts, we can incorporate the same code as a module in our analysis framework!

Doing so allows us to provide a convenient interface through which we simply select a target RAM function and provide a corresponding assembly stub, and the framework then proceeds to patch the function on our behalf, direct it into our shellcode to execute our hook (and emulate the original prologue), and finally return back to the original function. The shellcode stub itself is written into the top of the heap’s largest free chunk, allowing us to avoid overwriting any important data structures in the RAM.
Building on this technique, let’s insert a hook into a commonly invoked RAM function (such the the chip’s “ioctl” handler). Once invoked, our hook will simply copy small “windows” of the ROM into predetermined regions in RAM. Note that since the RAM is only slightly larger than the ROM, we cannot leak the entire ROM in one go, so we’ll have to resort to this iterative approach instead. Once a ROM chunk is copied, our shellcode stub signals completion, cause the host to subsequently extract the leaked ROM contents and notify the stub that the next chunk of ROM may be leaked.
Indeed, after inserting the hook and running the scheme detailed above, we are finally presented with a complete copy of the chip’s ROM. Now we can finally move on to analysing the firmware image!

To properly load the firmware into a disassembler, we’ll need to locate the ROM and RAM’s loading addresses, as well as their respective sizes. As we’ve seen in the past, the chip’s ROM is mapped at address zero and spans several KBs. The RAM, on the other hand, is normally mapped at a fixed, higher address.

There are multiple ways in which the RAM’s loading address can be deduced. First, the RAM blob analysed previously embeds its own loading address at a fixed offset. We can verify the address’s validity by attempting to load the RAM at this offset in a disassembler and observing that all the branches resolve correctly. Alternately, we can extract the loading address from the PCIe object we identified earlier in the kernel, as it contains both attributes as fields in the object.

Regardless, all of the above methods yield the same result -- the RAM is loaded at address 0x160000, and is 0xE0000 bytes long:

Building a Wi-Fi Firmware Debugger


Having extracted the ROM and achieved TCM access capabilities, we can also build a module to allow us to easily interact with the Wi-Fi chip. This module will act as a debugger of sorts for the Wi-Fi firmware, allowing us to gain full read/write capabilities to the Wi-Fi firmware, as well as providing several key debugging features.

Among the features present in our debugger are the abilities to inspect the heap’s freelist, execute assembly code chunks directly on the firmware, and even hook RAM-resident functions.

In the next blog post we’ll continue expanding the functionality provided by this module as we go along, resulting in a more complete research framework.

Wrapping Up


In this blog post we’ve performed our initial investigation into the Wi-Fi stack on Apple’s mobile devices. Using a privileged research platform to poke around the kernel, we managed to locate the Wi-Fi firmware’s TCM mapping in the host, and to extract the Wi-Fi chip’s ROM for further analysis. We also started fleshing out our research platform within the iOS kernel, allowing us to build our very own Wi-Fi firmware debugger, as well several modules for parsing the kernel’s structures -- useful tools for the next stage of our research!

In the next blog post, we’ll use our firmware debugger in order to continue our exploration of the Wi-Fi chip present on the iPhone 7. We’ll perform a deep dive into the firmware, discover multiple vulnerabilities and develop an over-the-air exploit for one of them, allowing us to gain full control over the Wi-Fi SoC.

10 comments:

  1. Very nice work, as always ;-)

    However, there is an easier way to extract the ROM from the Wi-Fi chip. As soon as, you have the RAM firmware file, you can patch it with the Nexmon Firmware Patching Framework (see https://nexmon.org) to hook the reference to the wlc_ioctl function that handles ioctls in the firmware. Hooking this function allows you to add new ioctls, for example some that can read from arbitrary memory locations, including the ROM. To send ioctls to the firmware, they have to be packed into an APPLE80211_IOC_CARD_SPECIFIC ioctl to the Wi-Fi interface as demonstrated by the monmob developers (see https://github.com/tuter/monmob/blob/master/tools/iOS/server/ioctl.py). On 64-bit devices, the apple80211req structure changed a bit so that it needs to be adjusted in tools sending this ioctl. For the Nexmon project, we created nexutil to send arbitrary ioctls to the firmware and also compiled it for iOS 10 (see https://github.com/seemoo-lab/nexmon/tree/master/ios_utilities/nexutil) using theos (see https://github.com/theos/theos). Equipped with nexutil and the extended RAM firmware, you can simply extract the ROM of your Wi-Fi chip. Currently, we only created the rom extraction project (see https://github.com/seemoo-lab/nexmon/tree/master/patches/bcm43451b1/7_63_43_0/rom_extraction) for the Wi-Fi chip of the iPhone 6, but we could also port it for the iPhone 7 if required.

    ReplyDelete
    Replies
    1. Hi Matthias,

      Thank you for the kind words! I am aware of the technique above, however, to utilise it you need privileged access to switch the firmware file (or am I missing something?). Once you have code execution on the chip (either by patching the RAM file itself or by patching RAM via the TCM), the rest is similar (hook a function and use it to extract the ROM).

      All the best,
      Gal.

      Delete
    2. Yes, you are right, privileged access is required at least to execute the /usr/libexec/wifiFirmwareLoader program to load the firmware from a given location and to send ioctls to the firmware. However, as long as a jailbreak exists for a certain iOS version, the Nexmon approach should always work to modify the firmware, reload it and extract the ROM. Nevertheless, the downside of this approach is, that addresses of structs may change when we allocate memory on the heap.

      Delete
  2. This has been an amazing read! Congratulations!

    I'll admit that I am a massive Apple fan and my knee-jerk reaction was negative, as to almost judge your intent, but after reading the article, I'm extremely happy you've made all this effort! Not only that, you've maintained a neutral bias throughout.

    Thank you! :)

    ReplyDelete
  3. This comment has been removed by the author.

    ReplyDelete
  4. dear gal your research is amazing but what more amazing is that your sharing it with us , so thank you much man

    ReplyDelete
  5. This comment has been removed by the author.

    ReplyDelete
  6. This comment has been removed by the author.

    ReplyDelete