Posted by Natalie Silvanovich, Project Zero
This is a three-part series on exploiting messenger applications using vulnerabilities in WebRTC. This series highlights what can go wrong when applications don't apply WebRTC patches and when the communication and notification of security issues breaks down. Part 3 is scheduled for August 6.
Part 2: A Better Bug
In Part 1, I explored whether it was possible to exploit WebRTC using two memory corruption bugs in RTP processing. While I succeeded at moving the instruction pointer, I was not able to break ASLR, so I decided to look for vulnerabilities more suitable for this purpose.
I started off by going through WebRTC bugs I had filed in the past to see if any had the potential to break ASLR. Even if a bug was fixed long ago, it is an indicator of where similar bugs could potentially be found. One such bug was CVE-2020-6831, which is an out-of-bounds read in usrsctp.
usrsctp is an implementation of Stream Control Transmission Protocol (SCTP) used by WebRTC. Applications that use WebRTC can open data channels, which allow text or binary data to be transmitted from peer to peer. Data channels are often used to allow text messages to be exchanged during a video call, or to tell a peer when certain events have occurred, such as another peer disabling its camera. SCTP is the protocol that underlies data channels. In WebRTC, SCTP is analogous to RTP in that where RTP is used for audio and video content, SCTP is used for data.
I spent some time reviewing the usrsctp code for vulnerabilities. I eventually found CVE-2020-6831, which is a stack buffer overflow in usrsctp. This bug gives the attacker complete control of the size and contents of the overflow. Samuel Groß suggested that this bug could be used to break ASLR by overwriting the stack cookie, and then the return address one byte at a time, and detecting whether the value is correct based on whether the application crashes. Unfortunately, it turned out that this vulnerability is not reachable through WebRTC, as it requires a client socket to connect to a listening socket, meanwhile in WebRTC, both sockets are client sockets.
I kept looking and eventually found CVE-2020-6514. This is a rather unusual bug in how WebRTC interacts with usrsctp. usrsctp supports custom transports, in which case the integrator needs to provide the source and destination address for each connection as a pair of void pointers. The non-dereferenced value of these pointers is then used as an address by usrsctp, which means the value is included in some packets. In WebRTC, the address pointers are set to the address of the SctpTransport instance used by WebRTC. The result is that the location of this object in memory is sent to the remote peer during every SCTP connection. This is technically a bug in WebRTC, though the design of usrsctp is also flawed because using the type void* for custom addresses strongly encourages integrators to use pointers for this value even though this is insecure.
I was hoping this bug would be enough to break ASLR, but it turned out not to be. For an exploit, I needed the location of a loaded library, as well as the location of the heap, so I ran a series of tests on an Android device to see if there was any correlation between these locations, but there was not any. The location of a heap pointer was not enough to determine the location of a loaded library.
I kept looking, and I noticed a vulnerability in how usrsctp processes ASCONF chunks, which are used to manage dynamic IP addresses. The source for the bug is as follows.
Notice that the second call to sctp_m_freem is missing a return, so the m_ack variable can be used after it is freed. After finding this bug, I noticed that it had been patched in more recent versions of usrsctp and WebRTC. I later learned that it was reported by another Googler, Mark Wodrich as Bug 376 in usrsctp on September 19, 2019.
Revealing Memory with Bug 376
Two important questions in analyzing a use-after-free bug is what is freed, and how is it used. In Bug 376, the freed object is an mbuf structure, a type which is used to store the contents of inbound and outbound packets. The mbuf structure starts with a substructure, m_hdr, which is defined as follows.
Now, how is this structure used? Looking through the rest of the ASCONF handling, it is eventually added to an outbound packet queue to acknowledge the packet that was sent.
This made it very likely that this bug could be used to reveal memory of a remote peer if the freed m_buf structure was replaced with a structure with a pointer to memory continuing pointers, for example, the SctpTransport pointer revealed by CVE-2020-6514.
I tried to do this by sending RTP packets of the same size as the m_buf structure. There’s a nice trick for making a lot of allocations of a specific size that don’t get freed in WebRTC. Video packets get stored in a list before they are assembled into frames, so if the end of a frame is never sent, they will get stored forever, so long as a maximum number of packets is never hit. Unfortunately, this led to an unexpected problem. OpenSSL, which is used by WebRTC happened to have some heap allocations of the same size as an m_buf structure, and if they happened to be allocated in the place of the freed m_buf structure, they would get written to in the m_buf send process, which for some reason would lead to an irrecoverable state in OpenSSL. The application didn’t crash, it would just get stuck in some sort of loop and refuse to accept any more connections.
So I decided it would be better to allocate the memory replacing the m_buf structure in usrsctp. SCTP allows packets containing any number of chunks to be sent to a host, and in most cases they are processed as if they were a sequence of packets. Even better, the outbound packet queue that the freed m_buf structure is added to does not send any packets until all chunks in the current packet have been processed. This means that it should be possible to send a packet that contains a chunk that triggers the bug, and then a chunk that sets the freed memory to the needed values before it is sent back to the attacker. Since no network traffic needs to occur between when the m_buf structure is freed and when its memory is safely reallocated, this avoids the problem with OpenSSL.
Unfortunately, there are very few calls to malloc in usrsctp with sizes that are controllable by incoming traffic, and none of them allow the entire packet contents to be specified. The best I could find was in the processing of a data stream reset chunk. The code is as follows, with some parts removed for clarity.
This code allocates the liste structure, which can be used to replace the freed mbuf structure. It has one really lucky feature, which is that the next_resp property, which lines up with the mh_next property of the mbuf structure happens to be of the correct type, also mbuf. This would cause problems if it were another type, as usrsctp iterates through the entire mbuf chain before sending a packet.
A less lucky feature is that the properties that line up with the mh_data property of mbuf structure happen to be the current reset sequence number, and the transmission sequence number (TSN). These both are subject to a number of checks in this method. The reset sequence number needs to be exactly equal to the sequence number set when the connection was initialized, either in an INIT or COOKIE_ECHO chunk, and also needs to be equal to the lower four bytes of the SctpTransport pointer. This check can be passed by sending a COOKIE_ECHO chunk that sets the reset sequence number to the needed value before triggering the bug.
More challenging is the check that is performed on the TSN. It is compared to the cumulative TSN, which is originally set to the same value as the reset sequence number. The actual comparison performed is a ‘sequence number greater than’, which determines whether one value is ahead of or behind another value, assuming sequence numbers that roll over to zero when all bits are set. For example, if the current sequence number is 0xFFFFFFFF, the value 2 would pass a ‘sequence number greater than’ check, but the values 0xFFFFFFFE and 0x80000001 would fail. The TSN read out of the incoming packet has to be the top four bytes of the SctpTransport pointer, meanwhile the cumulative TSN has to be the bottom four bytes of this pointer because it is the same value as the reset sequence number. So this is actually a comparison between the two halves of the pointer. The TSN is a small number, less than 0x80 because it is the top of a pointer, so this comparison will return true roughly whenever bit 31 of the pointer is not set, and return the desired outcome of false roughly whenever it is set.
Bit 31 of the pointer is determined randomly by ASLR as well as where the SctpTransport instance is allocated on the heap, which means it is set about 50% of the time. Normally, I would be okay with an exploit being 50% effective, because that means it would probably succeed with a few tries, but in this case, that’s not true because it will have the tendency to fail again and again on the same ASLR layout. ASLR layout is determined when an Android device is started up, and doesn’t change again until it is rebooted. So I needed a way to change the cumulative TSN after the reset sequence number has been set.
It turns out that this is possible using the FWD_TSN chunk type, which allows a peer to request that another peer move its cumulative TSN up to 4096 bytes forward. It’s possible to move the cumulative TSN forward enough that bit 31 flips by sending this chunk type repeatedly. This takes quite a few chunks, but combining chunks into fewer packets and sending them as fast as possible, it can be flipped in a few seconds.
Putting this all together, the bug can be used to make the target device send back the memory of the SctpTransport instance, which contains a pointer to the class’s vtable, finally giving the location of the WebRTC library and breaking ASLR.
Thinking about it a bit, I didn’t think the WebRTC library would be the best library to use for my exploit, as it’s not unusual for WebRTC integrators to statically link it with other libraries and use all sorts of toolchains. It would be easier to know the location of libc, which comes from the Android system and has less variation. So I added a second usage of this bug that reads the location of malloc from the global offset table, which is a fixed offset from the SctpTransport vtable that has already been read. This allows the location of libc to be calculated.
Moving the Instruction Pointer (Again)
In Part 1, I figured out how to use an RTP memory corruption bug to move the instruction pointer, but after I filed CVE-2020-6514, Jann Horn suggested that it might be possible to use this bug to move the instruction pointer as well. When WebRTC uses the SctpTransport pointer as an address, it doesn’t just use it to identify the connection, but it actually casts the pointer to class SctpTransport, and makes virtual calls on it when sending outbound packets received from usrsctp.
Meanwhile, usrsctp usually determines the address for outbound packets based on identifiers in the packet, but there is one situation where it extracts the address from the packet itself: when processing COOKIE_ECHO chunks. Normally, it wouldn’t be possible to put an untrusted pointer in this chunk type, as are usually echoed from an incoming packet and need to be signed. However, Jann noticed that the random number generation for the signing key is very weak. The following code gets called when usrsctp is initialized.
The random number generator is then seeded by calling rand.
The INIT chunk sent when starting an SCTP connection contains a randomly generated key used for authentication, generated by the same random number generator used for the secret key. I wrote a script that determines the value of the remote PID based on this key, by calling srand on every number between 0 and 70 000, and seeing which one causes the random number generator to produce the same authentication key. It is then possible to infer the value of the secret key.
This key now allows the attacking device to send COOKIE_ECHO chunks with any contents, including changing the address to a custom pointer. This allows the instruction pointer to be moved, as a virtual call will be made on whatever address is provided the next time an outbound packet is sent, which happens immediately when the peer responds with a COOKIE_ACK. In the above section, I also discussed using COOKIE_ECHO packets to change the reset sequence number, while glossing over how I was actually sending them. It was using this same method.
I now had two possible methods for setting the instruction pointer in the exploit. I chose to move forward with this one, as it uses usrsctp, which is also necessary to break ASLR, meanwhile the RTP one uses a different feature. I felt that reducing the number of features needing to be enabled for this exploit to work would increase the number of applications it worked on, as sometimes applications disable specific WebRTC features.
Putting it All Together
Having all the necessary capabilities for an exploit, I then needed to put them all together. My general strategy was to make a fake object on the heap at a known location, and then make a virtual call on that object. The fake object would have a fake vtable in the same buffer that would point to system, which would run a shell command.
One missing piece is how to populate heap memory at a known location. One possibility was to use RTP to allocate memory of the same size as the SctpTransport object, hoping it gets allocated at the address directly after the object, or at a predictable location. I tried this, and it worked maybe 50% of the time, but considering I had a way to read memory, I thought I could do better.
I noticed that the SctpTransport class contains a CopyOnWriteBuffer object named partial_incoming_message_ that is sometimes used to store incoming SCTP data. SCTP supports data fragmentation, and usrsctp passes incomplete fragmented packets to WebRTC if they get above a certain size. These are stored in the partial_incoming_message_ object until the rest of the packet is received. So I thought if I sent the data for the fake object over SCTP to the target device, it would eventually populate this buffer, and I could read the address. (Note that this actually requires two reads, as there are two levels of indirection between a CopyOnWriteBuffer object and its backing data.)
I tried this, and it worked, but there was another problem. In order to create a fake object with a fake vtable, the fake object needed to reference itself, but this method only allowed me to know the location of the memory after it had been written to and couldn’t be changed. I looked a bit closer at how this functionality works. The code for setting the buffer is as follows.
What’s happening here is that incoming data is always immediately appended to the partial_incoming_message_ buffer, and then if it is an incomplete fragment, the function returns. Otherwise, it queues a thread to process the data, and then clears the buffer.
I started to wonder how clearing works, considering the data is still needed by the queued thread that might not be finished yet. It turns out that the CopyOnWriteBuffer class retains references to the data, and only deletes it if there are zero references left. Otherwise, it decrements the reference count and allocated new data of the current size for the buffer. This means it is possible to read the location of the _incoming_message_ buffer before data is written to it, because it is actually allocated during the clear. So long as the data written by AppendData is shorter or the same size as the largest size ever cleared, this memory will not be reallocated.
This allowed me to create a heap buffer at a known location and populate it. The last step was to figure out what to populate it with. I started out by filling it up with sequential numbers, and then using the address it crashed on to figure out what memory to change. After using the crash locations to create the fake vtable, I ended up with a crash on a branch to X8, and the only other controllable register was X21. X0 was of course set to the location of the fake vtable, as this crash was due to a virtual call, as were X1 and X23.
Astoundingly, libc had the perfect gadget for this situation.
Setting the value loaded in X23 to system, and copying a string parameter at an offset of 0x30 past the fake vtable caused system to be called with the parameter!
To give a quick overview, here are the steps required for the exploit, in order:
- The PID is determined based on the key in the INIT chunk, and then the secret key is determined
- The vtable is read from the SctpTransport object
- The location of malloc is read from the global offset table
- The partial_incoming_message_ buffer is populated with data of the needed size
- The partial_incoming_message_ buffer is cleared, so a new buffer is allocated
- The address of the partial_incoming_message_ buffer is read from the SctpTransport object
- The address of the partial_incoming_message_ backing buffer is read from the buffer structure
- The partial_incoming_message_ buffer is populated with exploit data, based on the location of of malloc
- The bug is triggered, making a virtual call to a gadget and then system
Now I had an exploit that worked in … the WebRTC sample Android application. Stay tuned for Part 3, where I explore what real Android applications the exploit works on.