Thursday, October 3, 2024

Effective Fuzzing: A Dav1d Case Study

Guest post by Nick Galloway, Senior Security Engineer, 20% time on Project Zero



Late in 2023, while working on a 20% project with Project Zero, I found an integer overflow in the dav1d AV1 video decoder. That integer overflow leads to an out-of-bounds write to memory. Dav1d 1.4.0 patched this, and it was assigned CVE-2024-1580. After the disclosure, I received some questions about how this issue was discovered, since dav1d is already being fuzzed by at least oss-fuzz. This blog post explains what happened. It’s a useful case study in how to construct fuzzers to exercise as much code as possible. But first, some background...

Background

Dav1d

Dav1d is a highly-optimized AV1 decoder. AV1 is a royalty-free video coding format developed by the Alliance for Open Media, and achieves improved data compression compared to older formats. AV1 is widely supported by web browsers, and a significant parsing vulnerability in AV1 decoders could be used as part of an attack to gain remote code execution. In the right context, where AV1 is parsed in a received message, this could allow a 0-click exploit. Testing some popular messaging clients by sending AV1 videos and AVIF images (which use the AV1 codec) yielded the following results:

  • AVIF images are displayed in iMessage
  • AVIF images are NOT displayed in Android Messages when sent as an MMS
  • AVIF images are displayed in Google Chat
  • AV1 videos are not immediately displayed in Google Chat, but can be downloaded by the receiver and eventually can be played after being downscaled

Dav1d is written primarily in C and notably has different code paths for different architectures. There are x86, x86-64, ppc, riscv, arm32, and arm64 code paths in the repository, most of these containing optimized assembly. As noted in their roadmap, support for some of these is ongoing work, but at least ARMv7, ARMv8, and x86-64 have been thoroughly tested in the field. Because this is a library written in C and assembly, and because dav1d is ubiquitous in web browsers, I would expect it to already have excellent fuzzing coverage from multiple sources.

The integer overflow

The full details, including two proofs of concept that can be used to reproduce the vulnerability, are available from the Project Zero bug tracker. The short explanation is that when multiple decoding threads are used, a signed 32-bit integer overflow can occur when calculating the values to put in the tile start offset array. In the excerpt below, the addition overflows:

f->frame_thread.tile_start_off[tile_idx++] = row_off + b_diff *
  f->frame_hdr->tiling.col_start_sb[tile_col] * f->sb_step * 4;
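To make the overflow concrete, here is a minimal sketch (not dav1d code) that evaluates the same expression in 64 bits and reports whether the result still fits in a signed 32-bit int. The helper name is hypothetical, and the operands are assumed to be frame-dimension sized so that the 64-bit intermediate cannot itself overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes row_off + b_diff * col_start_sb * sb_step * 4
 * in 64 bits. Returns true and stores the result in *out if it fits in an
 * int, and returns false if the 32-bit calculation would have overflowed.
 * Assumes operands small enough that the int64_t product cannot overflow. */
bool tile_start_off_checked(int row_off, int b_diff, int col_start_sb,
                            int sb_step, int *out)
{
    assert(out != NULL);
    const int64_t wide =
        (int64_t)row_off + (int64_t)b_diff * col_start_sb * sb_step * 4;
    if (wide < INT_MIN || wide > INT_MAX)
        return false;
    *out = (int)wide;
    return true;
}
```

With illustrative (made-up) inputs, `tile_start_off_checked(0, 16, 10, 32, &off)` succeeds and stores 20480, while `tile_start_off_checked(2000000000, 4096, 512, 32, &off)` returns false because the sum exceeds INT_MAX.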

These overflowed values in tile_start_off are then passed to setup_tile():

setup_tile(&f->ts[j], f, data, tile_sz, tile_row, tile_col++,
           c->n_fc > 1 ? f->frame_thread.tile_start_off[j] : 0);

The tile_start_off parameter to setup_tile() comes from f->frame_thread.tile_start_off[j] above, and is used to calculate values for several pointers. (Note that pal_idx, cbi, and cf are pointers in the frame_thread struct, as can be seen in internal.h.)

static void setup_tile(Dav1dTileState *const ts,
                       const Dav1dFrameContext *const f,
                       const uint8_t *const data, const size_t sz,
                       const int tile_row, const int tile_col,
                       const int tile_start_off)
...
        ts->frame_thread[p].pal_idx = f->frame_thread.pal_idx ?
            &f->frame_thread.pal_idx[(size_t)tile_start_off * size_mul[1] / 8] :
            NULL;
        ts->frame_thread[p].cbi = f->frame_thread.cbi ?
            &f->frame_thread.cbi[(size_t)tile_start_off * size_mul[0] / 64] :
            NULL;
        ts->frame_thread[p].cf = f->frame_thread.cf ?
            (uint8_t*)f->frame_thread.cf +
                (((size_t)tile_start_off * size_mul[0]) >> !f->seq_hdr->hbd) :
            NULL;

Those pointers are later written to, resulting in an out-of-bounds write to memory. Two test cases are provided with the bug. The first (poc1.obu) results in an address that is outside the valid range of addresses, and so might not be exploitable. The second (poc2.obu) enables high bit depth mode and so has higher memory requirements, but results in pointers that are within the normal range of addresses, and so is more likely to be useful in an exploit.
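The reason a wrapped offset turns into a wild pointer is the (size_t) cast in the index expressions above: a negative int converts to a very large unsigned value before it is scaled. The following is a minimal sketch of that arithmetic only. The function name and the mul/div parameters are hypothetical stand-ins for size_mul[] and the constant divisors, not dav1d code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the index arithmetic in setup_tile():
 * converts the (possibly negative) offset to size_t, then scales it.
 * A negative offset becomes a huge unsigned value after the conversion. */
size_t scaled_index(int tile_start_off, size_t mul, size_t div)
{
    assert(div != 0);
    return (size_t)tile_start_off * mul / div;
}
```

For a well-formed offset, `scaled_index(128, 1, 64)` is 2. For a wrapped offset such as -64, the same call yields `((size_t)-1 - 63) / 64`, which is 2^58 - 1 on a platform with a 64-bit size_t, far beyond any real allocation.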

Fuzzing Space Definition

A fuzzer’s success is typically measured by “coverage”, where the fuzz target's execution is traced to examine which lines of assembly code have been covered. When I talk about the “fuzzing space”, I specifically mean space in the sense of a mathematical space, where the set of lines of code that are executed by a given set of test cases is something we would like to maximize. In other words, a good fuzzer will execute as many lines of code as possible with the smallest possible set of test cases. To fully define the space we would also consider the fuzzing engine that generates test cases, the initial seed corpus, and the various configurations and architectures supported by the code to be fuzzed.
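As a toy illustration of this definition (my own simplification, not how libFuzzer tracks coverage), each test case can be modeled as a bitmask of the lines it executes. The coverage of a corpus is then the number of bits set in the union of those masks. The function names are hypothetical, and the model is limited to 64 lines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the set bits in x by clearing the lowest set bit each iteration. */
int count_bits(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Toy model: bit i of per_case[k] is set if test case k executes line i.
 * Returns how many distinct lines the whole corpus executes. */
int corpus_coverage(const uint64_t *per_case, size_t n_cases)
{
    assert(per_case != NULL || n_cases == 0);
    uint64_t all = 0;
    for (size_t i = 0; i < n_cases; i++)
        all |= per_case[i];
    return count_bits(all);
}
```

With masks {0x3, 0x6, 0x6}, the first two cases already cover three lines and the third adds nothing, so a smaller corpus reaches the same point in the space.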

Modified Dav1d Fuzzer

The dav1d fuzzer in oss-fuzz at the time I was looking at dav1d is visible on GitHub. This contains build instructions and a dockerfile for oss-fuzz to run this at scale. The fuzzer implementation is in the dav1d source repository. The meson.build file shows a couple of configurations, one for building dav1d_fuzzer and the other for building dav1d_fuzzer_mt, which additionally defines DAV1D_MT_FUZZING.

The fuzzing code is written in C, found in dav1d_fuzzer.c. The fuzzer implements LLVMFuzzerTestOneInput, the standard way to use libfuzzer. The first thing the fuzzer does is the usual variable declarations in any C function, including instantiating a Dav1dSettings struct to all zeroes. A bit later, the fuzzer uses a function to initialize the settings struct with defaults:

    dav1d_default_settings(&settings);
#ifdef DAV1D_MT_FUZZING
    settings.max_frame_delay = settings.n_threads = 4;
#elif defined(DAV1D_ALLOC_FAIL)
    settings.max_frame_delay = max_frame_delay;
    settings.n_threads = n_threads;
    dav1d_setup_alloc_fail(seed, probability);
#else
    settings.max_frame_delay = settings.n_threads = 1;
#endif
#if defined(DAV1D_FUZZ_MAX_SIZE)
    settings.frame_size_limit = DAV1D_FUZZ_MAX_SIZE;
#endif

It’s good that the fuzzer will create one or four threads, depending on the fuzzer configuration, but if vulnerabilities exist only when there are three threads, or 19 threads, these will not be detected by any use of this fuzzer. That said, since the code paths for the threaded option mostly seem to differ based only on whether the number of threads is 1 or some other number, that seems unlikely.

There are some other configuration items that users of the dav1d library might configure differently. As one example, output_invisible_frames is always zero in this fuzzer. If a vulnerability existed only when this was nonzero, the fuzzer would not catch it in either the single-threaded or the multithreaded fuzzing configuration.

Another example of untested coverage in the dav1d fuzzer, and the most interesting one for me because it led to the discovery of the integer overflow vulnerability, is the usage of DAV1D_FUZZ_MAX_SIZE.

#define DAV1D_FUZZ_MAX_SIZE 4096 * 4096
#if defined(DAV1D_FUZZ_MAX_SIZE)
    settings.frame_size_limit = DAV1D_FUZZ_MAX_SIZE;
#endif

This maximum frame size did not exist in all configurations used by dav1d users, and although 32-bit platforms have an internally applied limit, there is no limit by default for 64-bit platforms. Removing this line (and using the multithreaded fuzzer with ubsan enabled) was enough to trigger the integer overflow. To allow the fuzzer to explore more of the configuration space, I added support for the fuzzer to also fuzz many of the configuration settings that can be passed to dav1d. The code for this "configuration fuzzing" is shown in the excerpt below. It contains a few range restrictions only to avoid triggering asserts. These restrictions might be worth revisiting in the future, in case any of these asserts are compiled out in production systems where an attacker can influence the configuration.

struct SettingsFuzz {
  int n_threads;
  int max_frame_delay;
  int apply_grain;
  int operating_point;
  int all_layers;
  unsigned frame_size_limit;
  int strict_std_compliance;
  int output_invisible_frames;
  int inloop_filters;
  int decode_frame_type;
};

Dav1dSettings newSettings(struct SettingsFuzz sf) {
  Dav1dSettings settings = {0};
  dav1d_default_settings(&settings);
  // Some of these trigger an assert if they're out of range
  if (sf.n_threads < 0 || sf.n_threads > DAV1D_MAX_THREADS) {
    sf.n_threads = DAV1D_MAX_THREADS / 2;
  }
  settings.n_threads = sf.n_threads;
  if (sf.max_frame_delay < 0 || sf.max_frame_delay > DAV1D_MAX_FRAME_DELAY) {
    sf.max_frame_delay = DAV1D_MAX_FRAME_DELAY;
  }
  settings.max_frame_delay = sf.max_frame_delay;
  settings.apply_grain = sf.apply_grain;
  if (sf.operating_point < 0 || sf.operating_point > 31) {
    sf.operating_point = 0;
  }
  settings.operating_point = sf.operating_point;
  settings.all_layers = sf.all_layers;
  settings.frame_size_limit = sf.frame_size_limit;
  settings.strict_std_compliance = sf.strict_std_compliance;
  settings.output_invisible_frames = sf.output_invisible_frames;
  settings.inloop_filters = (enum Dav1dInloopFilterType)sf.inloop_filters;
  if (sf.decode_frame_type < 0 ||
      sf.decode_frame_type > (int)DAV1D_DECODEFRAMETYPE_KEY) {
    sf.decode_frame_type = 0;
  }
  settings.decode_frame_type = (enum Dav1dDecodeFrameType)sf.decode_frame_type;
  return settings;
}
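The excerpt above does not show how the settings struct gets filled from the fuzzer input. One common approach, sketched here as an assumption rather than as the code I actually used, is to treat the first bytes of each test case as the settings and pass the remainder to the decoder. The struct is a hypothetical three-field subset of SettingsFuzz, and split_fuzz_input is a made-up name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of the SettingsFuzz fields, for illustration only. */
struct SettingsHeader {
    int n_threads;
    int max_frame_delay;
    unsigned frame_size_limit;
};

/* Copies the leading bytes of the fuzz input into *hdr and points *payload
 * at the remaining bytes. Returns 1 on success, or 0 if the input is too
 * short to contain a full header. */
int split_fuzz_input(const uint8_t *data, size_t size,
                     struct SettingsHeader *hdr,
                     const uint8_t **payload, size_t *payload_size)
{
    assert(hdr != NULL && payload != NULL && payload_size != NULL);
    if (size < sizeof(*hdr))
        return 0;
    /* memcpy avoids alignment problems with arbitrary input buffers. */
    memcpy(hdr, data, sizeof(*hdr));
    *payload = data + sizeof(*hdr);
    *payload_size = size - sizeof(*hdr);
    return 1;
}
```

The raw header values would still need the same range checks as in newSettings() before being handed to the library. A side effect of this layout is that existing seed corpus files no longer parse as plain bitstreams, so seeds would need a header prepended.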

Limits like these are better enforced in the code itself than in the fuzzer. Taking the maximum frame size as an example, on 32-bit systems dav1d will restrict frame sizes to 8192*8192, regardless of the frame_size_limit configuration (see the code excerpt below). On 64-bit systems there is no such limit, and so very large frames, such as 50,000x50,000, are possible.

    /* On 32-bit systems extremely large frame sizes can cause overflows in
     * dav1d_decode_frame() malloc size calculations. Prevent that from occuring
     * by enforcing a maximum frame size limit, chosen to roughly correspond to
     * the largest size possible to decode without exhausting virtual memory. */
    if (sizeof(size_t) < 8 && s->frame_size_limit - 1 >= 8192 * 8192) {
        c->frame_size_limit = 8192 * 8192;
        if (s->frame_size_limit)
            dav1d_log(c, "Frame size limit reduced from %u to %u.\n",
                      s->frame_size_limit, c->frame_size_limit);
    }
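The condition in that excerpt relies on unsigned wraparound: subtracting 1 from a limit of 0 (meaning "no limit") wraps to the maximum unsigned value, so a single comparison catches both "unlimited" and "larger than 8192*8192". Here is a minimal standalone sketch of the same logic. The function name is hypothetical, and the size_t width is passed as a parameter so that both cases can be exercised on one machine.

```c
#include <assert.h>
#include <stddef.h>

#define LIMIT_32BIT (8192u * 8192u)

/* Hypothetical restatement of the clamp: returns the limit that would be
 * in effect for a requested limit on a platform whose size_t is
 * size_t_bytes wide. A requested value of 0 means "no limit". */
unsigned effective_frame_size_limit(unsigned requested, size_t size_t_bytes)
{
    assert(size_t_bytes == 4 || size_t_bytes == 8);
    /* requested - 1 wraps to UINT_MAX when requested == 0, so this one
     * comparison is true for "unlimited" and for any too-large limit. */
    if (size_t_bytes < 8 && requested - 1 >= LIMIT_32BIT)
        return LIMIT_32BIT;
    return requested;
}
```

On a 32-bit platform, a requested limit of 0 becomes 8192*8192 and a smaller limit such as 4096*4096 passes through unchanged. On a 64-bit platform every value, including 0, is left as requested.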

There’s an understandable desire to avoid reporting errors when a fuzzer triggers huge allocations. When possible, this part of the fuzzing space could be explored with a shard configured to run with a much larger amount of memory than is usually available.
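For a rough sense of scale, the sketch below estimates the bytes needed to hold a single decoded frame, assuming 4:2:0 chroma subsampling and tightly packed planes. This is a simplification of my own: the function name is hypothetical, and a real decoder also needs reference frames, padding, and per-thread buffers, so the result is only a lower bound.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical estimate of the storage for one decoded 4:2:0 frame:
 * one full-resolution luma plane plus two half-resolution chroma planes.
 * bytes_per_sample is 1 for 8-bit video and 2 for high bit depth. */
uint64_t frame_bytes_420(uint32_t width, uint32_t height, int bytes_per_sample)
{
    assert(bytes_per_sample == 1 || bytes_per_sample == 2);
    const uint64_t luma = (uint64_t)width * height;
    const uint64_t chroma =
        2 * (((uint64_t)width + 1) / 2) * (((uint64_t)height + 1) / 2);
    return (luma + chroma) * (uint64_t)bytes_per_sample;
}
```

Under these assumptions a 4096x4096 8-bit frame takes 25,165,824 bytes (24 MiB), while a 50,000x50,000 8-bit frame takes 3,750,000,000 bytes, which on its own already exceeds the 2.5GB available on oss-fuzz mentioned in the conclusion.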

I also tested a number of other avenues that did not lead to discovering vulnerabilities. One example is fuzzing on ARM, which I had expected might result in a vulnerability due to it not being covered by OSS-Fuzz. Despite this not uncovering anything, I still believe that it’s worthwhile to run fuzz tests on other architectures when possible, especially when the target has different code paths and optimized assembly for different architectures, as is the case with dav1d.

Conclusion

The ultimate lesson I took away from this is that artificial limits within a fuzzer are a fruitful area to look for vulnerabilities. By setting a relatively small frame_size_limit, the dav1d fuzzer missed the integer overflow. There is a good reason for this limit, which is that oss-fuzz only supports 2.5GB of RAM. This highlights a tradeoff for fuzzers. By limiting the amount of RAM, we can hope to increase coverage overall by fitting more fuzzers within the machines we have. Unfortunately, this means limited coverage for the part of the fuzzing space that requires more memory.

Until memory safe parsers are available and widely used, memory corruption issues will continue to present a serious threat to users. For now, perhaps we can create fuzzers that are configured to occasionally explore parts of the fuzzing space that require more memory.

P.S.: OSS-Fuzz Bughunters Reward Program

Finally, I would like to mention that as of writing there is a Google bughunters reward program for improving fuzzing coverage in critical OSS projects. See the bughunters site for more details.