Pages

Thursday, June 27, 2024

The Windows Registry Adventure #3: Learning resources

Posted by Mateusz Jurczyk, Google Project Zero

When tackling a new vulnerability research target, especially a closed-source one, I prioritize gathering as much information about it as possible. This gets especially interesting when it's a subsystem as old and fundamental as the Windows registry. In that case, tidbits of valuable data can lurk in forgotten documentation, out-of-print books, and dusty open-source code – each potentially offering a critical piece of the puzzle. Uncovering them takes some effort, but the payoff is often immense. Scraps of information can contain hints as to how certain parts of the software are implemented, as well as why – what were the design decisions that lead to certain outcomes etc. When seeing the big picture, it becomes much easier to reason about the software, understand the intentions of the original developers, and think of the possible corner cases. At other times, it simply speeds up the process of reverse engineering and saves the time spent on deducing certain parts of the logic, if someone else had already put in the time and effort.

One great explanation for how to go beyond the binary and utilize all available sources of information was presented by Alex Ionescu in the keynote of OffensiveCon 2019 titled "Reversing Without Reversing". My registry security audit did involve a lot of hands-on reverse engineering too, but it was heavily supplemented with information not coming directly from ntoskrnl.exe. And while Alex's talk discussed researching Windows as a whole, this blog post provides a concrete case study of how to apply these ideas in practice. The second goal of the post is to consolidate all collected materials into a single, comprehensive summary that can be easily accessed by future researchers on this topic. The full list may seem overwhelming as it includes some references to overlapping information, so the ones I find key have been marked with the 🔑 symbol. I highly recommend reviewing these resources, as they provide context that will be helpful for understanding future posts.

Microsoft Learn

Official documentation is probably the first and most intuitive thing to study when dealing with a new API. For Microsoft, this means the Microsoft Learn (formerly MSDN Library), a vast body of technical information maintained for the benefit of Windows software developers. It is wholly available online, and includes the following sections and articles devoted to the registry:

  • 🔑 Registry  – the main page about all registry-related subjects. It contains a wealth of knowledge, and is a must read for anyone deeply interested in this system mechanism. It is divided into three sections:
  • Windows registry information for advanced users – a separate article that discusses the principles of the registry. It appears to be somewhat outdated (the latest version mentioned is Windows Vista), and based on an old KB256986 article that can be traced back to at least 2004.
  • Inside the Windows NT Registry and Inside the Registry – two articles published by Mark Russinovich in the Windows NT Magazine in 1997 and 1999, respectively.
  • Windows 2000 Registry Reference – a web mirror of regentry.chm, an official help file bundled with the Windows 2000 Resource Kit. It includes a brief introduction to the registry followed by detailed descriptions of the standard registry content, i.e. keys and values used for advanced configuration of the system and applications.
  • Using the Registry Functions to Consume Counter Data – information about collecting performance data through the registry pseudo-keys: HKEY_PERFORMANCE_DATA, HKEY_PERFORMANCE_TEXT and HKEY_PERFORMANCE_NLSTEXT.
  • Offline Registry Library – complete documentation of the built-in Windows offreg.dll library, which can be used to inspect / operate on registry hives without loading them in the operating system.
  • Registry system call documentation, e.g. ZwCreateKey – a reference guide to the kernel-mode support of the registry, which reveals numerous details about how it works internally and how the high-level API functions are implemented under the hood.
  • Filtering Registry Calls – a set of eight articles detailing how to correctly implement registry callbacks as a kernel driver developer.

Blogs and online resources

Due to the fact that the registry stores a substantial amount of traces of user activity, it is a popular source of information in forensic investigations. As a result, a number of articles and blog posts have been published throughout the years, focusing on the internal hive structure, registry-related kernel objects, and recovering deleted data. Below is the list of non-official registry resources I have managed to find online, from earliest to latest:

  • WinReg.txt, author unknown (signed as B.D.) – documentation of the hive binary formats in Windows 3.x (SHCC3.10), Windows 95 (CREG) and Windows NT (regf) based on reverse engineering. It was likely the first public write-up outlining the undocumented structure of the hives.
  • Security Accounts Manager, author unknown (signed as clark@hushmail.com) – a comprehensive article primarily focused on the user management internals in Windows 2000 and XP, dissecting a number of binary structures used by the SAM component. Since user and credential management is highly tied to the registry (all of the authentication data is stored there), the article also includes a "Registry Structure" section that explains the encoding of regf hive files.
  • 🔑 Windows registry file format specification, Maxim Suhanov – a high-quality and relatively up-to-date specification of the regf format versions 1.3 to 1.6, with extra bits of information regarding the ancient versions 1.1 and 1.2.
  • Windows NT Registry File (REGF) format specification, Joachim Metz – another independently developed specification of the regf format associated with the libregf library.
  • Push the Red Button, Brendan Dolan-Gavitt (moyix) – a personal blog focused on security, reverse engineering and forensics. It contains a number of interesting registry-related posts dating back to 2007-2009.
  • Windows Incident Response, Harlan Carvey – a technical blog dedicated to incident response and digital analysis of Windows, with a variety of posts dealing with the registry published between 2006-2022.
  • My DFIR Blog, Maxim Suhanov – another blog concentrating on digital forensics with many mentions of the Windows registry. It provides some original information that's hard to find elsewhere, see e.g. Containerized registry hives in Windows.
  • Digging Up the Past: Windows Registry Forensics Revisited, David Via – a blog post by Mandiant discussing the recovery of data from registry hives and transactional logs.
  • Creating Registry Links and Mysteries of the Registry, Pavel Yosifovich – two blog posts covering the creation of symbolic links in the registry, and its overall internal structure.
  • Windows Registry, Wikipedia contributors – as usual, Wikipedia doesn't disappoint, and even though the article includes few deeply technical details, it features extensive sections on the history of the registry, its high level design and role in the system.

Furthermore, The Old New Thing is a fantastic, technical blog exploring the quirks, design decisions, and historical context behind Windows features. It is written by a Microsoft employee of over 30 years, Raymond Chen, with an astounding consistency of one post per day. While the blog posts are not technically documentation, they are very highly regarded in the community and can be considered a de-facto Microsoft knowledge resource – only more fun than Microsoft Learn. Over the course of the last 20+ years, Raymond would sometimes write about the registry, sharing interesting behind-the-scenes stories and anecdotes concerning this feature. I have tried to find and compile all of the relevant registry-related posts in the single list below:

Academic papers and presentations

Recovering meaningful artifacts from the registry during digital forensics is also a problem known in academia. To find relevant works, I often begin by typing the titles of a few known papers in Google Scholar, and then delve into a breadth-first search of their bibliographies. Here's what I managed to find pertaining to the registry:

Open-source software

To paraphrase a famous saying, source code is worth a thousand words. Sometimes it is far easier to grasp a concept or design by looking straight at code instead of reading an English explanation. And while the canonical implementation of the registry is the Windows kernel, a number of open-source projects have been developed over the years to operate on registry hives. They are typically either based on regf format analysis performed by the developer itself, or on existing documentation and other open-source tools. The three main reasons for their existence are a) computer forensics, b) emulating Windows behavior on other host platforms, c) directly accessing the SAM hive to change/reset local user credentials. Whatever the reason, such projects may prove useful in better understanding the internal hive format, and help in building proof-of-concept hives if necessary. A list of all the relevant open-source libraries and utilities I have found is shown below:

  • libregf – a library written in C with Python bindings,
  • hivex – a library written in C as part of the libguestfs project, with bindings for OCaml, Perl, Python and Ruby,
  • cmlib – a module implemented in C as part of ReactOS, which closely resembles the Windows implementation,
  • chntpw (The Offline Windows Password Editor) – a tool developed in C between 1997-2014 to manage Windows user passwords offline directly in the SAM hive. The registry-related code is located in ntreg.c (regf parser) and reged.c (a basic registry editor),
  • Samba – the Samba project includes yet another implementation of the Windows registry (under source3/registry and source4/lib/registry),
  • regipy – a Python registry hive parsing library and accompanying tools,
  • yarp – literally yet another registry parser (in Python),
  • Registry – a hive parser written in C#,
  • nt-hive – a hive parser written in Rust (with read-only capabilities),
  • Notatin – another hive parser written in Rust, including Python bindings and helper binaries.

Lastly, at the time of this writing, simply searching for some internal kernel function names on GitHub might reveal how certain functionality was implemented in Windows itself 20+ years ago.

SDK Headers

Header files distributed with Software Development Kits are an interesting case, because on one hand they are an official resource with information that Microsoft intends the developers to use, but on the other – they are a bit more concealed, as online documentation isn't always kept up to date with regards to their contents. We can thus explore their local copies on disk and sometimes find artifacts (function declarations, structure definitions, comments) that are not publicly documented online. Some of the headers most relevant to the registry are:

  • winreg.h (user-mode) – the primary registry header on the list, containing the prototypes of functions and structures from the official Registry API.
  • wdm.h (kernel-mode) – specifies a number of interesting constants/flags and types used by the system call interface of the registry, for example hive load flags (third argument of NtLoadKey2, such as REG_LOAD_HIVE_OPEN_HANDLE etc.) or key/value query structures (KEY_TRUST_INFORMATION, KEY_VALUE_LAYER_INFORMATION, etc.).
  • ntddk.h (kernel-mode) – contains some types not found elsewhere, e.g. KEY_LAYER_INFORMATION.
  • winnt.h (user-mode) – mostly equivalent to wdm.h.
  • winternl.h (user-mode) – contains the declarations of some registry-related system calls (NtRenameKey, NtSetInformationKey).

Security research

Learning about prior security research can be especially useful when starting a new project yourself. Not only does it often reveal deep technical details about the target, but it also comes from like-minded professionals who look at the code through a security lens, and may inspire ideas of further weaknesses or areas that require more attention. When it comes to the registry, I think that relatively little work has been done in the public space compared to its high complexity and the pivotal role it plays in the Windows operating system. Nevertheless, there were some materials that I found extremely insightful, especially those by my colleague James Forshaw from Project Zero. The full list of security-relevant resources I have managed to gather on this topic is shown below (including some references to my own publications from the past):

  • Case study of recent Windows vulnerabilities (2010), Gynvael Coldwind, Mateusz Jurczyk – a presentation on several security bugs Gynvael and I found during our brief registry research in 2009/2010.
  • Microsoft Kernel Integer Overflow Vulnerability (2016), Honggang Ren – a write-up on CVE-2016-0070, a Windows kernel vulnerability in the loading of malformed hive files.
  • Project Zero bug tracker (2016), James Forshaw, Mateusz Jurczyk – four bug reports submitted to Microsoft as a result of naive registry hive fuzzing.
  • Project Zero bug tracker (2014-2020), James Forshaw – 17 vulnerabilities related to the registry discovered by James, many of them are logic issues at the intersection of registry and other system mechanisms (security impersonation, file system).

Books

For a 20+ year old codebase such as the registry, it is expected that some resources covering it in the early days were published on paper rather than on the Internet. For this reason, part of my standard routine is to search Google Books for various technical terms and keywords related to the specific technology and see what pop ups. For the registry, these could be e.g. "regedit", "regf", "hbin", "LOG1", "RegCreateKey", "NtCreateKey", "HvAllocateCell", "\Registry\Machine", "key control block" and so on. In some cases this yields books with unique, strictly technical information, while in others the most insightful part is the historical perspective and being able to see how the given technology was perceived soon after it first came out. And sometimes the value of the book is a complete surprise until it arrives in the mailbox, as it is neither offered for sale as an ebook nor has preview available in Google Books, and so a hard copy is required.

The books that I found which are either fully or partially dedicated to the Windows registry are (latest to oldest):

Patents

Another useful source of information that may be otherwise difficult to find are patents, indexed by Google Patents. A particularly valuable result that I found this way is 🔑 Containerized Configuration (US20170279678A1), Microsoft's patent from 2016 that thoroughly explains the core concepts behind differencing hives and layered keys in registry. These mechanisms are part of a new feature introduced in Windows 10 Anniversary Update to better support containerization, but any official documentation of how it works is nowhere to be found. The patent is thus a great aid in understanding the intricate aspects of this new registry functionality, adding the necessary context and helping to make sense of otherwise highly cryptic kernel functions like CmpPrepareDiscardAndReplaceKcbAndUnbackedHigherLayers.

Manual analysis

So far, all of the resources we've discussed were accessible through a web browser, a text editor, or in physical form. But there is another type of information source that is equally, if not more, important, and that requires more specialized tooling to make sense of it. What I mean by that is the knowledge we can extract from the executable images in Windows responsible for handling the registry, both in terms of the "standard" reverse-engineering and also fully taking advantage of any helpful artifacts in or around them. I'll write more about the hands-on reversing process in upcoming posts, and now we will turn our attention to those artifacts that present us with clear-cut information without the need for deduction.

On a side note, by far the most essential file to be looking at is ntoskrnl.exe, the core NT kernel image. It contains the entirety of the kernel-space registry implementation and is of interest both from the security and functional perspective. I have personally spent 99% of my manual analysis time looking at that particular binary, but it's worth noting that there are a few other executables and libraries related to the registry as well:

  • winload.exe – the Windows Boot Loader, which executes before the Windows kernel. One of its responsibilities is to load the SYSTEM hive into memory and read some configuration from it, so it includes a partial copy of the registry code from ntoskrnl.exe.
  • offreg.dll – the Offline Registry Library, which also shares some registry code with the kernel (but executes in user-mode).
  • kernelbase.dll – one of the primary WinAPI libraries, implementing a majority of the user-space Registry API.
  • ntdll.dll – another core user-mode library which provides a bridge between the Registry API and the kernel registry implementation.
  • regsvc.dll – a DLL implementing the Remote Registry Service.

Let's investigate what types of information about the registry are readily available to us by running a disassembler/decompiler. I personally use IDA Pro + Hex-Rays and so the examples below are based on them.

🔑 Public symbols (PDB)

Microsoft makes public symbols available for a majority of executable images found in C:\Windows, for the benefit of developers and security researchers. By "public" symbols I mean PDB files that mainly contain the names of functions found in the binaries, which help in symbolizing system stack traces during debugging or in the event of a crash. In the past, the symbols used to be bundled with the system installation media or on a separate Resource Kit disc, and later they were available for download in the form of pre-packaged archives from the Microsoft website. Both of these channels have been deprecated, and currently the only supported way to obtain the symbols is on a per-file basis from the Microsoft Symbol Server. The PDB files can be downloaded directly with the official SymChk tool, or indirectly through software that supports the symbol server (e.g. IDA Pro, WinDbg).

As for ntoskrnl.exe specifically, its accompanying symbols are one of the most invaluable sources of information. As mentioned in an earlier post, the Windows kernel follows a consistent naming convention, so we can immediately see which internal routines are related to the registry, and where the entry points (registry-related system call handlers) that we might start our analysis from are. It shows us the extent of the code we are dealing with (1000+ registry functions) and makes it possible to perform analysis such as the one shown in blog post #2 (counting lines of code per system version) out-of-the-box, without doing any reverse engineering work. And perhaps most importantly, the function names make it substantially easier to reason about the code while doing the actual reversing, especially for functions with very descriptive names, like CmpCheckAndFixSecurityCellsRefcount or CmpPromoteSingleKeyFromParentKcbAndChildKeyNode.

Screenshot showing names and addresses

The other type of information we can find in the kernel debug symbols are types: enums, structures and unions. However, there are two caveats. First, only some types are included in the PDBs, and it's not clear what criteria Microsoft uses to decide whether to publish them or not. My rough estimate is that ~50% of the registry types can be found there, mostly the fundamental ones. Secondly, even though the prototypes of some types are in the symbols, neither the function arguments nor local variables are annotated with their types, so it is still necessary to determine the corresponding types and manually annotate the variables for the decompiled output to make any sense. Nevertheless, having access to this information is still a huge help both in understanding code on a local level and also grasping the bigger picture.

The structures that can be found in the public symbols are:

  • Hive descriptors (HHIVE, CMHIVE) and related structures
  • Hive bin and cell structures (HBIN, CM_KEY_NODE, CM_KEY_VALUE, CM_KEY_SECURITY, ...)
  • Key object related structures (CM_KEY_BODY, CM_KEY_CONTROL_BLOCK, ...)
  • Some transaction related structures (CM_TRANS, CM_KCB_UOW, ...)
  • Some layered-key related structures (CM_KCB_LAYER_INFO, ...)

Meanwhile, the ones that are missing and need to be manually reconstructed are:

  • The parse context and path information structures (as used by CmpParseKey)
  • Some transaction related structures (on-disk transaction log records, lightweight transaction object descriptors, ...)
  • Virtualization-related structures
  • Most layered-key related structures

Most of the relevant type names start with "CM", so it's easy to find them in the Local Types window in IDA:

Screenshot from IDA showing Local Types that start with CM


I would like to take this opportunity to thank Microsoft for making the symbols available for download, and encourage other vendors to do the same for their products. 🙂

Debug/Checked builds of Windows

Microsoft used to publish debug/checked builds of Windows (in addition to "free" builds) from Windows NT to early Windows 10. The difference between them was that the debug/checked builds had some compiler optimizations disabled, and they enabled extra debugging checks to identify internal system state inconsistencies as early as possible. The developers of kernel-mode drivers were encouraged to test them on debug/checked Windows builds before considering them as stable and shipping them to customers. Unfortunately, these special builds have been discontinued and don't exist anymore for the latest Windows 10 and 11.

These old builds can be quite valuable in the context of reverse engineering, because the extra checks may reveal some invariants and assumptions that the code makes, but which are not obvious when looking at retail builds. What is more, the checks are often verbose and include calls to functions like RtlAssert, DbgPrint, DbgPrintEx etc., passing a textual representation of the failed assertion, the source code file name and/or the line number. These may disclose the names of variables, structure members, enums, constants and other types of information. Let's see some examples:

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tImplausible size %lx\n", v13);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tKey is bigger than containing cell.\n");

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid name starting with NULL on key %08lx\n", a3);

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid (ODD) name length on key %08lx\n", a3);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tNo key signature\n");

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "\tData:%08lx - unallocated Data\n", v20);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "Class:%08lx - Implausible size \n", v20);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "SecurityCell is HCELL_NIL for (%p,%08lx) !!!\n", a1, v67);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "SecurityCell %08lx bad security for (%p,%08lx) !!!\n", v86, a1, v73);

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "Root cell cannot be a symlink or predefined handle\n");

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "invalid flags on root key %lx\n", v31);

DbgPrintEx(DPFLTR_CONFIG_ID, 24u, "\tWrong parent value.\n");


The CmpCheckKey function is responsible for verifying the structural correctness of every key in a newly loaded hive, and for every problem it encounters, it prints a more or less verbose message. This can help us better understand what each of these checks is intended to accomplish.

DbgPrintEx(DPFLTR_CONFIG_ID, 0, "CmKCBToVirtualPath ==> Could not get name even from parent KCB = %p!!!!\n", a1);


This message can be interpreted as some kind of a fallback mechanism failing when converting a registry path. It could indicate an interesting/brittle code construct, and indeed, the surrounding code did turn out to be affected by a 16-bit integer overflow and a resulting pool memory corruption (reported in Project Zero issue #2341). In consequence, the entire block of code (including the vulnerability) was removed, as it was functionally redundant and didn't serve any practical purpose.

RtlAssert("(*VirtContext) & CMP_VIRT_IDENTITY_RESTRICTED", "minkernel\\ntos\\config\\cmveng.c", 3554u, 0i64);


This single line of code in CmpIsSystemEntity reveals a few pieces of information: the name of the function argument (VirtContext), an internal name of a flag that is not documented in any other resources (CMP_VIRT_IDENTITY_RESTRICTED), and the source file name and line number of the expression (minkernel\ntos\config\cmveng.c:3554). Such information can be ported into our main disassembler database (such as an .idb) and later help us better understand other areas of code that use the same object/flags.

DbgPrintEx(DPFLTR_CONFIG_ID, 22u, "Error[1] %lx while processing CmLogRecActionDeleteKey\n", v12);


This and similar calls in CmpDoReDoRecord inform us of the internal names of the transaction record types (CmLogRecActionCreateKey, CmLogRecActionDeleteKey etc.), which again are not publicly mentioned anywhere else.

Debugging and experimentation

Poking and prodding the registry of a running Windows system is the last way of learning about it that comes to my mind. In some sense it is a required step, because we can only get so far by reading static documentation and code. At some point, we will be forced to investigate the real memory mappings corresponding to the hives, explore the contents of in-memory registry objects, or verify that a specific function behaves the way we think it does. Thankfully, there are some tools that make it possible to peek into the internal registry state beyond what the standard utilities like Regedit allow. They are briefly described in the sections below.

Extended Regedit alternatives

The built-in Regedit.exe utility offers quite basic functionality, and while it is adequate for most tinkering and system administration purposes, some third party developers have created custom registry editors with an extended set of options. I haven't personally used them so I cannot attest to their quality, but they may offer some benefits to other researchers. One example is Total Registry, whose main advantage is being able to browse the internal registry tree structure (rooted in \Registry) in addition to the standard high-level view with the five HKEY_* root keys.

Process Monitor

Process Monitor is a part of the Sysinternals suite of utilities, and is a widely known program for monitoring all file system, registry and process/thread activity in Windows in real time. Of course in this case, we are specifically interested in registry monitoring. For every operation taking place, we can see a corresponding line in the output window, which specifies the time, type of operation, originating process, registry key path, result of the operation and other details (all of this is highly configurable):

Process monitor screenshot as described above

ProcMon is a great tool for exploring what the registry is like as an interface, and how applications in the system use it. It is the most helpful when dealing with logical bugs, and attacking more privileged processes through the registry rather than attacking the registry implementation itself. For example, I used it to find a suitable exploitation primitive for Project Zero issue #2492, which allowed me to demonstrate that predefined keys were inherently insecure, leading to their deprecation. One of its advantages is that it works out-of-the-box without any special system configuration (other than the admin rights required to load a driver), and it's certainly a must have in a researcher's toolbox.

🔑 WinDbg and the !reg extension

WinDbg attached as a kernel debugger to a test (virtual) machine is the ultimate tool to explore the inner workings of the Windows kernel. I have used it extensively at every step of my research, to analyze how the registry works, reproduce any bugs that I found, and develop reliable proof-of-concept exploits. While its standard debugger functionality is powerful enough for most tasks, it also comes with a dedicated !reg extension that automates the process of traversing registry-specific structures and presents them in an accessible way. The full list of its options is shown below:

reg <command>      <params>       - Registry extensions

    querykey|q     <FullKeyPath>  - Dump subkeys and values

    keyinfo        <HiveAddr> <KnodeAddr> - Dump subkeys and values, given knode

    kcb        <Address>      - Dump registry key-control-blocks

    knode      <Address>      - Dump registry key-node struct

    kbody      <Address>      - Dump registry key-body struct

    kvalue     <Address>      - Dump registry key-value struct

    valuelist  <HiveAddr> <KnodeAddr> - Dumps list of values for a particular knode

    subkeylist <HiveAddr> <KnodeAddr> - Dumps list of subkeys for a particular knode

    baseblock  <HiveAddr>     - Dump the baseblock for the specified hive

    seccache   <HiveAddr>     - Dump the security cache for the specified hive

    hashindex  <HiveAddr> <conv_key>  - Find the hash entry given a Kcb ConvKey

    openkeys   <HiveAddr|0>   - Dump the keys opened inside the specified hive

    openhandles <HiveAddr|0>  - Dump the handles opened inside the specified hive

    findkcb    <FullKeyPath>  - Find the kcb for the corresponding path

    hivelist                  - Displays the list of the hives in the system

    viewlist   <HiveAddr>     - Dump the pinned/mapped view list for the specified hive

    freebins   <HiveAddr>     - Dump the free bins for the specified hive

    freecells  <BinAddr>      - Dump the free cells in the specified bin

    dirtyvector<HiveAddr>     - Dump the dirty vector for the specified hive

    cellindex  <HiveAddr> <cellindex> - Finds the VA for a specified cell index

    freehints  <HiveAddr> <Storage> <Display> - Dumps freehint info

    translist  <RmAddr|0>     - Displays the list of active transactions in this RM

    uowlist    <TransAddr>    - Displays the list of UoW attached to this transaction

    locktable  <KcbAddr|ThreadAddr> - Displays relevant LOCK table content

    convkey    <KeyPath>      - Displays hash keys for a key path input

    postblocklist             - Displays the list of threads which have 1 or more postblocks posted

    notifylist                - Displays the list of notify blocks in the system

    ixlock     <LockAddr>     - Dumps ownership of an intent lock

    finalize   <conv_key>     - Finalizes the specified path or component hash

    dumppool   [s|r]          - Dump registry allocated paged pool

       s - Save list of registry pages to temporary file

       r - Restore list of registry pages from temp. file


As we can see, the extension offers a wide selection of commands related to various components of the registry: hives, keys, values, security descriptors, transactions, notifications and so on. I have found many of them to be immensely useful, either on a regular basis (e.g. querykey, kcb, hivelist), or for more specialized tasks when experimenting with a particular feature (e.g. translist, uowlist for transactions).

The best way to discover its potential is to see it in action on a specific example. I used a Windows 11 guest system for this purpose. Let's query an existing HKEY_LOCAL_MACHINE\Software\DefaultUserEnvironment key to find out more about it:

kd> !reg querykey \Registry\Machine\Software\DefaultUserEnvironment

Found KCB = ffff888788731ad0 :: \REGISTRY\MACHINE\SOFTWARE\DEFAULTUSERENVIRONMENT

Hive         ffff88877af5c000

KeyNode      000001e6ed0334b4

[ValueType]         [ValueName]                   [ValueData]

REG_EXPAND_SZ       Path                          %USERPROFILE%\AppData\Local\Microsoft\WindowsApps;

REG_EXPAND_SZ       TEMP                          %USERPROFILE%\AppData\Local\Temp

REG_EXPAND_SZ       TMP                           %USERPROFILE%\AppData\Local\Temp


Here, we have referenced the key by its internal NT object manager registry path starting with \Registry. The relation between the high-level paths known from Regedit / the Registry API and the internal paths used by the kernel will be detailed in a future post – for now, we just need to know that these paths are equivalent. We can learn a few things from the command output: the key is cached in memory and the KCB (Key Control Block, represented by the CM_KEY_CONTROL_BLOCK structure) is located at address 0xffff888788731ad0. The address of the SOFTWARE hive descriptor is 0xffff88877af5c000, and that's where the HHIVE / CMHIVE structures are stored. HHIVE is the first member of CMHIVE at offset 0, hence why their addresses line up, similar to how the KPROCESS / EPROCESS structures work. Furthermore, the key node (CM_KEY_NODE), the definitive representation of a key within the hive file, is mapped at address 0x1e6ed0334b4. You may notice that this is a user-mode address, and that's because in modern versions of Windows, hive files are generally operated on via section-based mappings within the user address space of a thin "Registry" process (you can find it in Task Manager). Lastly, we can see that the key has three values and we are provided with their types, names and data.

Next, we can use !reg kcb to learn more about the key based on its cached KCB data:

kd> !reg kcb ffff888788731ad0

Key              : \REGISTRY\MACHINE\SOFTWARE\DEFAULTUSERENVIRONMENT

RefCount         : 0x0000000000000001

Flags            : CompressedName,

ExtFlags         :

Parent           : 0xffff88877ab517e0

KeyHive          : 0xffff88877af5c000

KeyCell          : 0xe824b0 [cell index]

TotalLevels      : 4

LayerHeight      : 0

MaxNameLen       : 0x0

MaxValueNameLen  : 0x8

MaxValueDataLen  : 0x66

LastWriteTime    : 0x 1d861d2:0xdb7718d1

KeyBodyListHead  : 0xffff888788731b48 0xffff888788731b48

SubKeyCount      : 0

Owner            : 0x0000000000000000

KCBLock          : 0xffff888788731bc8

KeyLock          : 0xffff888788731bd8


This is a summary of some of the KCB components that the author of the extension deemed the most important. We can see the value of the reference count, flags shown in textual form, the KCB address of the key's parent, the address of the hive, etc. Let's resolve the virtual address of the key node by using !reg cellindex:

kd> !reg cellindex 0xffff88877af5c000 0xe824b0

Map = ffff88877ec20000 Type = 0 Table = 7 Block = 82 Offset = 4b0

MapTable     = ffff88877ec37000 

MapEntry     = ffff88877ec37c30 

BinAddress = 000001e6ed033001, BlockOffset = 0000000000000000

BlockAddress = 000001e6ed033000 

pcell:  000001e6ed0334b4


The result is 0x1e6ed0334b4, the same value that !reg querykey returned to us earlier. In order to inspect the contents of the key node, we can use !reg knode:

kd> !reg knode 1e6ed0334b4

Signature: CM_KEY_NODE_SIGNATURE (kn)

Name                 : DefaultUserEnvironment

ParentCell           : 0x20

Security             : 0x98f300 [cell index]

Class                : 0xffffffff [cell index]

Flags                : 0x20

MaxNameLen           : 0x0

MaxClassLen          : 0x0

MaxValueNameLen      : 0x8

MaxValueDataLen      : 0x66

LastWriteTime        : 0x 1d861d2:0xdb7718d1

SubKeyCount[Stable  ]: 0x0

SubKeyLists[Stable  ]: 0xffffffff

SubKeyCount[Volatile]: 0x0

SubKeyLists[Volatile]: 0xffffffff

ValueList.Count      : 0x3

ValueList.List       : 0xe825a8


A very similar effect can be achieved by finding the Registry process, switching to its context, and inspecting the memory directly by overlaying it onto the CM_KEY_NODE structure layout:

kd> !process 0 0

**** NT ACTIVE PROCESS DUMP ****

PROCESS ffffe30198ef5040

    SessionId: none  Cid: 0004    Peb: 00000000  ParentCid: 0000

    DirBase: 001ae002  ObjectTable: ffff88877a285f00  HandleCount: 3302.

    Image: System

PROCESS ffffe30198fe1080

    SessionId: none  Cid: 0040    Peb: 00000000  ParentCid: 0004

    DirBase: 1002c002  ObjectTable: ffff88877a277b40  HandleCount:   0.

    Image: Registry

[...]

kd> .process ffffe30198fe1080

Implicit process is now ffffe301`98fe1080

WARNING: .cache forcedecodeuser is not enabled

kd> dt _CM_KEY_NODE 1e6ed0334b4

nt!_CM_KEY_NODE

   +0x000 Signature        : 0x6b6e

   +0x002 Flags            : 0x20

   +0x004 LastWriteTime    : _LARGE_INTEGER 0x01d861d2`db7718d1

   +0x00c AccessBits       : 0x3 ''

   +0x00d LayerSemantics   : 0y00

   +0x00d Spare1           : 0y00000 (0)

   +0x00d InheritClass     : 0y0

   +0x00e Spare2           : 0

   +0x010 Parent           : 0x20

   +0x014 SubKeyCounts     : [2] 0

   +0x01c SubKeyLists      : [2] 0xffffffff

   +0x024 ValueList        : _CHILD_LIST

   +0x01c ChildHiveReference : _CM_KEY_REFERENCE

   +0x02c Security         : 0x98f300

   +0x030 Class            : 0xffffffff

   +0x034 MaxNameLen       : 0y0000000000000000 (0)

   +0x034 UserFlags        : 0y0000

   +0x034 VirtControlFlags : 0y0000

   +0x034 Debug            : 0y00000000 (0)

   +0x038 MaxClassLen      : 0

   +0x03c MaxValueNameLen  : 8

   +0x040 MaxValueDataLen  : 0x66

   +0x044 WorkVar          : 0

   +0x048 NameLength       : 0x16

   +0x04a ClassLength      : 0

   +0x04c Name             : [1]  "敄"


In the listing above, we can see the full extent of information stored in the hive for each key. The name in the last line is incorrectly displayed as 敄, because formally the type of CM_KEY_NODE.Name is wchar_t[1], but since the name consists of ASCII-only characters, it is compressed down so that each wchar_t element stores two characters of the name (as indicated by the flag 0x20 translated by WinDbg as CompressedName). So 敄 is in fact the two first letter of the name, "De", represented as a UTF-16 code point.

This is only a glimpse of what is possible with WinDbg and the !reg extension. I highly encourage you to experiment with other options if you're curious about the mechanics of the registry and want to explore further.

Conclusion

In this post, I have aimed to share my methodology for gathering information and learning about new vulnerability research targets. I hope that you find some of it useful, either as a generalized approach that applies to other software, or as a comprehensive knowledge base for the registry itself. Also, if you think I've missed any resources, I'll be more than happy to learn about them. See you in the next post!