February 2001 /
Features /
Mark Russinovich
Crash Dump Analysis
Automatic Analysis with Kanalyze
After you've installed the symbols and tools, you're ready to perform crash dump analysis. Download and run BSOD, then generate a crash dump file that we can look at together. If you've performed a crash dump analysis in the past with the debugging tools from the NT 4.0 Setup CD-ROM or the Win2K Customer Support Diagnostics CD-ROM, you've probably used DumpChk to validate a dump file and DumpExam to generate a dump report. In the newer collections of debugging tools, DumpChk and DumpExam are obsolete. I start by describing their replacement, Kanalyze, a tool for automated crash dump analysis.
Kanalyze is a memory-dump analysis engine into which you load plug-in DLLs. Of the several types of plug-ins available, two matter most. One type locates and identifies items of interest in a dump; the other type analyzes the items that plug-ins of the first type found (some plug-ins fall into both categories). For example, some plug-ins locate and identify the memory locations of loaded drivers, allocated memory blocks, and I/O request packets. Corresponding analysis plug-ins ensure that the identified memory blocks account for all allocated memory, that loaded driver images don't differ from their on-disk versions, and that the I/O request packets are consistent. During the analysis phase, plug-ins note anomalies and conclusions (i.e., likely causes of a crash). Even during the item-location phase, however, Kanalyze notes any situation in which one item's memory area overlaps another item's memory area without being fully contained within it. For example, a driver's code should reside entirely within an allocated memory block, so Kanalyze treats as suspicious any driver code that straddles two blocks or resides partially in an unallocated region.
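The overlap rule from the item-location phase can be sketched in a few lines of Python. This is only an illustration of the idea, not Kanalyze's code: the `Region` type, the half-open address ranges, and the sample addresses are all assumptions made for the example.

```python
from typing import NamedTuple


class Region(NamedTuple):
    """Hypothetical item record: a half-open address range [start, end)."""
    name: str
    start: int
    end: int


def contains(outer: Region, inner: Region) -> bool:
    """True when inner lies entirely within outer."""
    return outer.start <= inner.start and inner.end <= outer.end


def is_suspicious_overlap(a: Region, b: Region) -> bool:
    """True when a and b share addresses but neither fully contains the other."""
    overlaps = a.start < b.end and b.start < a.end
    return overlaps and not contains(a, b) and not contains(b, a)


# Made-up addresses, chosen only to exercise each case.
block1 = Region("memory block 1", 0x1000, 0x2000)
block2 = Region("memory block 2", 0x2000, 0x3000)
driver_ok = Region("driver code", 0x1200, 0x1800)    # inside block 1
driver_bad = Region("driver code", 0x1800, 0x2400)   # straddles blocks 1 and 2

print(is_suspicious_overlap(block1, driver_ok))
print(is_suspicious_overlap(block1, driver_bad))
```

Full containment is treated as normal and partial overlap as suspicious, which matches the driver-code example above: code inside one block passes, code that straddles two blocks is flagged.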
The Kanalyze tool comes with documentation (accessible through the Kanalyze Help file) that lets third-party developers implement plug-in DLLs, but Kanalyze also bundles several Microsoft plug-in DLLs. For example, memory.dll identifies memory blocks, module.dll identifies driver code and data areas, and kobjects.dll examines a dump for kernel objects.
Another powerful Kanalyze feature is its ability to generate a signature ID file that provides important information about a crash and to store the file's data in a database. After completing an analysis, Kanalyze searches the database for other signature ID information that's similar to the information from the newly completed analysis. Thus, Kanalyze can detect crashes that result from the same cause, which lets you identify trends or recognize that a fix you've already implemented on another system might also apply to the system on which the latest crash took place. Kanalyze's database features require Microsoft SQL Server 7.0 or higher.
Because you install the OEM Support Tools by unzipping them, no menu shortcuts are available for running Kanalyze. Open a command prompt window, change directories to the directory in which you installed the OEM Support Tools, and type
Kanalyze
The Kanalyze wizard appears and guides you through the automated analysis process. The wizard requires you to specify the location of the memory dump you want to analyze and the location of the symbols. Unless you have SQL Server installed and want to use Kanalyze's crash database support, select the second radio button on the wizard's What would you like to do? page, which Figure 3, page 72, shows.
After you direct Kanalyze to the crash dump file, the wizard displays the crash dump's stop code and stop parameters (which Kanalyze calls BugCheck codes and parameters). A driver or kernel component that decides to crash the system uses the stop code to classify the reason that led to the decision. A crash you generate with the BSOD tool has a stop code of 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) on Win2K and 0xA (IRQL_NOT_LESS_OR_EQUAL) on NT 4.0. Microsoft constantly updates its Knowledge Base to describe common causes of various stop codes and provide pointers to patches, driver updates, and workarounds. To find information about a particular stop code, type Stop and the stop code number in a search of the Knowledge Base. An example of the type of article a search might return is "Bugcheck 0x000000D1 Caused by Dlc.sys" (http://support.microsoft.com/support/kb/articles/q266/2/21.asp), which explains how the Win2K Data Link Control (DLC) driver can cause a stop code of 0xD1 and directs you to a hotfix for the driver.
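Knowledge Base articles write stop codes in more than one form (0xD1 in running text, 0x000000D1 in the article title above), so it can help to search on each spelling. Here is a minimal Python sketch that produces them. The `kb_search_terms` helper is hypothetical, and the table holds only the two codes this article mentions.

```python
from typing import List

# Only the two stop codes discussed in this article; not a complete table.
STOP_CODES = {
    0xD1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
    0x0A: "IRQL_NOT_LESS_OR_EQUAL",
}


def kb_search_terms(stop_code: int) -> List[str]:
    """Return search strings for a stop code: short form, padded form, name."""
    terms = [
        f"Stop 0x{stop_code:X}",   # "Stop" plus the stop code number
        f"0x{stop_code:08X}",      # eight-digit form used in article titles
    ]
    name = STOP_CODES.get(stop_code)
    if name is not None:
        terms.append(name)
    return terms


for term in kb_search_terms(0xD1):
    print(term)
```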
If you don't find a Knowledge Base article that matches your environment or crash scenario, continue to the next Kanalyze screen, which asks for the location of the symbols. Kanalyze loads the symbols for all the kernel modules it finds in the dump. This page informs you of missing symbols (third-party drivers don't usually include symbols) or symbols that don't match the loaded modules. Symbol mismatch warnings mean that the installed symbols might be outdated because of a service pack or hotfix installation.
The wizard's subsequent page tells you which plug-ins Kanalyze will load to perform the crash dump analysis. On the next screen, Kanalyze calls the DLLs in turn to locate items and analyze the resulting information. You can watch Kanalyze progress in phases. The wizard's [KA_START_LOCATE_ITEMS] phase reports the plug-ins that are looking for items in the crash dump, then the [KA_PERFORM_ANALYSIS] phase runs all the plug-ins that perform analysis. When the analysis phase is complete, Kanalyze waits for you to move to the final page of the wizard, which lets you view the analysis results.
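The two-phase flow is easier to picture with a toy model. The Python sketch below is not Kanalyze's plug-in interface (real plug-ins are DLLs, and their API is described in the Kanalyze Help file). The class names, the dictionary standing in for a dump, and the `faulting_ip` field are all invented for illustration. Only the two phase labels come from the wizard.

```python
from typing import Dict, List, Tuple

Module = Tuple[str, int, int]  # (name, start address, end address)


class ModuleLocator:
    """Toy item-location plug-in: reports the driver images in the dump."""
    name = "module"

    def locate(self, dump: dict) -> List[Module]:
        return list(dump["modules"])

    def analyze(self, dump: dict, items: Dict[str, list]) -> List[str]:
        return []  # this plug-in only locates items


class FaultOwnerAnalyzer:
    """Toy analysis plug-in: names the module containing the faulting code."""
    name = "fault-owner"

    def locate(self, dump: dict) -> list:
        return []  # this plug-in only analyzes

    def analyze(self, dump: dict, items: Dict[str, list]) -> List[str]:
        ip = dump["faulting_ip"]
        return [
            f"faulting code at {ip:#010x} lies inside {name}"
            for name, start, end in items.get("module", [])
            if start <= ip < end
        ]


def run_engine(plugins: list, dump: dict) -> Tuple[List[str], List[str]]:
    """Run every plug-in's locate step, then every plug-in's analyze step."""
    log: List[str] = ["[KA_START_LOCATE_ITEMS]"]
    items: Dict[str, list] = {}
    for plugin in plugins:
        items[plugin.name] = plugin.locate(dump)
        log.append(f"{plugin.name}: located {len(items[plugin.name])} item(s)")

    log.append("[KA_PERFORM_ANALYSIS]")
    conclusions: List[str] = []
    for plugin in plugins:
        conclusions.extend(plugin.analyze(dump, items))
    return log, conclusions


# Made-up addresses; a real dump is a binary file, not a dictionary.
toy_dump = {
    "modules": [("ntoskrnl.exe", 0x80400000, 0x80600000),
                ("crashdd.sys", 0xF0000000, 0xF0002000)],
    "faulting_ip": 0xF0000500,
}
log, conclusions = run_engine([ModuleLocator(), FaultOwnerAnalyzer()], toy_dump)
print("\n".join(log + conclusions))
```

Running all location steps before any analysis step means an analysis plug-in can rely on items that a different plug-in found, which is the division of labor the wizard's phases reflect.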
When the View button in the Analysis conclusions area of the Results page isn't shaded, one or more plug-ins think that they have identified the cause of the crash. If you run Kanalyze on a dump that you used BSOD to generate, View is enabled, as Figure 4 shows. Clicking View displays a Namespace Browser window of identified problems, which Figure 5 shows. The window tells you that the STOPCODE plug-in thinks that the crashdd.sys driver produced the crash. The Namespace Browser window even shows you a stack trace (not visible in Figure 5) that tells you that the IopLoadUnloadDriver function in Ntoskrnl (the kernel) invoked a function in crashdd.sys and that crashdd.sys then invoked KiTrap0E in Ntoskrnl. Whenever you see a function that contains the word Trap or Exception in a crash dump trace, you can bet that code in the kernel accessed an invalid pointer, crashing the system. In a BSOD-generated crash, crashdd.sys' access of invalid memory causes the trap function to be executed, so Kanalyze is right on the mark.
I don't recommend viewing the Anomalous conditions area of the wizard's Results page. Plug-ins are conservative and flag as potentially unusual many situations that are in fact normal. The Results page also provides a View button for the Information from database area that lets you compare crash information with other information stored in a database. If you don't have SQL Server installed, you can't enable this functionality.
Advanced, the final button on the Results page, provides a view of detected items that you might use to do some manual analysis. Items are organized by type, and subitems reside underneath related items in the hierarchy that plug-ins define for their objects. For instance, the Module folder, which the Module plug-in generates, has subfolders for the memory regions that the driver and kernel code, data, and image header (which stores information about the composition of an image) occupy. The Process subfolder of the ExecutiveObject folder might be useful. Figure 6, page 74, shows this subfolder, which lists all the processes running on the system at the time of the crash and provides detailed information about each process' memory usage.
Manual Analysis with Kd
If Kanalyze fails to pinpoint the reason for a crash or at least provide useful hints, you can poke around the crash dump manually on the chance that you might spot something that Kanalyze missed. Two OEM Support Tools are available for manual analysis: WinDbg (often called Win Debug) and Kd (which earlier releases of the new debugging tools called i386kd). These tools have identical command sets and data-dumping capability, but WinDbg is a Windows application, whereas Kd is a command-line program. I recommend using WinDbg, which lets you easily copy values and use subwindows to simultaneously view more information.
To start WinDbg for crash dump analysis, type
windbg -z <crash dump file> -y <symbol path>
at a command prompt. (If you've defined the _NT_SYMBOL_PATH variable, you can omit the -y option.) WinDbg will run and present a view like the one that Figure 7, page 74, shows. You can now enter a number of debugging commands that will show you the state of various aspects of the system at the time of the crash. The debugging environment consists of three types of commands: built-in debugging commands, which have no prefix; dot commands, which have a dot (.) as a prefix; and bang commands, which have an exclamation point (!) prefix.
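The prefix rule is simple enough to express as a short Python function. The `classify_command` helper is hypothetical and exists only to illustrate the rule. The sample commands `.reload` and `!process` are standard debugger commands that this article doesn't otherwise discuss.

```python
def classify_command(command: str) -> str:
    """Classify a debugger command by its prefix: built-in, dot, or bang."""
    text = command.strip()
    if not text:
        raise ValueError("empty command")
    if text.startswith("."):
        return "dot"
    if text.startswith("!"):
        return "bang"
    return "built-in"


for sample in ("dd esp", "?", ".reload", "!process"):
    print(f"{sample:10} -> {classify_command(sample)}")
```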
The most useful built-in debugging command is dd, which dumps a range of memory as doublewords. The dd esp command dumps what the stack-pointer register (aka the esp register) points at. However, unless you're familiar with x86 assembly language, esp dumps won't be useful. To access the online Help for the built-in debugging commands, use the ? command.
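To make the idea of a doubleword dump concrete, the Python sketch below formats a byte buffer as rows of 32-bit little-endian values (x86 byte order), each row preceded by its address. The layout only approximates what the debugger prints, and the `dump_dwords` helper, the base address, and the sample bytes are assumptions made for the example.

```python
import struct
from typing import List


def dump_dwords(memory: bytes, base: int, per_line: int = 4) -> List[str]:
    """Format memory as rows of 32-bit little-endian values, address first."""
    usable = len(memory) - len(memory) % 4          # ignore a trailing partial dword
    dwords = struct.unpack(f"<{usable // 4}I", memory[:usable])
    lines = []
    for i in range(0, len(dwords), per_line):
        row = " ".join(f"{value:08x}" for value in dwords[i:i + per_line])
        lines.append(f"{base + 4 * i:08x}  {row}")
    return lines


# Twenty made-up bytes standing in for the memory that esp points at.
sample = bytes(range(16)) + b"\xef\xbe\xad\xde"
for line in dump_dwords(sample, base=0x0012FF40):
    print(line)
```

Because x86 stores the least significant byte first, the bytes ef be ad de appear in the dump as the single value deadbeef.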