A variety of debugging tools and diagnostic methods exist today for analyzing various types of problems with server applications. However, these conventional tools are complex to use, and thus, typically only trained support professionals can use them to extract data for problem resolution. Complex cases, and those involving userdumps (a userdump is a file that contains a copy of the memory used by the process at the time of the failure), need to be funneled through support professionals or development, since only these groups have the training required to use the debugging tools.
More specifically, when a tabular data stream (TDS) connection is available and working in the server (i.e., a “live” server), queries may be performed against virtual tables or system catalog tables. To look at persistent disk pages of the server, the live database is examined, and a dump or log is loaded. Database console commands (DBCC), e.g., PAGE or LOG, are then used to view the disk pages or log records.
When a TDS connection is not available (i.e., the server is hung), in many cases, an attempt is made to have the customer provide a complete userdump of the server process for post-mortem analysis. In some cases, retrieval of the userdump is attempted even if the server is not fully hung, in order to obtain the maximum amount of information possible. In other cases, the customer may be asked to attach a debugger, or to run the server (or another component) under the debugger with an automated script that captures particular information based on some type of sequence. This procedure is more complicated than simply getting a userdump at a known point in time.
Querying the virtual tables or utilizing DBCC commands works only against a live server, and such queries are performed on an ad-hoc basis. To get several snapshots, a script must be configured to “poll” for this information. Problems associated with live process debugging include deciding when to start running the query or diagnostic command, how long to run it, and how often to run it so as to capture the right data at just the right time. Furthermore, post-mortem analysis of a userdump file is only as good as the person using the debugger or the existing debugger extensions written for the server. No method exists today to look at disk pages or log records “offline” (i.e., from a backup). For other processes, components, or services, there may be virtually no diagnostic capabilities aside from using a debugger.
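By way of illustration only, the kind of polling script described above may be sketched as follows. This is a minimal sketch under stated assumptions, not a script taken from any actual server tooling: the function name `poll_snapshots` and its parameters are hypothetical, and `run_query` stands in for whatever call issues the virtual-table query or DBCC command over a live connection. The sketch shows why the start time, duration, and interval must all be chosen in advance, and that a failed query (e.g., because the server has hung) yields no diagnostic data.

```python
import time
from typing import Any, Callable, List, Tuple


def poll_snapshots(
    run_query: Callable[[], Any],        # hypothetical: issues one diagnostic query
    interval_s: float,                   # how often to capture a snapshot
    duration_s: float,                   # how long to keep polling
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Tuple[float, Any]]:
    """Collect (timestamp, result) snapshots at a fixed interval."""
    snapshots: List[Tuple[float, Any]] = []
    deadline = clock() + duration_s
    while True:
        now = clock()
        if now >= deadline:
            break
        try:
            result = run_query()
        except Exception as exc:
            # The query only works against a live server; if the connection
            # fails, record the error in place of the missing data.
            result = exc
        snapshots.append((now, result))
        sleep(interval_s)
    return snapshots
```

An event of interest that begins and ends between two polls is never captured, which is the timing problem noted above.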
Many memory structures or lists within the server are not available via a virtual table. A variety of DBCC commands are available to analyze the various server memory structures. DBCC commands can provide much valuable information, but face the same limitations as the virtual tables. The problem with a pure DBCC or extended logging approach is that invariably something gets overlooked, or a bug introduces behavior that the built-in diagnostics do not cover. Moreover, the commands may not address all members of certain structures and cannot provide the associated output as a result set. No framework exists for using these tools based upon some event that occurs within the server.
Limitations of conventional diagnostics include the following. A complete userdump may be very resource intensive. If the server is not completely hung, customers are hesitant to perform a dump, especially for machines with large amounts of physical memory. Moreover, virtual tables (sysprocesses, syslockinfo, etc.) may not provide all the information needed, because some members of the structures on which these tables are based are not exposed for analysis. Creating new virtual tables or expanding existing ones may cause compatibility issues when changes are needed for future versions.
What is needed is a debugging architecture with reduced complexity such that frontline personnel can implement and operate the architecture for more expedient problem resolution.