The present invention relates to distributed computer systems, and more specifically to failure data for distributed computer systems.
Enterprise-class computer systems typically include a large number of controllers (e.g., hundreds), each of which is responsible for managing a set of hardware (e.g., one or more computer systems) within the system. When a controller and/or its managed hardware experiences a failure, the controller may gather diagnostic data regarding the event and send the data to higher-level systems, where it may be analyzed and/or reported. This data may be used to diagnose the causes of errors and, in some cases, reduce the need to reproduce errors.