Embodiments of the present invention relate to system maintenance and diagnostics, and more particularly to techniques for preparing a package of diagnostic data for shipping to a diagnosis site for analysis.
Diagnosing defects in systems, such as Oracle database (DB) products, can be a complex and time-consuming task. In a complex software environment, the diagnostic data required to resolve an issue or problem can come from different sources and may be stored in multiple locations. For example, for a system comprising multiple components, the state of the various components may be held in different log files, diagnostic traces corresponding to the components may be stored in different repositories, and the like.
In a typical diagnostic flow, diagnostic data captured at a system site (e.g., a customer site executing a product instance) is communicated to a diagnosis site (e.g., the site of the product vendor) for failure analysis. At the diagnosis site, the data received from the system site is analyzed to determine, for example, the occurrence of an error in the system, a root cause of the error, recommendations for mitigating effects of the error, repair solutions to fix the error, and the like. The results of the analysis may be communicated from the diagnosis site to the system site.
However, due to the sheer amount of diagnostic data that may be captured for a system and the often disorganized manner in which the data is stored at the system site, it is often difficult to establish what diagnostic data is available at the system site and, further, what pieces of diagnostic data should be submitted to the vendor for analysis. If too little information is provided to the vendor, the submitted data may be insufficient to perform a proper diagnosis of the error. The vendor then often has to contact the customer again and request additional information, some of which might no longer be available. Further analysis is possible only after receiving the additional requested information. Several back-and-forth communications between the customer and the vendor may thus be needed before the error can be diagnosed. On the other hand, sending too much diagnostic information is also problematic. The data that is sent may include thousands of files and many gigabytes of data. Sending such a large volume of data to the diagnosis site is cumbersome, time-consuming, and expensive. Further, if the data received at a diagnosis site is very large, it takes the vendor a long time to analyze the received diagnostic data to identify relevant pieces of data for analyzing a particular problem. Accordingly, under either scenario, the time needed to resolve the issue or problem is increased, leading to customer dissatisfaction.
Further, the diagnostic data that is communicated from the customer site to the vendor site may comprise information that may be considered sensitive or confidential by the customer. For instance, traces collected at a customer site may contain sensitive information such as network addresses or database schema details of the customer, and export dumps may contain data from database tables storing sensitive or confidential information such as customer payroll details. As a result, in the past, customers have been reluctant to allow communication of diagnostic data to vendor sites, fearing disclosure of sensitive and confidential information. For example, banks have typically refused to send diagnostic data to a diagnosis site, fearing that the data may contain information that is sensitive to the bank.
In light of the above, techniques are desired for improving the manner in which diagnostic data is identified and communicated from the system site or customer site to the vendor.