1. Technical Field
The present invention relates to a method and system for monitoring and controlling applications executing in a computing node of a computing system, and more particularly to a technique for monitoring and controlling a plurality of applications in a computing node of a distributed computing system, where one or more applications of the plurality of applications are untrusted applications.
2. Related Art
Conventional process monitoring tools lack adequate built-in sandboxing features to allow proper execution of unreliable code, i.e., code that is untested or not exhaustively tested, in a distributed or clustered computing system. Insufficient testing of code is commonplace in a text analytics platform such as the WebFountain cluster because the complex computing environment is difficult to simulate. The WebFountain cluster is a large text analytics platform that includes applications for crawling the Internet, storing and accessing the data resulting from the crawling, and indexing the data. Further, inadequately tested code in such a complex computing environment leads to Byzantine faults, against which known monitoring tools do not sufficiently protect. A Byzantine fault is an arbitrary failure mode characterized by erroneous, inconsistent and potentially malicious behavior of system components. Still further, when a failure of an unreliable child application causes the child's parent application to also fail, known monitoring tools do not ensure, in a programmatic manner, that the failure does not adversely affect critical components in the rest of the computing system (e.g., by causing or facilitating a failure of other child applications of the failed parent application). Thus, there exists a need to overcome at least one of the preceding deficiencies and limitations of the related art.