In recent years, as the Android system has grown in popularity, attacks on the Android platform have been increasing. In the third quarter of 2016, in China, the number of new malicious software packages intercepted by the majority of ROM security scanning software based on the AVLSDK services of Wuhan Antiy Information Technology Co., Ltd. exceeded 100 thousand per day. Malicious software packages attacking the Android platform account for 92% of the total, and this share is gradually increasing. The Android system therefore faces serious security risks. Meanwhile, smart phone security has become a hotspot of security research worldwide. M. Miettinen and P. Halonen made a detailed analysis of the security threats confronting mobile smart devices and of the main challenges and weaknesses in security detection for such devices. Abhijit Bose et al. proposed a new detection model for detecting exceptions on smart phones. The difference between the study by Abhijit Bose et al. and that by M. Miettinen and P. Halonen is that the new detection model takes the applications running on the smart phone as the target for detecting exceptions. The behavior patterns of applications installed on the smart phone are described by a temporal logic based on causal relationships, and an SVM machine learning method is used to detect the exceptions. It can be seen that existing security studies on smart phones focus on general exception detection based on behavior patterns. For a specific smart phone platform, such as the Android platform, the behavior patterns of malicious software (also called malicious code rules) on that platform have not been studied and summarized.
It can be understood that, given the massive number of newly added samples, it is important to solve the sieving problem thoroughly. In practice, the sieving problem is generally solved with a reliable local virus detection engine (hereinafter, the engine). That is, the engine filters samples against known rules and known samples to detect known malicious samples. Samples that the engine has difficulty judging are then examined again by a machine learning method or by manual analysis. In the long term, this existing method for detecting samples is inefficient. Therefore, malicious samples that are difficult to judge should be analyzed by scientific and suitable technical means to infer malicious code rules for optimizing the existing engine. It can be understood that the more malicious code rules the engine includes, the higher its detection efficiency. Furthermore, well-defined malicious code rules used to optimize an existing engine should satisfy both of the following conditions: (1) a low false alarm rate (i.e., the extracted rules should not be too broad); and (2) a high coverage rate (i.e., the extracted rules should completely cover the suspected samples).
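The two conditions can be made concrete with a small sketch. It is illustrative only and is not taken from the engine described here: it assumes that each sample is represented as a set of feature strings and that a rule matches a sample when every feature in the rule is present in the sample. The names `matches`, `false_alarm_rate`, and `coverage_rate` are hypothetical.

```python
from typing import FrozenSet, Sequence

# Assumption: a sample and a rule are both sets of feature strings
# (for example, permission or API names). This representation is hypothetical.
FeatureSet = FrozenSet[str]


def matches(rule: FeatureSet, sample: FeatureSet) -> bool:
    """A rule matches a sample when all of the rule's features appear in it."""
    return rule <= sample


def false_alarm_rate(rule: FeatureSet, benign: Sequence[FeatureSet]) -> float:
    """Fraction of known-benign samples that the rule wrongly matches."""
    if not benign:
        return 0.0
    return sum(matches(rule, s) for s in benign) / len(benign)


def coverage_rate(rule: FeatureSet, suspected: Sequence[FeatureSet]) -> float:
    """Fraction of suspected samples that the rule matches."""
    if not suspected:
        return 0.0
    return sum(matches(rule, s) for s in suspected) / len(suspected)
```

Under these assumptions, a rule that is too broad (few features) tends to raise the false alarm rate, while a rule that is too narrow (many features) tends to lower the coverage rate, which is the trade-off that conditions (1) and (2) describe.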