A regular expression is a concept in computer technology. A regular expression uses a single character string to describe and match a set of character strings that satisfy a certain syntactic rule. In many text editors, regular expressions are used for searching and replacing text that matches a certain pattern.
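As a minimal illustration of searching and replacing by pattern, the following Python sketch uses the standard `re` module. The sample text and the pattern are assumptions chosen for illustration only and do not come from any particular editor or tool.

```python
import re

# Illustrative text and pattern (assumptions, not taken from any tool).
text = "error at 10:42, error at 11:07"
pattern = r"\d{2}:\d{2}"  # two digits, a colon, two digits

# Search: collect every substring that satisfies the pattern.
matches = re.findall(pattern, text)

# Replace: substitute every matching substring with a placeholder.
replaced = re.sub(pattern, "<time>", text)

print(matches)
print(replaced)
```

A single pattern string thus describes a whole family of character strings (here, any two-digit pair separated by a colon) rather than one fixed string.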
Regular expressions are generally generated with a regular expression generation tool. One existing regular expression generation tool is Txt2re. Txt2re provides a plurality of text item selection buttons. When a text item selection button is clicked, the Txt2re tool executes the processing defined by the corresponding rule. Such processing usually includes extracting, from a character string, a partial character string that satisfies the corresponding rule, and generating code corresponding to the regular expression from the extracted partial character string. The specific steps for generating a regular expression for a log by using Txt2re include:
S1: The Txt2re tool receives an input log through a text box. Suppose the log is:
10.200.98.220 - - [28/Jun/2013:14:53:08 +0800] "POST /PutData?Category=YunOsAccountOpLog&AccessKeyId=U0UjpekFQOVJW45A&Date=Fri%2C%2028%20Jun%202013%2006%3A53%3A30%20GMT&Topic=raw&Signature=pD12XYLmGxKQ%2Bmkd6x7hAgQ7b1c%3D HTTP/1.1" 0.024 18204 200 37 "-" "aliyun-sdk-java"
S2: The Txt2re tool traverses each character in the input log, recognizes character strings in the log that satisfy a certain rule, and generates, according to a preset rule, a regular expression mark corresponding to each recognized character string.
For example, the Txt2re tool traverses each character in the input log. After the characters “10.200.98.220” are traversed, the preset rule determines that this character string is an IPv4 address, and a regular expression mark “ip address” representing this field is generated.
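The traversal in step S2 can be sketched roughly as follows. This is not Txt2re's actual implementation, which is not described here. The rule table `PRESET_RULES`, the function `mark_fields`, and the “integer” mark are hypothetical names introduced only for illustration, and the IPv4 pattern is deliberately simplified.

```python
import re

# Hypothetical preset rules as (mark, pattern) pairs. Earlier rules take priority.
# The IPv4 pattern is simplified: it does not check that each octet is <= 255.
PRESET_RULES = [
    ("ip address", re.compile(r"(?:\d{1,3}\.){3}\d{1,3}")),
    ("integer", re.compile(r"\d+")),
]

def mark_fields(log):
    """Scan the log left to right and return (mark, text, start) tuples."""
    marks = []
    pos = 0
    while pos < len(log):
        for mark, pattern in PRESET_RULES:
            m = pattern.match(log, pos)  # try to match a rule at this position
            if m:
                marks.append((mark, m.group(), pos))
                pos = m.end()  # continue after the recognized string
                break
        else:
            pos += 1  # no rule applies here; move to the next character
    return marks

# A shortened stand-in for the log line in step S1.
marks = mark_fields("10.200.98.220 - - 200 37")
print(marks)
```

Under these assumptions, only character strings covered by a rule in the table receive a mark. Any string that no preset rule describes is skipped.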
S3: The Txt2re tool receives an instruction generated by clicking a “show matches” button in an interface and, in response to the instruction, displays the character strings in the log together with the corresponding regular expression marks, and makes each regular expression mark clickable. As shown in FIG. 1, after the “show matches” button is clicked, the Txt2re tool displays an area. In this area, the log content and the regular expression marks corresponding to the recognized character strings in the log content are displayed in two separate lines. In addition, a clickable button is provided on each regular expression mark.
S4: The Txt2re tool receives a click on the button marked “ip address”, processes the corresponding character string “10.200.98.220”, and generates code corresponding to the regular expression of the character string.
Then, an operator may convert the code into the regular expression corresponding to the field, either manually or by using other tools.
Thus, the Txt2re tool provides a plurality of selection buttons for to-be-collected character strings according to preset rules. When the selection button of a certain to-be-collected character string is clicked, Txt2re generates the regular expression corresponding to that character string.
The conventional techniques have at least the following problems:
When the Txt2re tool is used to generate regular expressions in the conventional techniques, the obtainable regular expressions are limited: only regular expressions for character strings that satisfy certain preset rules are provided, and regular expressions cannot be generated according to the needs of users. For example, for the above log, the Txt2re tool provides a selection button for the character string “[28/Jun/2013:14:53:08 +0800]”, which includes the square brackets. However, the square brackets carry no significance. Given the choice, a user would generally select the character string “28/Jun/2013:14:53:08 +0800”, i.e., the user needs a regular expression for the character string without the square brackets. In the conventional techniques, the user cannot make such a selection. The conventional techniques therefore offer poor flexibility in generating regular expressions and cannot satisfy users' needs.
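The difference between the two selections can be illustrated with a short Python sketch. Both patterns are hand-written assumptions for illustration, not output of Txt2re, and the log fragment is a shortened version of the log in step S1.

```python
import re

# Shortened fragment of the example log (request line abbreviated).
log_fragment = '10.200.98.220 - - [28/Jun/2013:14:53:08 +0800] "POST /PutData HTTP/1.1"'

# Pattern covering the whole bracketed field, as a preset rule might select it.
with_brackets = re.compile(
    r"\[\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4}\]"
)

# Same pattern, but the brackets sit outside a capture group, so only the
# timestamp itself is collected.
without_brackets = re.compile(
    r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]"
)

full = with_brackets.search(log_fragment).group(0)
inner = without_brackets.search(log_fragment).group(1)

print(full)   # timestamp with the square brackets
print(inner)  # timestamp without the square brackets
```

The second pattern corresponds to what the user in the example actually wants. A tool that offers only the first selection leaves the user to edit the generated expression by hand.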