1. Field of the Invention
This invention pertains in general to computer security and incident management in database intrusion detection and prevention systems, and more specifically to organizing and aggregating database-specific incidents to improve a user's experience and make a computer system more manageable.
2. Description of the Related Art
Databases are widely used by companies to store both public and private information. Commonly, companies provide interfaces to their databases that are publicly accessible. For example, a website associated with a store might include a field that allows a user to type in a search term and retrieve a list of items of a particular type that are sold by the store (e.g., in a search field for product type, the user might type in “books” to retrieve a list of books sold by the store). This search field is a publicly-accessible interface to a database that resides behind the store's application server, and the database stores data describing the items for sale.
Many of these publicly-accessible databases work by having a web server provide a web browser executing on a client computer with an HTML and/or JavaScript-based form. The web browser displays this form on the client, and the end-user provides values for the fields in the form (e.g., the user inserts a search term into a “search field”). The end-user performs an action, such as pressing a “Submit” button, that causes the web browser to send the entered values to the server. At this point, back-end logic at the server translates this information into one or more queries, built from the user-supplied values, that are issued to the database. Each query executes on the database and the server returns the results to the client web browser (i.e., the user receives the search results).
Databases can suffer malicious attacks in which a malicious query is sent to the database to cause damage in some manner, such as to obtain access to confidential information stored in the database. In an SQL (Structured Query Language) injection attack, the attacker fills out the form using specially-crafted data. These data, when used by the server to generate a query to the database, result in a malicious query being sent to the database on behalf of the attacker. The malicious query executes on the database and results in a malicious action. By using these techniques, the attacker can inject code to obtain access to credit card numbers and other confidential information, modify or delete information on the database, or perform other malicious actions. As used herein, a malicious query can also include any query with potentially anomalous or undesired effects (e.g., a user error in which the user has deleted important data from a database by accident, rather than deleting such data with bad intent).
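The injection mechanism described above can be illustrated with a minimal sketch. It assumes a hypothetical back end that builds its query by string concatenation, an in-memory SQLite database, and an illustrative book_table with made-up rows; none of these details are specified in this document.

```python
import sqlite3


def build_query(search_term: str) -> str:
    # Hypothetical vulnerable back-end logic: the user-supplied value is
    # spliced directly into the SQL text.
    return "select * from book_table where title='" + search_term + "'"


# Illustrative in-memory database (table name and rows are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("create table book_table (title text, price real)")
conn.executemany(
    "insert into book_table values (?, ?)",
    [("sometitle", 9.99), ("othertitle", 19.99)],
)

# A legitimate search term matches only the requested row.
benign_rows = conn.execute(build_query("sometitle")).fetchall()

# Specially-crafted input closes the string literal and appends a condition
# that is always true, so the query returns every row in the table.
crafted_input = "x' or '1'='1"
injected_query = build_query(crafted_input)
injected_rows = conn.execute(injected_query).fetchall()

# For contrast: a parameterized query treats the same input as plain data.
safe_rows = conn.execute(
    "select * from book_table where title=?", (crafted_input,)
).fetchall()
```

Here the crafted input changes the structure of the query itself, not merely the value being searched for, which is what distinguishes it from the queries the application normally issues.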
Database intrusion detection systems (“DIDS”) can help thwart these types of malicious attacks. An anomaly-based DIDS commonly resides between the front-end application and the back-end database. The DIDS can be trained, over the course of several days or weeks, to learn and recognize the set of acceptable queries issued to a database by clients (e.g., applications like PEOPLESOFT® or SAP® (Systems Applications and Products in Data Processing)).
Typically, queries sent to a database are going to appear in the same format over and over again, since the web browser displays the same set of forms with fixed search fields to the end-users. Only the search term entered by the end-user will differ. Thus, databases, unlike many applications, are uniquely suited for anomaly-based intrusion detection, because the set of queries sent by the application to the database is often so consistent over time that it is rare for a query to appear that has not been seen before. The DIDS, once trained to recognize legitimate queries, monitors all incoming queries and reports when a query is sent that fails to match one of the learned queries.
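A rough sketch of this train-then-monitor behavior follows. It is one simple way to model the idea, not a description of any particular DIDS: the fingerprint function, the QueryProfile class, and the regular expressions used to strip literals are all assumptions made for illustration.

```python
import re

# Assumed normalization: string and numeric literals are replaced with "?"
# so that queries differing only in the search term share one fingerprint.
_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")
_NUMBER_LITERAL = re.compile(r"\b\d+(?:\.\d+)?\b")


def fingerprint(query: str) -> str:
    stripped = _STRING_LITERAL.sub("?", query)
    stripped = _NUMBER_LITERAL.sub("?", stripped)
    # Collapse whitespace and case so formatting differences do not matter.
    return " ".join(stripped.lower().split())


class QueryProfile:
    """Hypothetical store of query fingerprints seen during training."""

    def __init__(self) -> None:
        self.learned = set()

    def train(self, query: str) -> None:
        self.learned.add(fingerprint(query))

    def is_anomalous(self, query: str) -> bool:
        return fingerprint(query) not in self.learned


profile = QueryProfile()
profile.train("select * from book_table where title='sometitle'")

# Same form, different search term: recognized as a learned query.
same_form = profile.is_anomalous(
    "select * from book_table where title='othertitle'"
)

# Different structure: reported as anomalous.
new_form = profile.is_anomalous(
    "select * from book_table where title='x' or '1'='1'"
)
```

Because only the literal values are discarded, any query whose structure was not seen during training is reported, whether it comes from an attacker or from a legitimate change to the application.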
Ideally, this DIDS approach should result in few false positives (i.e., reports of anomalous queries that turn out to be legitimate). However, companies do update applications on a regular basis, in many cases changing the logic of the application significantly. These modifications sometimes result in changes to the queries that are issued by the application to the database. If the change occurs after the training period, the anomaly-based DIDS system may report a false positive. If the query is used repeatedly (i.e., if it is a high-volume query that many users are making regularly), this could result in thousands of false positives within minutes of the change to the application.
As an example, consider the following query to a website selling books (e.g., amazon.com):
        select * from book_table where title='sometitle'
Now consider what would happen if the bookseller changed its query, as it evolved from being a bookseller to a seller of many different types of products (including books):
        select * from product_table where product_name='sometitle'
Assuming the database schema was properly changed to support this query and the database repopulated with the proper data, this query would provide the same functionality for the user as the original query, yet it looks entirely different. Consequently, this updated query could be detected as anomalous by a DIDS system. If this query were issued each time a user searched for a book on the bookseller's website, it might be used hundreds or thousands of times per second by users around the world. This would result in a flood of hundreds or thousands of (false positive) incidents being generated per second. Without proper aggregation of these incidents, such a flood would overwhelm the database administrator and make it nearly impossible to manage the DIDS system.
In addition, even in a situation where the detected anomalous queries represent true attack scenarios rather than false positives, it is useful to aggregate the same type of attack into a single incident for management purposes. Rather than receiving a report of each and every attack, a database administrator will receive a single report for a group of similar attacks.
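To make the benefit of aggregation concrete, the following minimal sketch groups anomalous queries that share the same structure into one incident with a count. It is only an illustration under stated assumptions: the signature function, the AggregatedIncident record, and the choice of grouping key are hypothetical and are not taken from this document.

```python
import re
from dataclasses import dataclass

_STRING_LITERAL = re.compile(r"'(?:[^']|'')*'")


def signature(query: str) -> str:
    # Assumed grouping key: the query text with string literals removed.
    return " ".join(_STRING_LITERAL.sub("?", query).lower().split())


@dataclass
class AggregatedIncident:
    signature: str
    first_example: str  # first query seen with this signature
    count: int = 0      # number of anomalous queries folded into it


def aggregate(anomalous_queries):
    """Fold a stream of anomalous queries into one incident per signature."""
    incidents = {}
    for query in anomalous_queries:
        key = signature(query)
        incident = incidents.get(key)
        if incident is None:
            incident = incidents[key] = AggregatedIncident(key, query)
        incident.count += 1
    # Dictionaries preserve insertion order, so incidents come back in the
    # order in which each signature was first seen.
    return list(incidents.values())


# Simulated flood: 1000 searches using the updated query, plus one injection.
flood = [
    f"select * from product_table where product_name='title{i}'"
    for i in range(1000)
]
flood.append("select * from book_table where title='x' or '1'='1'")

report = aggregate(flood)
```

In this simulated flood, the administrator would review two aggregated incidents instead of 1001 individual reports.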
Thus, there is a need in the art for a means to reduce the impact of false positives of all types by aggregating related database intrusion incidents in an intuitive and manageable fashion to reduce and/or eliminate such flood conditions and to make the DIDS system more manageable during updates of enterprise applications and database schema.