Information is conveniently stored in repositories such as databases. Various applications access and use the information stored in such databases. The value and importance of information to modern enterprises cannot be overstated. Securing information is a serious modern endeavor, and database security is a major concern. However, databases and the applications that access information therein have become targets for malicious attacks such as “hacking,” which can compromise information by destruction, damage and/or misappropriation of data. Unfortunately, a large and developing array of attack techniques and modalities exists.
When a user of a client browser makes an input to a web application, the web application may generate one or more statements containing SQL (Structured Query Language), PL/SQL (Procedural Language/SQL) code or the like that proffers a query to a database. A database server accesses the information from the database and returns information to the web application, which returns the information to the client. The client browser then displays or otherwise presents the requested information to the user. It is noteworthy that the information queried from the database and presented to the user typically depends on the user input in that different input values may result in different information returned.
Modern, well-crafted, robust web applications are typically designed and configured to resist attacks from hackers, whose attacks may be aimed to corrupt, damage, destroy, and/or compromise information stored in a database. However, not all web applications in use are modern, robust or well crafted. Older web applications are still in use. Some older applications were designed and configured without regard to attack modalities that did not exist or which were not well known or especially troublesome at the time the web application was deployed.
Further, some more modern web applications may lack one or more features that would otherwise harden the applications against attack. For instance, applications prepared in ad hoc efforts, inartfully or amateurishly, and those developed under time constraints or low budgets, may be particularly vulnerable in this sense. Moreover, even modern, robust and well-crafted web applications may suffer design flaws and configuration errors, which can leave the applications vulnerable in similar ways.
Stored information can be corrupted or damaged by a hacker making unauthorized, typically incorrect, random, or nonsensical changes to stored data values or similarly flawed entries to data fields. Stored information can be destroyed by a hacker deleting tables, object classes or rows and objects contained respectively therein. Stored information can be compromised by a hacker gaining unauthorized access to data that are supposed to be confidential, private, and protected from disclosure.
Such attacks take many forms, but one particular attack modality that has been used successfully by hackers is the so-called SQL injection attack.
SQL injection refers to a class of security-related vulnerabilities in the data access layer of an application. It is an instance of a more general class of vulnerabilities that can arise where one scripting or programming language is embedded within another. Successful SQL injection allows hackers to execute queries in SQL, PL/SQL or the like that are unintended by the application and that give the hacker unauthorized access to stored data.
SQL injection attacks are typically launched using cleverly modified input strings to the application. A vulnerable application is compromised by manipulative exploitation of these relatively simple, but potentially pernicious, strings. SQL injection vulnerability is associated with inadequate filtering of input strings for string literal escape characters, combined with the (full or partial) inclusion of those input strings in SQL statements proffered to the database. The result is the unexpected execution of SQL that has been modified by user input that is not strongly typed.
Client/Server/Database Operation
Many modern web applications function with a three-tier configuration. Web applications frequently comprise a layer of Java or related code that is hosted by a middle tier web application server. Web applications receive inputs from web browsers, which are hosted by front-end tier client computers. In response to a browser request for a particular web page or other document, a web application accesses or retrieves the document and returns the document to the browser, which renders the webpage and displays the page at the client.
Frequently, information that the web application accesses and passes back to the browser associated with the document is stored in a relational or other database in a back-end tier, from which the information is retrieved. The database accesses the information by executing a statement in SQL or similar database query language. The database tier returns the information to the web application, which may format or otherwise process the information before sending it to the browser for display. Under some circumstances, the web application accepts input from the browser and, upon processing the input, uses some or all of the input to transform or modify the SQL query proffered to the database. For simplicity and brevity of explanation, the term SQL will be used herein, although it should be understood that references herein to SQL are examples that are intended to refer to PL/SQL and other database languages.
Information exchange in this three-tier environment may be explained with the following example. Using a client browser, the user opens a webpage associated with a hypothetical book-vending enterprise, “http://www.bookelephant.com,” and uses a search feature associated therewith to find books by a certain author, such as “Beresniewicz.” A web server may be associated with “www.bookelephant.com.” The web server may host a web application associated with the enterprise. The search feature may include a data entry field displayed by the browser as the webpage is rendered. The browser accepts textual user inputs therewith. The webpage sends the inputs to the web application encoded as XML (Extensible Markup Language) or similar data transfer encoding. Upon receipt of the input from the browser, the web application transforms the encoded input:
Books/Beresniewicz/Search
into a SQL statement. For instance, the example encoded input above may be transformed into a SQL query as shown below:
select titles from books where author = ‘Beresniewicz’;
The “single quote” (i.e., apostrophe) marks around the input data “Beresniewicz” in the query denote that this query token is a string literal. The web application places these single quote marks around the input data (“Beresniewicz”) as part of transforming the input into the SQL statement proffered to the database. From one or more databases associated with “www.bookelephant.com,” the titles of books that are stored therein are retrieved in response to the query and supplied to the web application. The web application returns the titles to the browser in HTML (Hypertext Markup Language) or similar code. The browser renders the HTML with a new or revised webpage. The browser displays the webpage for the user. However, the way the web application transforms user-entered request inputs into SQL queries in this example leaves the web application vulnerable to a SQL injection attack.
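The transformation described above can be sketched in a few lines. The following is a minimal illustration only, written in Python for convenience. The function name build_search_query is hypothetical and is not taken from any actual application. It shows an application that simply places single quote marks around the raw input and appends the result to the statement text.

```python
def build_search_query(author: str) -> str:
    """Hypothetical, deliberately naive transform of user input into SQL.

    The raw input is wrapped in single quotes and appended to the
    statement text without any validation or escaping.
    """
    return "select titles from books where author = '" + author + "';"


# The search term from the example yields the query shown in the text.
query = build_search_query("Beresniewicz")
print(query)
```

Because the input becomes part of the statement text itself, any quote mark contained in the input can change where the string literal ends.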
SQL Injection Attacks
SQL injection attacks target the web application's transformation of all or part of user request inputs (e.g., search query terms) into SQL database queries. A hacker's clever manipulation and use of various strings in inputs to a web application can cause the web application's transformation thereof to generate queries that are not intended by the web application. When executed by a database server, such queries can return data that the hacker lacks authorization to access. The hacker essentially exploits the transform to execute potentially pernicious and/or nefarious SQL statements.
A hacker can exploit the query structure resulting from the web application's transform of the user input. For example, a hacker may modify the author's name “Beresniewicz” to read as shown below.
Beresniewicz‘||’ union select password from security_table
In this example, the hacker cleverly inserts the single quote mark followed by the concatenation symbol (||), followed by another single quote mark and the phrase “union select password from security_table.” Unfortunately, upon receiving and processing this input, the vulnerable web application transforms this input request into a SQL statement that includes a pernicious and unintended query such as the example shown below.
select titles from books where author = ‘Beresniewicz’ union select password from security_table;
What the hacker has accomplished here is to “inject” SQL code within the browser input that was sent to the vulnerable web application. This injected SQL essentially “tricks” the web application into generating a SQL query that the web application is not intended to generate. The “trick” exploits the fact that the application simply concatenates single quotes around the input data and appends the resulting string literal token to the SQL statement without further validation of the input data. Such simplistic transformations of input data into SQL statements are unfortunately a common form of SQL injection vulnerability. Ominously in this example, SQL injection enables the hacker to access private passwords (e.g., of other users) from a security table in the database being queried. This is private, sensitive, and potentially damaging information to which the hacker is not entitled. Moreover, there is possibly no evidence that the sensitive, private data has been compromised. This dearth of evidence that the attack has occurred can be equally ominous.
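The effect of such an injection can be reproduced against a small in-memory database. The sketch below is illustrative only and rests on several assumptions: it uses Python's standard sqlite3 module rather than any particular production database, the table contents are made up, and the attack string is a variant chosen so that the resulting statement parses under SQLite. It is not the exact string discussed above.

```python
import sqlite3


def build_search_query(author: str) -> str:
    # Hypothetical naive transform: quotes are concatenated around raw input.
    return "select titles from books where author = '" + author + "';"


def run_search(conn: sqlite3.Connection, author: str) -> list:
    # Execute the generated statement and return the first column of each row.
    return [row[0] for row in conn.execute(build_search_query(author))]


# Made-up schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("create table books (titles text, author text)")
conn.execute("create table security_table (password text)")
conn.execute("insert into books values ('Example Title', 'Beresniewicz')")
conn.execute("insert into security_table values ('s3cret')")

# An ordinary search returns only book titles.
benign = run_search(conn, "Beresniewicz")

# The injected quote closes the string literal early; the trailing
# "where 'a' = 'a" absorbs the closing quote that the application appends.
attack_input = "Beresniewicz' union select password from security_table where 'a' = 'a"
leaked = run_search(conn, attack_input)

print(benign)
print(sorted(leaked))
```

In this sketch the second search returns the stored password alongside the book title, and nothing in the returned rows distinguishes the two.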
Detecting Vulnerable Applications
In the example above, the SQL injection attack string was added to the end of valid input (“Beresniewicz”). However, other SQL strings can be injected at other parts of the input, as well. It is appreciated that not every such injection attack will allow successful access to unauthorized data sought by the hacker. Frequently (perhaps more often than not) a hacker will receive an “Error” message in response to a particular input. The hacker may essentially have to guess at the workings of a suspected vulnerable web application, specifically to determine how input data is transformed into SQL statements and whether these transforms can be exploited by injection attacks. From parsing some such error statements generated by one or more input attack strings and strategies, however, the hacker may glean clues for refining the injection attack. In this way, the hacker may eventually succeed. A successful SQL injection attack is called an exploit.
Various efforts are made to harden modern web applications against such attacks and to prevent vulnerabilities to exploits. For instance, more carefully designed applications may require that statements associate user input data with bind variables in SQL statements. Because the input data is then passed to the database as a value and does not alter the text of the resulting query, unintended SQL cannot be injected through it. As discussed above, however, not all of the millions of existing web applications are so hardened.
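The use of bind variables can be sketched as follows. This is a minimal illustration, assuming Python's standard sqlite3 module, in which the "?" placeholder plays the role of a bind variable. The function name search_titles and the table contents are hypothetical.

```python
import sqlite3


def search_titles(conn: sqlite3.Connection, author: str) -> list:
    # The statement text is fixed; the input is supplied separately as a
    # bound value, so it is treated as data and never parsed as SQL.
    cursor = conn.execute("select titles from books where author = ?", (author,))
    return [row[0] for row in cursor]


# Made-up schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("create table books (titles text, author text)")
conn.execute("create table security_table (password text)")
conn.execute("insert into books values ('Example Title', 'Beresniewicz')")
conn.execute("insert into security_table values ('s3cret')")

normal = search_titles(conn, "Beresniewicz")

# The same kind of attack string is now compared literally against the
# author column and matches no rows.
attack_input = "Beresniewicz' union select password from security_table where 'a' = 'a"
blocked = search_titles(conn, attack_input)

print(normal, blocked)
```

With the bound form, the attack string is simply an author name that does not exist in the table.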
Furthermore, detecting vulnerabilities in web applications poses a number of challenges. Confirming the vulnerability of a particular web application can involve interpretation of webpage response information that is very specific to the application. This can be difficult to automate and error prone as error messages can vary widely from application to application and even within an application from input to input. Large web applications can be associated with large numbers of web pages. Testing such large applications for SQL injection vulnerability can therefore be tedious, error prone, time consuming and expensive. Moreover, web pages may respond in variable ways, depending upon the particular inputs they may receive. Current techniques exist that automate SQL injection vulnerability testing from the input side (e.g., by automatically injecting common attack strings for data inputs at the front end). However, the variability in page response possibilities means that it is difficult to ascertain with these techniques, under all possible input circumstances, whether SQL injection vulnerability exists or not.
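Input-side testing of the kind mentioned above might be sketched as follows. This is a simplified illustration under stated assumptions: the probe function, the list of attack strings, and the deliberately vulnerable search function are all hypothetical, and a real tester would have to interpret rendered web pages rather than Python exceptions.

```python
import sqlite3

# A few commonly tried inputs (illustrative, not exhaustive).
ATTACK_STRINGS = [
    "'",
    "' or '1'='1",
    "x' union select password from security_table where 'a' = 'a",
]


def vulnerable_search(conn: sqlite3.Connection, author: str) -> list:
    # Hypothetical application code that concatenates raw input into SQL.
    sql = "select titles from books where author = '" + author + "';"
    return [row[0] for row in conn.execute(sql)]


def probe(search, conn: sqlite3.Connection, inputs: list) -> dict:
    """Send each input to the search function and record the response."""
    findings = {}
    for text in inputs:
        try:
            rows = search(conn, text)
            findings[text] = f"ok: {len(rows)} row(s)"
        except sqlite3.Error as exc:
            # The wording of error messages varies by database and application.
            findings[text] = f"error: {exc}"
    return findings


# Made-up schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("create table books (titles text, author text)")
conn.execute("create table security_table (password text)")
conn.execute("insert into books values ('Example Title', 'Beresniewicz')")
conn.execute("insert into security_table values ('s3cret')")

findings = probe(vulnerable_search, conn, ATTACK_STRINGS)
for text, outcome in findings.items():
    print(repr(text), "->", outcome)
```

Even in this small sketch, deciding from the recorded responses alone whether an input actually exposed protected data requires knowledge of what the page was expected to return, which illustrates why such testing is difficult to automate reliably.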
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
Based on the foregoing, it could be useful to reduce the tedium, error probability and time and expense involved in detecting SQL injection vulnerability in web applications.