The present invention relates to the field of web server vulnerability assessments and, more specifically, to an integrated crawl and auditing process for web servers and web applications.
In the world of high-tech electronics and computer systems, as in almost every consumer electronics device, the key marketing thrust is “make it smaller”. Thus, the electronic products available to us are constantly shrinking in size. However, two aspects of the high-tech industry are not only refusing to shrink but are actually growing at quite a rapid rate: memory capacities and the size of software programs and data. Fortunately, the physical sizes of memory devices are shrinking. A 600 Gigabyte drive would have been quite a daunting sight 15 years ago.
And what is all this memory being used for? A good portion of it is being consumed by increasingly sophisticated and complex web sites. The typical 1-2 Megabyte, limited-page web site is being replaced by huge, intricate and detailed web sites full of web applications, data stores, information and the like.
Unfortunately, the free exchange of information, so easily facilitated by personal computers over the Internet, has spawned a variety of risks for the organizations that host that information. This threat is most prevalent in interactive applications hosted on the World Wide Web and accessible by almost any personal computer located anywhere in the world. Web applications can take many forms: an informational Web site, an intranet, an extranet, an e-commerce Web site, an exchange, a search engine, a transaction engine, or an e-business. These applications are typically linked to computer systems that contain weaknesses that can pose risks to a company. Weaknesses can exist in system architecture, system configuration, application design, implementation configuration, and operations. The risks include the possibility of incorrect calculations, damaged hardware and software, data accessed by unauthorized users, data theft or loss, misuse of the system, and disrupted business operations.
As the digital enterprise embraces the benefits of e-business, the use of Web-based technology will continue to grow. Corporations today use the Web as a way to manage their customer relationships, enhance their supply chain operations, expand into new markets, and deploy new products and services to customers and employees. However, successfully implementing the powerful benefits of Web-based technologies can be greatly impeded without a consistent approach to Web application security.
It may surprise industry outsiders to learn that hackers routinely attack almost every commercial Web site, from large consumer e-commerce sites and portals to government agencies such as NASA and the CIA. In the past, the majority of security breaches occurred at the network layer of corporate systems. Today, however, hackers are manipulating Web applications inside the corporate firewall, enabling them to access and sabotage corporate and customer data. Given even a tiny hole in a company's Web-application code, an experienced intruder armed with only a Web browser (and a little determination) can break into most commercial Web sites.
The problem is much greater than industry watchdogs realize. Many U.S. businesses do not even monitor online activities at the Web application level. This lack of monitoring allows attempted attacks to go unnoticed. It puts the company in a reactive security posture, in which nothing gets fixed until after a breach occurs. Reactive security could mean sacrificing sensitive data as a catalyst for policy change.
A new level of security breach has begun to occur through continuously open Internet ports (port 80 for general Web traffic and port 443 for encrypted traffic). Because these ports are open to all incoming Internet traffic from the outside, they are gateways through which hackers can access secure files and proprietary corporate and customer data. While rogue hackers make the news, there exists a much more likely threat in the form of online theft, terrorism, and espionage.
Today the hackers are one step ahead of the enterprise. While corporations rush to develop their security policies and implement even a basic security foundation, the professional hacker continues to find new ways to attack. Most hackers are using “out-of-the-box” security holes to gain escalated privileges or execute commands on a company's server. Simple misconfigurations of off-the-shelf Web applications leave gaping security vulnerabilities in an unsuspecting company's Web site.
Passwords, SSL and data-encryption, firewalls, and standard scanning programs may not be enough. Passwords can be cracked. Most encryption protects only data transmission; however, the majority of Web application data is stored in a readable form. Firewalls have openings. Scanning programs generally check networks for known vulnerabilities on standard servers and applications, not proprietary applications and custom Web pages and scripts.
Programmers typically don't develop Web applications with security in mind. What's more, most companies continue to outsource the majority of their Web site or Web application development using third-party development resources. Whether these development groups are individuals or consultancies, the fact is that most programmers are focused on the “feature and function” side of the development plan and assume that security is embedded into the coding practices. However, these third-party development resources typically do not have even core security expertise. They also have certain objectives, such as rapid development schedules, that do not lend themselves to the security scrutiny required to implement a “safe solution.”
Manipulating a Web application is simple. It is often relatively easy for a hacker to find and change hidden fields that indicate a product price. Using a similar technique, a hacker can also change the parameters of a Common Gateway Interface (CGI) script to search for a password file instead of a product price. If some components of a Web application are not integrated and configured correctly, such as search functionality, the site could be subject to buffer-overflow attacks that could grant a hacker access to administrative pages. Today's Web-application coding practices largely ignore some of the most basic security measures required to keep a company and its data safe from unauthorized access.
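The hidden-field weakness described above can be illustrated with a minimal Python sketch. The catalog, the field names (`sku`, `price`, `quantity`) and both functions are hypothetical and are introduced only for illustration; they are not taken from any particular application. The sketch contrasts a handler that trusts a price carried in a hidden form field with one that looks the price up on the server.

```python
# Hypothetical server-side price list; the SKU and price are illustrative only.
CATALOG = {"sku-1001": 49.99}


def total_trusting_client(form):
    """Compute the order total from the price submitted in the form.

    The price arrives in a hidden field, which the client can edit freely.
    """
    return float(form["price"]) * int(form["quantity"])


def total_server_side(form):
    """Compute the order total from the server's own price list.

    The client-supplied price is ignored entirely.
    """
    return CATALOG[form["sku"]] * int(form["quantity"])


# A submission in which the hidden price field was changed from 49.99 to 0.01.
tampered = {"sku": "sku-1001", "price": "0.01", "quantity": "2"}

trusting_total = total_trusting_client(tampered)
server_total = total_server_side(tampered)
```

Under these assumptions the first function accepts the altered price, while the second is unaffected by it because the price never leaves the server.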
Developers and security professionals must be able to detect holes in both standard and proprietary applications. They can then evaluate the severity of the security holes and propose prioritized solutions, enabling an organization to protect existing applications and implement new software quickly. A typical process involves evaluating all applications on Web-connected devices, examining each line of application logic for existing and potential security vulnerabilities.
A Web application attack typically involves five phases: port scans for default pages, information gathering about server type and application logic, systematic testing of application functions, planning the attack, and launching the attack. The results of the attack could be lost data, content manipulation, or even theft and loss of customers.
A hacker can employ numerous techniques to exploit a Web application. Some examples include parameter manipulation, forced parameters, cookie tampering, common file queries, use of known exploits, directory enumeration, Web server testing, link traversal, path truncation, session hijacking, hidden Web paths, Java applet reverse engineering, backup checking, extension checking, parameter passing, cross-site scripting, and SQL injection.
Assessment tools provide a detailed analysis of Web application and site vulnerabilities. FIG. 1 is a system diagram of a typical structure for an assessment tool. Through the Web Assessment Interface 100, the user designates which application, site or Web service, resident on a web server or destination system 110 available over network 120, is to be analyzed. The user selects the type of assessment, selects which policy to use, enters the URL, and then starts the process.
The assessment tool uses software agents 130 to conduct the vulnerability assessment. The software agents 130 are composed of sophisticated sets of heuristics that enable the tool to apply intelligent application-level vulnerability checks and to accurately identify security issues while minimizing false positives. The tool begins the crawl phase of the application using software agents to dynamically catalog all areas. As these agents complete their assessment, findings are reported back to the main security engine through assessment database 140 so that the results can be analyzed. The tool then enters an audit phase by launching other software agents that evaluate the gathered information and apply attack algorithms to determine the presence and severity of vulnerabilities. The tool then correlates the results and presents them in an easy-to-understand format through the reporting interface 150.
However, Web sites that extend beyond the rudimentary level of complexity, in which HTML is simply rendered by a browser, can include a variety of sophisticated elements such as Java code, applets, Web applications, etc. The traditional approach of crawling through the HTML of a Web site is limited in the amount of information that can be obtained and analyzed. For instance, a Web site may include a PDF file that includes, within the text of the PDF file, additional links. The traditional Web crawler technology may obtain the link to the PDF file during the crawling phase of the attack, but the links embedded within the PDF file would be ignored during the second phase of the attack.
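The limitation described above can be sketched in Python with the standard-library `html.parser` module. The in-memory `SITE` table, its paths, and the plain-text stand-in for the PDF body are all hypothetical; a real PDF is a binary format and would need its own parser. The sketch only shows that a crawler that follows HTML anchors alone finds the PDF but never reaches the page referenced inside it.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_html_links(body):
    parser = LinkExtractor()
    parser.feed(body)
    return parser.links


# Hypothetical site: path -> (content type, body). The PDF body is simplified
# to plain text that mentions a further page.
SITE = {
    "/index.html": ("text/html",
                    '<a href="/about.html">About</a> <a href="/manual.pdf">Manual</a>'),
    "/about.html": ("text/html", "<p>No links here.</p>"),
    "/manual.pdf": ("application/pdf", "See /hidden/admin.html for details"),
    "/hidden/admin.html": ("text/html", "<p>Admin</p>"),
}


def html_only_crawl(start):
    """Breadth-first crawl that extracts links from HTML responses only."""
    seen, queue = [], [start]
    while queue:
        path = queue.pop(0)
        if path in seen or path not in SITE:
            continue
        seen.append(path)
        content_type, body = SITE[path]
        if content_type == "text/html":  # non-HTML bodies are not inspected
            queue.extend(extract_html_links(body))
    return seen


discovered = html_only_crawl("/index.html")
```

In this sketch `/manual.pdf` is discovered because an HTML anchor points to it, but `/hidden/admin.html` is not, since the only reference to it sits inside the PDF content.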
FIG. 2 is a block diagram showing the flow of operations for a prior art system that conducts a two-phased vulnerability assessment including a crawling phase and an auditing phase. Initially, a crawler 210 is configured 201 to initiate the crawling phase of the assessment. Once configured, the crawler 210 begins making discovery requests 202 to the web server 200. Each request results in a response 203 which is then stored into database 230. Feedback 204 may be provided to the crawler 210 to further configure or augment the operation of the crawler 210. Thus, the crawling phase consists of multiple trips through the process identified as Loop 1 which consists of multiple sessions, where each session includes a discovery request 202 followed by a response 203 and possible feedback 204.
Once the crawling phase is completed, the auditing phase commences. During the auditing phase, the auditor 220 is configured 205 based on data stored in database 230 during the crawling phase. The auditor 220 then makes attack requests 206 against the web server 200. Each attack request results in obtaining a response 207 which is then stored into the database 230. Thus, the auditing phase consists of one or more trips through the process identified as Loop 2 which consists of one or more sessions, where each session includes an attack request 206 followed by a response 207 and further configuration 205 as necessary.
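The two-phased flow of FIG. 2 can be approximated by the following Python sketch. It is a simplified simulation under stated assumptions, not the prior art implementation: `FAKE_SERVER` is a hypothetical in-memory stand-in for web server 200, a plain dictionary stands in for database 230, and the "attack" is merely a harmless placeholder query parameter. The function names and the `probe=1` string are likewise illustrative.

```python
from collections import deque

# Hypothetical stand-in for web server 200: path -> links found in the response.
FAKE_SERVER = {
    "/": ["/login", "/search"],
    "/login": ["/"],
    "/search": ["/search?q=test"],
    "/search?q=test": [],
}


def send_request(path):
    """Simulate one request/response session; unknown paths return no links."""
    return FAKE_SERVER.get(path, [])


def crawl(start, database):
    """Loop 1: discovery request 202, response 203, feedback 204."""
    pending = deque([start])
    while pending:
        path = pending.popleft()
        if path in database["crawl"]:
            continue
        links = send_request(path)        # discovery request and response
        database["crawl"][path] = links   # response stored in the database
        pending.extend(links)             # feedback steers further crawling


def audit(database, probes):
    """Loop 2: configuration 205, attack request 206, response 207."""
    for path in database["crawl"]:        # configured from stored crawl data
        for probe in probes:
            separator = "&" if "?" in path else "?"
            attack_path = f"{path}{separator}{probe}"
            response = send_request(attack_path)
            database["audit"].append((attack_path, response))


def two_phase_assessment(start, probes):
    database = {"crawl": {}, "audit": []}
    crawl(start, database)                # the entire crawl completes first
    sessions_before_first_audit = len(database["crawl"])
    audit(database, probes)               # auditing begins only afterwards
    return database, sessions_before_first_audit


database, waited = two_phase_assessment("/", ["probe=1"])
```

Here `waited` counts the discovery sessions that must finish before the first attack request is sent. In this sketch it equals the number of crawled pages, which is the source of the delay in a strictly sequential design.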
As described in the parent application, the crawling process can be quite intensive and, if a recursive crawl is implemented, the amount of data accumulated during the discovery and response sessions can be quite large. In addition, with the complexity of modern-day web sites, the crawling process can take a considerable amount of time. Thus, with a two-phased approach, a user may have to wait on the order of hours, or even days, before he can start to see results from the auditing process. In addition, for larger web sites, there is a risk of data overflow as the crawler generates excessive amounts of data.
Thus, there is a need in the art for a solution that can provide a more efficient and expedient mechanism to perform a vulnerability assessment and to obtain auditing results during a vulnerability assessment.