The Internet is by far the largest and most extensive publicly available network of interconnected computer networks, which transmit data by packet switching using the standardized Internet Protocol (IP) and many other protocols. The Internet has become an extremely popular source of virtually all kinds of information. Increasingly sophisticated computers, software, and networking technology have made Internet access relatively straightforward for end users. Applications such as electronic mail, online chat, and web clients allow users to access and exchange information almost instantaneously.
The World Wide Web (WWW) is one of the most popular means of retrieving information over the Internet. The WWW can handle many types of data stored on computers, and is used with an Internet connection and a Web client. The WWW is made up of millions of interconnected pages or documents which can be displayed on a computer or other interface. Each page may have connections to other pages which may be stored on any computer connected to the Internet. The Uniform Resource Identifier (URI) is the identifying system of the WWW, and a URI typically consists of three parts: the transfer format (also known as the protocol type), the host name of the machine which holds the file (also referred to as the web server name), and the path name to the file. Such URIs are also referred to as Uniform Resource Locators (URLs). The transfer format for standard web pages is Hypertext Transfer Protocol (HTTP). Hypertext Markup Language (HTML) is a method of encoding the information so it can be displayed on a variety of devices.
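The three-part structure of a URI described above can be illustrated with a minimal sketch using Python's standard `urllib.parse` module. The function name `split_uri`, the dictionary keys, and the example address are assumptions chosen for illustration only.

```python
from urllib.parse import urlsplit


def split_uri(uri: str) -> dict:
    """Split a URI into the three parts named in the text.

    The key names are illustrative, not standard terminology.
    """
    parts = urlsplit(uri)
    return {
        "transfer_format": parts.scheme,  # protocol type, e.g. "http"
        "host": parts.hostname,           # web server name
        "path": parts.path,               # path name to the file
    }


# Hypothetical example address.
example = split_uri("http://www.example.com/docs/index.html")
print(example)
```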
Web applications are engines that create Web pages from application logic, stored data, and user input. Web applications often preserve user state across sessions. Web applications do not require software to be installed in the client environment. Web applications use standard Web browser components to view pages built on the server side. Web applications can also deliver services through programmatic interfaces such as Software Development Kits (SDKs).
HTTP is the underlying transactional protocol for transferring files (text, graphic images, sound, video, and other multimedia files) between web clients and servers. HTTP defines how messages are formatted and transmitted, and what actions web servers and web client browsers should take in response to various commands. A web browser, acting as an HTTP client, typically initiates a request by establishing a TCP/IP connection to a particular port on a remote host. An HTTP server monitoring that port waits for the client to send a request string. Upon receiving the request string (and message, if any), the server may complete the protocol by sending back a response string and a message of its own, in the form of the requested file, an error message, or any other information. The HTTP server can take the form of a Web server with gateway components to process requests. A gateway is a custom web server module or plug-in created to process requests, and generally is the first point of contact for a web application. The term “gateway” is intended to include any gateway known to a person skilled in the art, for example, CGI, ISAPI for a web server, a web server module, or a servlet.
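The request and response strings described above can be sketched without any network connection. The following is a simplified illustration, not a conforming HTTP implementation: the helper names `build_request` and `handle_request`, the in-memory `FILES` table, and the host name are all assumptions made for this example.

```python
def build_request(method: str, host: str, path: str) -> str:
    """Format the request string a client would send to the server."""
    return f"{method} {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


def handle_request(request: str, files: dict) -> str:
    """Return a response string for the request, as a server might.

    `files` maps path names to file contents (a stand-in for real storage).
    """
    request_line = request.split("\r\n", 1)[0]
    method, path, _version = request_line.split(" ")
    if method != "GET":
        status, body = "405 Method Not Allowed", "method not allowed"
    elif path in files:
        status, body = "200 OK", files[path]
    else:
        status, body = "404 Not Found", "not found"
    length = len(body.encode("utf-8"))
    return f"HTTP/1.1 {status}\r\nContent-Length: {length}\r\n\r\n{body}"


# Hypothetical file table and host name.
FILES = {"/index.html": "<html><body>Hello</body></html>"}
request = build_request("GET", "www.example.com", "/index.html")
response = handle_request(request, FILES)
print(response.split("\r\n", 1)[0])
```

The response carries either the requested file or an error message, matching the two outcomes named in the paragraph above.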
Web pages regularly refer to pages on other servers, whose selection elicits additional transfer requests. When the browser user requests a file, either by “opening” a Web file by typing in a Uniform Resource Locator (URL) or by clicking on a hypertext link, the browser builds an HTTP request. In actual applications, Web clients may need to be distinguished and authenticated, or a session which holds state across a plurality of HTTP requests may need to be maintained by using a piece of “state” called a cookie.
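One common way to maintain a session with a cookie can be sketched with Python's standard `http.cookies` module. The cookie name `session_id`, the in-memory `sessions` table, and both helper functions are assumptions for illustration; a production system would also handle expiry and secure transport.

```python
import secrets
from http.cookies import SimpleCookie

# In-memory session store (illustrative only): session id -> session state.
sessions = {}


def issue_session_cookie(user: str) -> str:
    """Create a session and return the value for a Set-Cookie header."""
    session_id = secrets.token_hex(16)
    sessions[session_id] = {"user": user}
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True
    return cookie["session_id"].OutputString()


def lookup_session(cookie_header: str):
    """Recover the session state from the Cookie header of a later request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return sessions.get(morsel.value) if morsel else None


set_cookie = issue_session_cookie("alice")
# The client echoes back only the name=value pair on later requests.
cookie_header = set_cookie.split(";", 1)[0]
print(lookup_session(cookie_header))
```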
Web applications incur a security risk by accepting user input in their application logic. To reduce this risk, a Web application firewall validates input to a Web application before it is used in application logic. A Web application entry point, for example an application firewall, typically examines each incoming request, applies generic security rules, and rejects requests that fail to comply with these rules. A security rule can, for example, reject a value longer than 256 characters for a title parameter. Web application validation rules are tied to a specific Web application. A Web application firewall will be used by many components of a Web application, each component having its corresponding set of validation rules.
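The example rule above (rejecting a title value longer than 256 characters) can be sketched as follows. The class `MaxLengthRule`, the function `firewall_accepts`, and the representation of request parameters as a dictionary are assumptions for illustration, not a description of any particular firewall product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaxLengthRule:
    """Generic security rule: a parameter value must not exceed max_length."""
    parameter: str
    max_length: int

    def check(self, params: dict) -> bool:
        # A missing parameter is treated as an empty value in this sketch.
        value = params.get(self.parameter, "")
        return len(value) <= self.max_length


def firewall_accepts(params: dict, rules: list) -> bool:
    """Accept the request only if it complies with every rule."""
    return all(rule.check(params) for rule in rules)


# The rule from the text: reject a title longer than 256 characters.
RULES = [MaxLengthRule("title", 256)]

print(firewall_accepts({"title": "Quarterly report"}, RULES))
print(firewall_accepts({"title": "x" * 257}, RULES))
```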
Dynamic content may be provided by a service-oriented architecture (SOA) for distributed computing. In general, SOA integrates distributed applications in the Internet which are loosely coupled and highly interoperable. SOA provides an interface describing a collection of network-accessible operations and encapsulating the vendor- and language-specific implementation. An SOA is independent of development technology; for example, a C# service could be used by an application written in a language other than C#. Web services as defined in SOA work with other web services in an interoperable manner to carry out their respective parts of a complex business transaction. For example, completing a purchase order may require automated interaction between an order placement service and an order fulfillment service.
One of the convenient, and therefore widely used, methods for accessing information over the Internet is to access it over the WWW using web portals. Web portals offer a structured approach to providing personalized capabilities to visitors, e.g. by subject (category) and then sub-category.
Web portals serve as a common access point to various distributed applications that are related, for example, to programs of a business. Web portals use distributed applications, different numbers and types of middleware, and hardware to provide services from a number of different sources. Web portals may be delivered in a hypertext markup language document (i.e., an HTML web page) over a public network such as the Internet. For example, Web portals may be accessible at a single uniform resource locator (URL) address and integrate disparate but related databases and systems of a business. Web portals can therefore deliver customized content within a standard template using a common user interface mechanism.
Web portals usually aggregate dynamic Web content, which is processed and generated by portlets. Many portlets may therefore be invoked in a single request for a portal page. Each portlet obtains a fragment of the content that is to be rendered as part of the portal page. Portlets therefore summarize, promote, or provide basic access to an information resource for a group of users who find business value in the information. Access may also include secure resources through integrated portal authentication and single sign-on.
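The aggregation of portlet fragments into one portal page can be sketched as below. This is a simplified model that treats each portlet as a plain function returning an HTML fragment; the portlet names, the request dictionary, and `render_portal_page` are hypothetical and do not follow any specific portlet specification.

```python
from html import escape
from typing import Callable, Dict, List

# In this sketch a portlet is any function that maps a request to a fragment.
Portlet = Callable[[Dict[str, str]], str]


def news_portlet(request: Dict[str, str]) -> str:
    return "<div class='portlet'>Top headlines</div>"


def greeting_portlet(request: Dict[str, str]) -> str:
    # User-supplied values are escaped before being placed in markup.
    user = escape(request.get("user", "guest"))
    return f"<div class='portlet'>Welcome, {user}</div>"


def render_portal_page(request: Dict[str, str], portlets: List[Portlet]) -> str:
    """Invoke every portlet for a single page request and join the fragments."""
    fragments = [portlet(request) for portlet in portlets]
    return "<html><body>\n" + "\n".join(fragments) + "\n</body></html>"


page = render_portal_page({"user": "alice"}, [news_portlet, greeting_portlet])
print(page)
```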
U.S. Pat. No. 6,985,939 describes a portlet model leveraged to allow programmatic portlets to serve as proxies for web services, thereby extending portlets beyond their visual role. U.S. Pat. No. 6,985,939 further describes a deployment interface and a system interface for the portlet proxies. The deployment interface is used for composing new web services. The system interface allows for run-time management of the web services by the portal platform.
US Application 20060041637 describes a portal server which uses a reverse proxy mechanism for proxying Web applications on a backend server in response to a request for Web content from a user. The reverse proxy mechanism has a portlet, a set of configuration rules, and a rewriting mechanism. The rewriting mechanism is configured to forward a user request for Web content to a Web application on the backend server, receive a response from the Web application, and rewrite the received response in accordance with the configuration rules. The portlet is configured to produce a content fragment for a portal page from the rewritten response. The configuration rules include rules for rewriting any resource addresses, such as URLs, appearing in the received response from the Web application to point to the portal server rather than to the backend server. A separate backend server is behind a firewall and the reverse proxy function of the portlet allows a user to access the Web application on the portal server, without the need to allow the user to have direct access to the backend server and backend application which provide the actual content.
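The kind of address rewriting described above, in which resource addresses in a backend response are changed to point at the portal server, might look roughly like the following sketch. It is not taken from the cited application: the function `rewrite_response`, the attribute list, and both base addresses are assumptions, and a regular expression is used only to keep the example short.

```python
import re

# Matches double-quoted href, src, and action attributes (a simplification).
ATTR_URL = re.compile(r'(?P<attr>\b(?:href|src|action)=")(?P<url>[^"]*)"')


def rewrite_response(html: str, backend_base: str, portal_base: str) -> str:
    """Point backend resource addresses at the portal server instead."""
    def _rewrite(match):
        url = match.group("url")
        if url.startswith(backend_base):
            url = portal_base + url[len(backend_base):]
        return f'{match.group("attr")}{url}"'

    return ATTR_URL.sub(_rewrite, html)


# Hypothetical addresses for the backend server and the portal server.
BACKEND = "http://backend.internal"
PORTAL = "https://portal.example.com/proxy"

original = (
    '<a href="http://backend.internal/app/page">Next</a>'
    '<img src="http://other.example/logo.png">'
)
rewritten = rewrite_response(original, BACKEND, PORTAL)
print(rewritten)
```

Addresses that do not point at the backend server are left unchanged, so only traffic for the proxied application is routed through the portal.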
Custom portlets may be created by any application or information resource provider, based on published specifications. A Web application supporting custom portlets needs to allow data of these portlets to pass through its entry point, for example an application firewall, without compromising the security of its other components.
Therefore, there is a need for a method and apparatus that allows requests targeted at specific components, for example, custom portlets, to bypass validation in a secure manner.