There has been dramatic progress in Web applications in the past decade. Web pages have evolved from static HyperText Markup Language (HTML) documents using content from a single site to seamlessly integrating dynamic content (using client-side scripting) from a variety of different Web sites. These so-called Web mashups offer an enriched Web experience to users.
A Web mashup is a website or Web application that seamlessly combines content (such as data and code) from multiple sources (which may even be competing sources) into an integrated experience for a user. For example, a real estate website may combine map data from one website with housing data from another website to present an integrated view of housing prices at various locations on a map. Web mashups may also involve gadgets, which are Web components containing both HTML content and scripting code that can be placed on any page on the Web. Gadget aggregators (such as Microsoft® Windows Live) aggregate gadgets into a single page to provide a desirable, single-stop information presentation to their users.
In a Web mashup application, content from different sources is integrated to achieve the desired functionality. This can be compared to a desktop application built on top of binary components from different vendors. A component is a unit of program structure that encapsulates its implementation behind an interface used to communicate with other components. An interface separates the services that a component agrees to provide to others from the component's actual implementation, and thus enhances reusability. This component-oriented programming has established itself as the predominant software development methodology over the last decade. It breaks a system down into binary components for greater reusability, extensibility, and maintainability. Several component technologies, such as COM/DCOM, CORBA, Java Beans, and .NET, have been used widely to build applications from interchangeable code modules. This promotes “black box reuse”, which allows using an existing component without caring about its internals, as long as the component complies with some predefined set of interfaces. The component-oriented programming paradigm, however, has not been used in Web applications and mashup systems.
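The separation of an interface from its implementation, and the interchangeability it allows, can be illustrated with a minimal sketch. All names here (`MapService`, `VendorAMap`, `VendorBMap`, `registry`, `bind`) are hypothetical and chosen only for illustration; they do not come from any of the component technologies named above.

```typescript
// The interface states the agreed service, independent of any implementation.
interface MapService {
  locate(address: string): string;
}

// Two interchangeable implementations from (hypothetical) different vendors.
class VendorAMap implements MapService {
  locate(address: string): string {
    return `A:${address}`;
  }
}

class VendorBMap implements MapService {
  locate(address: string): string {
    return `B:${address}`;
  }
}

// The client depends only on the interface ("black box reuse").
function describeHome(service: MapService, address: string): string {
  return `Home at ${service.locate(address)}`;
}

// Delayed binding: the implementation is looked up by name at run time.
const registry = new Map<string, () => MapService>([
  ["vendorA", () => new VendorAMap()],
  ["vendorB", () => new VendorBMap()],
]);

function bind(name: string): MapService {
  const factory = registry.get(name);
  if (factory === undefined) {
    throw new Error(`no component registered under "${name}"`);
  }
  return factory();
}
```

Calling `describeHome(bind("vendorA"), "1 Main St")` and `describeHome(bind("vendorB"), "1 Main St")` exercises the same client code against either implementation, which is the property the paragraph above attributes to component-oriented systems.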
Compared to the technologies used in desktop applications, Web applications still lag far behind. Although a Web page can combine content from different sources, the Web is still a monolithic architecture that does not support component-level abstraction. In other words, each functional part is glued statically at implementation time. Current Web standards and browsers allow scripting from other sources to be used and content from different sources to be aggregated, but the implementation is not separated from the services that the implementation provides. Moreover, features that are widely used in component-oriented software development, such as delayed binding and module interchangeability, are not supported by current Web standards and browsers.
Current Web standards and browsers use a binary trust model governed by a Same-Origin Policy (SOP), which prohibits documents or scripts of one origin from accessing documents or scripts of a different origin. Documents or scripts from the same origin can access each other without any restriction. SOP is used to protect against Cross Site Scripting (XSS) attacks. An origin consists of the domain name, protocol, and port. Two Web pages have the same origin if and only if their domain names, protocols, and ports are identical. Each browser window, <frame>, or <iframe> is a separate document. Each document is associated with an origin. An HTML document is accessed through the platform- and language-neutral Document Object Model (DOM) interface. Programs and scripts can use the DOM to dynamically access and update the content, structure, and style of documents. Scripts enclosed by <script> in a document are treated as libraries that can be downloaded from different domains, but they run with the document's origin rather than the origin from which they are downloaded.
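The origin comparison described above can be sketched as follows. The helper name `sameOrigin` is hypothetical, and the sketch assumes the standard `URL` class is available (as it is in browsers and in Node.js); it is an illustration of the rule, not a description of how any particular browser implements SOP.

```typescript
// Hypothetical helper: two URLs share an origin only when the protocol,
// domain name, and port are all identical.
function sameOrigin(a: string, b: string): boolean {
  const ua = new URL(a);
  const ub = new URL(b);
  return (
    ua.protocol === ub.protocol &&
    ua.hostname === ub.hostname &&
    // The URL class reports a scheme's default port as "", so
    // "http://example.com" and "http://example.com:80" compare equal.
    ua.port === ub.port
  );
}
```

Under this rule, pages that differ only in path share an origin, while a change of protocol, sub-domain, or port yields a different origin.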
Another problem is that SOP prevents Web mashup documents from different sources from interacting with each other, thus restricting the functionality that a mashup can deliver. To work around SOP, a proxy server can be used to aggregate the contents from different sources before sending them to the client, so that the mashup contents appear to the browser to come from the same origin. However, one drawback of this approach is that the proxy server can become a bottleneck and unnecessary round trips are required.
Asynchronous JavaScript® and XML (AJAX) has been widely used to provide interactivity through client-side code with minimized impact on network and server performance. AJAX makes client-side mashups popular, since client-side mashups reduce latency and bandwidth use as compared to the proxy approach described above. A client-side mashup includes documents from various sites and makes them interact with each other at the client side. To circumvent SOP, a document in a client-side mashup includes scripts from the target sites in order to achieve cross-domain interactions. However, this requires full trust in those sites, since the included scripts have full access to the host document's resources. SOP's binary trust model forces Web programmers to make tradeoffs between security and functionality. Security is frequently sacrificed for functionality.
Web gadget aggregators enable a user to customize his or her portal page by selecting multiple pieces of third-party content. Each piece of content manifests as a gadget. A gadget in these applications is a separate frame, and SOP isolates one gadget from another as well as from the gadget aggregator. This severely restricts the functionality of a Web mashup. For example, a Web page may contain three gadgets from different origins: (1) a people gadget, which lists people; (2) a weather gadget, which shows a city's weather; and (3) a map gadget, which shows a map. SOP prevents the weather and map gadgets from responding to a click on a person in the people gadget by showing that person's home on the map gadget and the weather at that person's home on the weather gadget. To support this desired functionality, scripts from a different source need to be embedded and granted full trust.
New technologies have been proposed to offer client-side cross-domain communication mechanisms without sacrificing security. These technologies include cross-domain communications for Web mashups using a new type of <module> tag. This new <module> tag partitions a Web page into a collection of modules. A module is isolated, except that JavaScript® Object Notation (JSON) formatted messages may be exchanged between a module and its parent document.
A similar scheme has been proposed for HTML 5 to provide cross-document communications, regardless of whether the documents belong to the same domain. Since documents are arranged in a hierarchical structure, this proposal leverages the current abstraction of a document instead of proposing a new isolation abstraction like the <module>. One problem, however, is that although cross-domain communications are supported in this HTML 5 proposal, the communication receiver has to decide the trustworthiness of the sender by itself. This requires every component to have its own access control system. Furthermore, DOM and JavaScript® resources are shared based on the same-origin policy. Therefore, a separate domain per component is still required.
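The burden this places on each receiver can be illustrated with a minimal sketch. It assumes a message that carries the sender's origin and a payload, as the `origin` and `data` fields of a browser `MessageEvent` do; the `CrossDocMessage` type, the `trustedOrigins` allow-list, the `handleMessage` function, and the example host names are all hypothetical.

```typescript
// Hypothetical shape of an incoming cross-document message.
interface CrossDocMessage {
  origin: string; // origin of the sending document
  data: unknown;  // payload supplied by the sender
}

// Each receiving component has to maintain its own access control list.
const trustedOrigins = new Set<string>([
  "https://maps.example.com",
  "https://weather.example.com",
]);

// Returns the payload as text when the sender is trusted, or null otherwise.
function handleMessage(msg: CrossDocMessage): string | null {
  if (!trustedOrigins.has(msg.origin)) {
    return null; // drop messages from senders this component does not trust
  }
  return typeof msg.data === "string" ? msg.data : JSON.stringify(msg.data);
}
```

Because the check lives inside the receiver, every component in a mashup would need to repeat a list and a test of this kind, which is the duplication the paragraph above points out.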
Adobe's Flash Player framework uses cross-domain policy files to configure and give the Flash Player permission to access data from a given domain without displaying a security dialog. Although this approach provides more flexibility and control than the standard SOP communication model, it depends on a configuration outside the browser, and the service provider cannot distinguish whether the originator of a request comes from the same domain as the provider.
One technique has been proposed that provides a cross-domain communication mechanism without any browser plug-ins or client-side changes. This technique splits a site into sub-domains, using one of them to evaluate scripts from other domains, and another page to hold a notification object. Then the two sub-domain pages relax their domain to a common value to exchange information, and send information back via the held notification object. However, this technique is complex to use (especially for complex mashups), and may not work for certain domains. For example, it is impossible to relax a domain (such as “a.com” or “192.168.0.1”) to create a parallel domain to receive partially trusted information. Thus, this technique does not work in these cases.
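Why hosts such as “a.com” and “192.168.0.1” cannot be relaxed can be seen in a simplified sketch. The function name `relaxedParentDomain` is hypothetical, and the rule it applies is an assumption made for illustration: a host can be relaxed only by dropping its leftmost label, and only if at least two labels remain. Real browsers apply additional rules (for example, for multi-label public suffixes such as “co.uk”) that this sketch ignores.

```typescript
// Hypothetical helper: returns the common parent domain that a host could be
// relaxed to, or null when no relaxation is possible.
function relaxedParentDomain(host: string): string | null {
  // Dotted-quad IPv4 addresses have no parent domain to relax to.
  const isIpv4 = /^\d{1,3}(\.\d{1,3}){3}$/.test(host);
  if (isIpv4) {
    return null;
  }
  const labels = host.split(".");
  // "a.com" would relax to the bare top-level domain "com", which is not allowed.
  if (labels.length < 3) {
    return null;
  }
  return labels.slice(1).join(".");
}
```

Under these assumptions, two sub-domain pages such as “x.a.com” and “y.a.com” can both relax to “a.com”, whereas a page served from “a.com” itself or from an IP address has nothing to relax to.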
Approaches to communicate between <iframe>s by using the fragment identifier of the frame URL have been proposed. Modification of the URL fragment identifier does not reload the page and can be observed by frames from different domains, and thus can be used to transport messages between frames. However, such communication is limited by the size of fragment identifiers (for example, the maximum length of a URL in Internet Explorer is 2,083 characters), and can be overheard by other frames.
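The fragment-identifier transport and its size limit can be sketched as a pair of helpers. The function names and the `MAX_URL_LENGTH` constant are hypothetical; the 2,083-character figure is the Internet Explorer limit cited above. The sketch covers only the encoding and decoding of a message, not the polling a receiving frame would need in order to notice a change.

```typescript
// URL length limit cited in the text for Internet Explorer.
const MAX_URL_LENGTH = 2083;

// Hypothetical sender side: place the message in the fragment of the frame URL.
function encodeFragmentMessage(frameUrl: string, message: string): string {
  const hashAt = frameUrl.indexOf("#");
  const base = hashAt < 0 ? frameUrl : frameUrl.slice(0, hashAt);
  const url = `${base}#${encodeURIComponent(message)}`;
  if (url.length > MAX_URL_LENGTH) {
    throw new Error("message too long to carry in a URL fragment");
  }
  return url;
}

// Hypothetical receiver side: recover the message from the observed URL.
function decodeFragmentMessage(url: string): string {
  const hashAt = url.indexOf("#");
  return hashAt < 0 ? "" : decodeURIComponent(url.slice(hashAt + 1));
}
```

The length check makes the first limitation concrete: the base URL and the percent-encoded message together must fit within the limit, so larger payloads would have to be split across several updates. Nothing in the encoding addresses the second limitation, that other frames can read the same fragment.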
One technique uses a browser plug-in to provide a fine-grained access control on read, write, and traverse actions of the DOM tree of a Web application. In order to safely isolate the DOM sub-tree of each component, policies are associated with parts of the DOM tree inside a Web page, such as defining a policy that only the component and the event hub can access and modify a communication zone between them. One problem, however, is that this technique prevents innocent parts from accessing potentially malicious parts of the DOM tree.
Another technique (called MashupOS) proposes to add several new elements to HTML. Among them, the <Sandbox> and <OpenSandbox> tags are designed to consume unauthorized content without liability or overtrusting. The <ServiceInstance> tag creates an isolated region to hold related memory and network resources. A <ServiceInstance> may also hold multiple display area resources by possessing some <Friv> nodes in the HTML document tree. MashupOS also provides browser-side communication across domains. <ServiceInstance>s may declare ports on which to listen for communication requests. Such a request can be sent from any script block by using a CommRequest object provided by MashupOS. However, this technique lacks desirable features such as support for contract-based channels, or an abstraction of contract-based channels, to promote interchangeability among gadgets and separation of a gadget's implementation from its actual deployment.