There is often a need for two users at remote locations, on two separate computing devices, to view the same content simultaneously, for example to facilitate a demonstration or to provide support to a customer. This is commonly known as “screen-sharing” or “co-browsing.” In some cases, co-browsing can be accomplished by taking screenshots of the content shown at a customer device and delivering the images to a customer service device, where they are displayed. Where web technologies that include a Document Object Model (DOM) are used, a co-browsing system can instead operate by duplicating, on the customer service device, the contents of the DOM displayed by the customer device. As discussed herein, aspects of the presently disclosed technology can implement a DOM-based co-browsing system. In some embodiments, the disclosed technology can function even where the DOM displayed on the customer device contains elements embedded in Shadow DOMs, as discussed herein.
The architecture of the World Wide Web (Web) follows a conventional client-server model. The terms “client” and “server” refer to a computer's general role as a requester of data (the client) or a provider of data (the server). In the Web environment, Web browsers reside on clients and specially formatted “Web documents” reside on Web servers. Web clients and Web servers communicate using a protocol called the Hypertext Transfer Protocol (HTTP). In operation, a browser opens a connection to a server and initiates a request for a document or a Web page that includes content. The server delivers the requested document or Web page, typically coded in the standard Hypertext Markup Language (HTML) format. After the document or Web page is delivered, the connection is closed and the browser displays the document or Web page to the user.
To define the addresses of resources on the Internet, a Uniform Resource Locator system was created that uses a Uniform Resource Locator (URL) as a descriptor that specifically defines a type of Internet resource and its location. URLs have the following format: “resource-type://domain.address/path-name.” The “resource-type” defines the type of Internet resource. Web documents, for example, are identified by the resource type “http”, which indicates the protocol used to access the document.
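By way of illustration only, the URL format described above can be split into its three parts as in the following minimal sketch. The function name `splitUrl`, the `UrlParts` type, and the example address are hypothetical and are not part of the disclosure; the sketch handles only the simplified “resource-type://domain.address/path-name” form and is not a complete URL parser.

```typescript
// Illustrative sketch only: handles the simplified form
// "resource-type://domain.address/path-name" and nothing more.
interface UrlParts {
  resourceType: string;  // e.g. "http"
  domainAddress: string; // e.g. "www.example.com"
  pathName: string;      // e.g. "/docs/index.html"
}

function splitUrl(url: string): UrlParts | null {
  // Group 1: resource type, group 2: domain address, group 3: optional path.
  const match = /^([A-Za-z][A-Za-z0-9+.-]*):\/\/([^\/]+)(\/.*)?$/.exec(url);
  if (match === null) {
    return null; // the input does not follow the simplified format
  }
  return {
    resourceType: match[1],
    domainAddress: match[2],
    // When no path is given, fall back to the root path.
    pathName: match[3] !== undefined ? match[3] : "/",
  };
}

// Hypothetical example address, used for illustration only.
const exampleParts = splitUrl("http://www.example.com/docs/index.html");
```

In this sketch, `exampleParts` holds the resource type “http”, the domain address “www.example.com”, and the path name “/docs/index.html”. A practical implementation would more likely rely on a browser's built-in URL handling than on a hand-written pattern.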
To access a document on the Web, the user enters a URL for the Web document into a browser program executing on a client system with a connection to the Internet. The Web browser then uses the URL to send a request, in accordance with HTTP, to the Web server that hosts the Web document. The Web server responds to the request by transmitting the requested object to the client. In most cases, the object is a plain text document containing HTML.
Once a web browser receives the HTML, it renders the HTML for presentation to the user. The HTML may reference other resources necessary to render the page, such as images, video, and other media, as well as style information and script code. The style information can comprise a separate Cascading Style Sheet (CSS) document that contains information about how the HTML document should be styled, or the style information can be directly embedded in the HTML code. Similarly, the script code can be embedded in the HTML document, or sent as a separate file, such as a JavaScript file. The browser then downloads all the resources referenced by the HTML and renders the web page.
Modern websites frequently leverage HTML, CSS, and JavaScript to create highly dynamic content, commonly referred to as a “Web Application.” The HTML defines the overall structure of the webpage, and the CSS defines the style information for how the document is displayed. JavaScript can provide interactivity and can dynamically change the contents of the webpage. However, there is no discernible line between “web pages” and “web applications,” and the two terms are used interchangeably herein.
Disclosed herein is a method for co-browsing that takes advantage of the DOM used in web technologies. That is, web applications are rendered by browsers as a tree of elements. These elements can contain text, tables, images, and other content. Modern web applications can change the document viewed by a user by adding, updating, or deleting elements from the DOM. In a co-browsing context, while the view seen by the customer device and the customer service device will initially be the same for a given web request, this dynamic content can cause the content shown at the customer device to differ from the content shown at the customer service device.
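The divergence described above can be illustrated with a minimal sketch that models the DOM as a plain tree of elements. The names `ElementNode`, `cloneTree`, `serialize`, `customerView`, and `agentView`, and the page content, are hypothetical and chosen only for illustration; the sketch uses ordinary data structures rather than a browser's DOM interfaces, and it is not the disclosed synchronization method.

```typescript
// Simplified stand-in for a DOM element: a tag, some text, and child elements.
interface ElementNode {
  tag: string;
  text: string;
  children: ElementNode[];
}

// Deep-copies a tree, as when the same page is first loaded on a second device.
function cloneTree(node: ElementNode): ElementNode {
  return { tag: node.tag, text: node.text, children: node.children.map(cloneTree) };
}

// Flattens a tree to a string so that two views can be compared.
function serialize(node: ElementNode): string {
  const inner = node.text + node.children.map(serialize).join("");
  return "<" + node.tag + ">" + inner + "</" + node.tag + ">";
}

// Both devices start from the same response to the same web request.
const customerView: ElementNode = {
  tag: "body",
  text: "",
  children: [
    { tag: "h1", text: "Cart", children: [] },
    { tag: "p", text: "0 items", children: [] },
  ],
};
const agentView: ElementNode = cloneTree(customerView);
const sameAtLoad = serialize(customerView) === serialize(agentView);

// Script running only on the customer device updates, adds, and deletes elements.
customerView.children[1].text = "1 item";                                      // update
customerView.children.push({ tag: "button", text: "Checkout", children: [] }); // add
customerView.children.splice(0, 1);                                            // delete the <h1>
const sameAfterScript = serialize(customerView) === serialize(agentView);
```

In this sketch the two views match immediately after loading, and no longer match once the customer-side tree has been changed, which is why a DOM-based co-browsing system needs some way to carry such changes to the customer service device.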
Further, modern application software can take advantage of these technologies as well. For example, the Electron™ framework allows desktop applications to be built using the same technologies. While some embodiments of the present technology include co-browsing web applications retrieved from the Internet, other embodiments can be used to co-browse a desktop application built on web technologies, so long as the desktop software includes a component that implements a data structure similar to the DOM. As used herein, the terms “Web Page” and “Web Application” include applications that implement a data structure similar to the DOM.
As web applications grow in size and complexity, it can be useful to decompose a web application into smaller discrete components. Decomposed web applications can be easier to maintain, can make it easier to divide responsibilities across engineering teams, and can facilitate re-use of common components. Further, modern HTML standards often allow for tag types that require custom implementations by browsers, such as a <video> tag to display a video. To facilitate this decomposition, modern web browsers can implement a set of standards called “Web Components” that allow developers to write reusable components to build web pages and web applications. To provide backward compatibility, libraries also exist to provide Web Components services on otherwise incompatible browsers through the use of polyfills, which are software libraries that emulate the functionality of features unavailable on certain browsers.
Web Components includes a feature called a “Shadow DOM.” A Shadow DOM is a type of sub-document that can be encapsulated and attached to an element of an HTML page. In general, two types of Shadow DOMs exist: “open” and “closed.” Open Shadow DOMs are accessible to web applications, which can view, edit, and delete any element within the Shadow DOM, as well as the Shadow DOM itself. Closed Shadow DOMs are not accessible to web applications, but are viewable by a user. A web application searching for closed Shadow DOMs will see neither the existence of the Shadow DOM itself nor any of the elements within the Shadow DOM. However, the “open” and “closed” behavior of Shadow DOMs is implemented by a browser, and a browser can be configured to modify the behavior of Shadow DOMs to, for example, allow web applications to access closed Shadow DOMs.
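The distinction between open and closed Shadow DOMs can be modeled, for illustration only, with the following minimal sketch. The class `ModelElement` and its methods are hypothetical stand-ins written with plain data structures; they are not a browser's actual interfaces, although in a browser the script-visible accessor corresponds roughly to an element's `shadowRoot` property, which is null for a closed Shadow DOM.

```typescript
type ShadowMode = "open" | "closed";

interface ModelShadowRoot {
  mode: ShadowMode;
  children: string[]; // tag names only, for brevity
}

// Hypothetical model of a host element; not a real browser class.
class ModelElement {
  tag: string;
  private attached: ModelShadowRoot | null = null;

  constructor(tag: string) {
    this.tag = tag;
  }

  // Attaches an encapsulated sub-document to this element.
  attachShadow(mode: ShadowMode, children: string[]): void {
    if (this.attached !== null) {
      throw new Error("a shadow root is already attached");
    }
    this.attached = { mode: mode, children: children.slice() };
  }

  // What web application script can reach: only an open shadow root.
  scriptVisibleShadowRoot(): ModelShadowRoot | null {
    if (this.attached !== null && this.attached.mode === "open") {
      return this.attached;
    }
    return null;
  }

  // What is rendered for the user: shadow content regardless of mode.
  renderedShadowChildren(): string[] {
    return this.attached !== null ? this.attached.children.slice() : [];
  }
}

const openHost = new ModelElement("my-widget");
openHost.attachShadow("open", ["button"]);

const closedHost = new ModelElement("my-player");
closedHost.attachShadow("closed", ["play-button"]);
```

In this model, `openHost.scriptVisibleShadowRoot()` returns the attached sub-document, while `closedHost.scriptVisibleShadowRoot()` returns null even though `closedHost.renderedShadowChildren()` still reports the content shown to the user.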
The existence of Shadow DOMs can add complexity to methods for co-browsing. In particular, automatic traversal of the DOM using standard tools will frequently not identify the existence of a Shadow DOM, or the elements contained therein. Further, due to the dynamic nature of web applications, methods are necessary to identify changes to the DOM rendered by the web browser in order to synchronize the web page rendered at the customer device and the customer service device. While mutation observer software elements can detect changes to the primary DOM, they frequently are unable to identify changes in the Shadow DOM, leading to differences between the web application displayed at the customer device and the web application displayed at the customer service device.
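The traversal problem described above can be illustrated with a minimal sketch that assumes a simplified tree in which an element may carry both ordinary children and a separate list of shadow children. The names `PageNode`, `shadowChildren`, and `collectTags`, and the page content, are hypothetical; the sketch shows one way a traversal could be made aware of shadow content and is not presented as the disclosed method.

```typescript
// Simplified element: ordinary children plus an optional attached shadow subtree.
interface PageNode {
  tag: string;
  children: PageNode[];
  shadowChildren?: PageNode[];
}

// Walks the tree and lists tag names. When includeShadow is false, the walk
// follows only ordinary children, as a standard traversal would.
function collectTags(node: PageNode, includeShadow: boolean): string[] {
  let tags: string[] = [node.tag];
  if (includeShadow && node.shadowChildren !== undefined) {
    for (const child of node.shadowChildren) {
      tags = tags.concat(collectTags(child, includeShadow));
    }
  }
  for (const child of node.children) {
    tags = tags.concat(collectTags(child, includeShadow));
  }
  return tags;
}

// Hypothetical page: a custom element whose visible content lives in its shadow subtree.
const page: PageNode = {
  tag: "body",
  children: [
    { tag: "h1", children: [] },
    {
      tag: "my-widget",
      children: [],
      shadowChildren: [
        { tag: "button", children: [] },
        { tag: "span", children: [] },
      ],
    },
  ],
};

const standardWalk = collectTags(page, false);
const shadowAwareWalk = collectTags(page, true);
```

Here `standardWalk` lists only the body, h1, and my-widget elements, whereas `shadowAwareWalk` also lists the button and span elements inside the shadow subtree. A copy of the page built from the standard walk alone would therefore omit content that the customer can see.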