The present invention relates generally to Internet usage and, more particularly, to preserving privacy and confidentiality when reporting Internet URLs.
Many Web 2.0 applications at some point report to a third-party server the uniform resource locators (URLs) of web pages that users visit. In particular, a subclass of applications uses the URL as a database key to store metadata associated with the resource, which may be produced in a variety of ways: by users themselves or by application-specific algorithms. When a user requests a URL, a browser extension or a web widget (i.e., a piece of code embedded in the web page) also contacts a server asking for this associated metadata, which is then displayed to the user or used to modify the user experience. Examples of these services are social annotations (e.g., diigo) and PageRank reporting (part of the Google toolbar). In the former example, the metadata (annotations) is produced by users and is intended to be shared among them; in the latter case, the metadata (the PageRank) is produced by Google's algorithms.
To illustrate how such systems may operate, the operation of an online web page annotation application is described below. The application allows users, on the one hand, to annotate any web page on the web, that is, to associate comments, descriptions, observations, assessments, etc. with (parts of) the web page. On the other hand, it allows (possibly different) users to browse web pages and at the same time retrieve all associated annotations, displaying them, e.g., by overlaying them onto the original web page. In this way, and in a typical Web 2.0 spirit, a common base of annotations is created such that users may benefit from each other's annotation efforts.
The application consists of a server part for storing the annotations and two client parts embedded in a browser (as browser extensions or web widgets): one for creating the annotations and one for retrieving and viewing existing annotations. Such an application may operate as follows: A first user who wishes to annotate a particular page having a particular URL loads the page into his browser. The first user then utilizes the browser extension or web widget for creating an annotation and submits the annotation to a server configured to store the annotations. The submission may be in the form [U, context, annotation] where U is the URL for the web page, context contains a permalink to the exact version of the web page and a fragment identifier, and annotation is the annotation provided by the first user. In more detail, “context,” as the term is used herein, refers to the information necessary for re-displaying the annotation under the same conditions in which it was created, such as the URL for the exact version of the web page (permalink) and the refinement of the web page to a particular location on it (fragment identifier). For instance, if a web page includes six paragraphs, the context may point to the particular paragraph to which the annotation refers.
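By way of illustration only, the submission form [U, context, annotation] described above might be represented as follows. This is a minimal sketch; the class names, field names, and example values are hypothetical and are not taken from any particular annotation service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    permalink: str  # URL of the exact version of the page that was annotated
    fragment: str   # fragment identifier locating the annotated part of the page


@dataclass(frozen=True)
class Submission:
    url: str          # U, the URL of the annotated web page
    context: Context  # information needed to re-display the annotation
    annotation: str   # the annotation text supplied by the user


# Hypothetical example: an annotation on the third of six paragraphs.
submission = Submission(
    url="http://example.com/article",
    context=Context(
        permalink="http://example.com/article?version=17",
        fragment="paragraph-3",
    ),
    annotation="This paragraph contradicts the introduction.",
)
```

In this sketch the URL U travels to the server in clear form as part of every submission.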
The server may then store the annotation under a database key that is the same as the URL for the page to which the annotation applies. At a later time, a second user (who may be the same user as the first user or a different user) loads the web page having the URL U and, utilizing a browser extension or web widget, contacts the server to retrieve the annotations associated with the web page. Using the URL for the web page as the key into the database, the server locates the annotations associated with the web page and transmits them to the requester. The annotations, including their contexts, are received by the requester and then displayed on the web page at the indicated context.
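The store-and-retrieve flow described above may be sketched as follows, assuming an in-memory dictionary stands in for the server's database. The class and method names are hypothetical, and the context is reduced to a plain string for brevity.

```python
from collections import defaultdict


class AnnotationServer:
    """Illustrative server that keys annotations directly by page URL."""

    def __init__(self):
        # Maps URL -> list of (context, annotation) pairs.
        self._db = defaultdict(list)

    def store(self, url, context, annotation):
        # The clear-text URL is used as the database key.
        self._db[url].append((context, annotation))

    def retrieve(self, url):
        # The requester must present the same clear-text URL to look up entries.
        return list(self._db.get(url, []))


server = AnnotationServer()

# First user annotates the page having URL U.
server.store("http://example.com/article", "paragraph-3", "Check this claim.")

# Second user later loads the same page and asks for its annotations.
annotations = server.retrieve("http://example.com/article")
```

A request for a URL with no stored annotations simply yields an empty list in this sketch.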
The previous example illustrates how the privacy and confidentiality of both the first and second users are compromised. Because both users had to identify to the server the particular web page they were browsing in order to either store or retrieve the annotations, the server may create a record of the web pages visited by each user; recording a user's browsing history violates user privacy. In addition, such URLs may themselves contain confidential information if they belong to intranet resources, and thus confidentiality may be violated as well. In short, using URLs as keys into the database gives away too much information to the server and may result in privacy and confidentiality concerns.
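The exposure described above can be made concrete with a short sketch of what a server could record as a side effect of ordinary requests. The user identifiers, URLs, and function name below are hypothetical, and the annotation lookup itself is omitted.

```python
from collections import defaultdict

# Hypothetical per-user log that a server could keep without the user's knowledge.
history = defaultdict(list)


def handle_request(user_id, url):
    """Serve a store or retrieve request; the URL arrives in clear form."""
    history[user_id].append(url)  # browsing history accumulates as a by-product
    return []                     # annotation lookup omitted in this sketch


handle_request("alice", "http://example.com/health/condition-x")
handle_request("alice", "http://intranet.example.com/projects/merger-with-acme")
handle_request("bob", "http://example.com/news")

# The server can now list every page "alice" has visited, including an
# intranet URL whose path alone reveals confidential information.
alice_history = history["alice"]
```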
There exists a need, therefore, to allow for all of the functionality of systems as described above without revealing the URL of particular web pages viewed by users.