1. Field of the Invention
The present invention relates generally to data transfer between computer resources and more particularly to data transfer between computer systems that communicate by means of a network.
2. Description of Related Art
All publications, patents and patent applications cited within this application are herein incorporated by reference in their entirety to the same extent as if the disclosure of each individual publication, patent application or patent was specifically and individually indicated to be incorporated by reference in its entirety.
In recent years the popularity of computers, and of the communication networks established between them, has increased dramatically. Such networks allow computer users, whether in a business, government or personal setting, to communicate with each other, either through a centralized communication point, through a plurality of distributed and redundant communication points, or directly. This allows exchange of information between the computers on the communication network, using a common communication protocol between them. It is common for corporations or businesses to establish a common communications network between their computers, otherwise referred to as an “intranet”, in which the communication network provides limited or no access to unauthorized persons and/or computers. It is common for intranets to be protected by security systems, such as firewalls, which prevent unauthorized users from accessing the communications network, the computers communicating through it, and the information contained within these computers.
The term “Internet” has been adopted to describe the publicly available network which has nearly worldwide coverage, and to which most personal computers have access. The pervasive nature of the Internet, combined with the lower cost and increased performance of personal computers, has led to it being a popular source of information. Systems are available which provide an individual with the ability to search for information or resources within the Internet. For example, systems exist which allow a user to search for information stored on other Internet computers (i.e., servers), thus providing generalized access to these resources. Unfortunately, when an individual is searching for specific information, the resource on the Internet may not provide the specific information desired by the individual, or else it may provide certain information in an undesired context. The individual may then continue searching, or else use an alternate system to perform the required searching activities. In general, these searching systems provide minimal ability for a user to provide feedback as to the success of the search, or ways for the user to refine future searches. Generally, the user establishes a series of search terms to initiate a search, and upon failure of the search results to provide the user with what he is looking for, the user modifies or adds further search terms in an effort to increase the chance of success on the next search. Alternatively, the user may switch to an alternate search system and attempt to obtain a successful search result using that second system.
Computers communicate within a network using a common set of standards for exchanging data. One common example is the Transmission Control Protocol/Internet Protocol (TCP/IP) suite. To initiate communications within the communication network, a user (client) may contact another computer on the network (server) and request information or a resource. This is facilitated by various software and hardware systems generally available. A user can access resources within the Internet by being directed through software (e.g., by clicking a hyperlink), by entering a Uniform Resource Locator (URL), etc.
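By way of illustration only, the following minimal Python sketch shows how a client might decompose a URL into the parts needed to contact a server and request a resource. The function names `describe_url` and `build_get_request` and the address `www.example.com` are hypothetical and are not taken from any system described herein; the request text is built but not sent.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs to reach a server."""
    parts = urlsplit(url)
    default_port = 443 if parts.scheme == "https" else 80
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port or default_port,
        "path": parts.path or "/",
        "query": parts.query,
    }


def build_get_request(url: str) -> str:
    """Build (but do not send) the text of a simple HTTP/1.1 GET request."""
    info = describe_url(url)
    target = info["path"] + ("?" + info["query"] if info["query"] else "")
    default_port = 443 if info["scheme"] == "https" else 80
    # The Host header carries the port only when it is not the default one.
    host = info["host"]
    if info["port"] != default_port:
        host = f"{host}:{info['port']}"
    return f"GET {target} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"


if __name__ == "__main__":
    # Hypothetical address used purely for illustration.
    print(describe_url("http://www.example.com/docs/index.html?q=search"))
    print(build_get_request("http://www.example.com:8080/docs/index.html"))
```

In this sketch the client (the caller) derives the server's host and port from the URL and then names the requested resource by its path, mirroring the client/server exchange described above.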
A popular protocol for organizing and sharing information on the Internet via the client/server model is known as the HyperText Transfer Protocol (HTTP), and the body of resources shared by means of it is more commonly referred to in a general sense as the World Wide Web (the web). Generally, the web links information by associating items of interest through the use of HyperText Markup Language (HTML) files, which reside on servers and usually are transferred to clients via HTTP. A user of the web may traverse it by receiving and viewing an HTML file (or just an image, video, etc.), which may contain within it information or embedded images, but which also may contain information on how to acquire further resources from the web, by, for example, incorporating URLs within the file. This information may be displayed to a user as a combination of text and media (for example images, sound, video) and generally is referred to as a “page” or “web page.” Generally, the user uses a client, called a web browser, to interact with the web and the various files found on it (e.g., HTML, audio and video files, etc.).
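As a hedged illustration of how an HTML file may carry information on how to acquire further resources, the following Python sketch collects the URLs embedded in a page and resolves them against the address of the page. The class name `LinkExtractor`, the function `extract_links`, and the sample page and addresses are assumptions made for this example only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the URLs of hyperlinks and embedded images in an HTML file."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            is_link = tag == "a" and name == "href"
            is_image = tag == "img" and name == "src"
            if value and (is_link or is_image):
                # Resolve relative references against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list:
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    # Hypothetical page content and address, for illustration only.
    page = (
        '<html><body><a href="/about.html">About</a>'
        '<img src="logo.png">'
        '<a href="http://other.example.org/">Other</a></body></html>'
    )
    for url in extract_links(page, "http://www.example.com/docs/index.html"):
        print(url)
```

A web browser performs a comparable resolution step when a user clicks a hyperlink or when it fetches the images embedded in a page.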
No central authority exists for cataloguing the hundreds of millions of network resources, such as HTML pages, files or media available within an intranet or the Internet. In general though, there are two approaches taken for finding information or resources of interest within a network: 1) a directory hierarchy and 2) a search engine.
Within a directory hierarchy a web page may be analyzed and categorized, allowing users to scan through various categories, and associated subcategories, to identify resources of interest. Alternatively, a search engine may provide a dataset of terms and phrases (keywords) upon which a user may query, and may return a listing of web resources associated with the keywords. Many such search engines are known in the art, with examples including, but not limited to, Google®, Yahoo® and Alta Vista®.
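The directory approach may be pictured, purely as an assumed sketch, as a tree of nested categories whose leaves list resources. The category names, the addresses, and the function `browse` below are hypothetical.

```python
# Hypothetical directory: nested categories; each leaf is a list of URLs.
DIRECTORY = {
    "Sports": {
        "Football": ["http://team.example/"],
    },
    "Science": {
        "Biology": {
            "Marine Mammals": ["http://aquarium.example/dolphins"],
        },
    },
}


def browse(path):
    """Return the subcategories (sorted) or the resources found at a path."""
    node = DIRECTORY
    for category in path:
        node = node[category]
    return node if isinstance(node, list) else sorted(node)


if __name__ == "__main__":
    print(browse([]))                      # top-level categories
    print(browse(["Science", "Biology"]))  # subcategories
    print(browse(["Sports", "Football"]))  # resources in a leaf category
```

A user scans downward one category at a time, in contrast to the keyword query of a search engine.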
A search engine generally includes two main parts: an index searcher and an index generator. An index searcher may include a database of indexing keywords of web pages and logic for searching the database. An index generator may include a “spider” for gathering web pages and an “indexer” for generating an index into those pages. Typically, a search engine works by sending out the spider to fetch web pages (by, for example, following the various links that exist on an initial set of web pages). The indexer may then read these pages and create an index based on the words contained in each page. Search engines typically use a proprietary algorithm to create their indices such that, ideally, only meaningful results are returned for each query. Provided with a page by a spider, an indexer may parse the document and insert selected keywords into the database with references back to the original location of the source page. How this is accomplished depends on the indexer. Some indexers index the titles of the web pages or just the first few paragraphs. Some parse the entire contents and index all words. Some parse available meta-tags or other special hidden tags. Meta-tags are special HTML tags that are meant to provide information about a web page. Unlike normal HTML tags, meta-tags do not affect how the page is displayed. Instead, they provide information such as who created the page, how often it is updated, what the page is about, and which keywords represent the page's content. Many search engines use this information when building their indices.
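The division of labor among spider, indexer and index searcher may be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular search engine: the "web" is a hypothetical in-memory dictionary (`PAGES`) rather than pages fetched over a network, every word and every keyword meta-tag entry is indexed, and the names `PageParser`, `crawl_and_index` and `search` are invented for the example.

```python
import re
from collections import defaultdict, deque
from html.parser import HTMLParser

# Hypothetical in-memory "web" (URL -> HTML); a real spider would fetch pages.
PAGES = {
    "http://a.example/": (
        '<html><head><title>Miami Dolphins</title>'
        '<meta name="keywords" content="football, team"></head>'
        '<body>Schedule <a href="http://b.example/">aquarium</a></body></html>'
    ),
    "http://b.example/": (
        '<html><head><title>Miami aquarium</title></head>'
        '<body>See dolphins swim. <a href="http://a.example/">football</a>'
        '</body></html>'
    ),
}


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


class PageParser(HTMLParser):
    """Collect words (visible text and keyword meta-tags) and outgoing links."""

    def __init__(self):
        super().__init__()
        self.words, self.links = [], []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            self.words += tokenize(attrs.get("content") or "")

    def handle_data(self, data):
        self.words += tokenize(data)


def crawl_and_index(seed):
    """Spider: follow links breadth-first. Indexer: map each word to its URLs."""
    index = defaultdict(set)
    queue, seen = deque([seed]), {seed}
    while queue:
        url = queue.popleft()
        html = PAGES.get(url)
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)
        for word in parser.words:
            index[word].add(url)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def search(index, *keywords):
    """Index searcher: return the URLs that contain every keyword."""
    hits = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*hits)) if hits else []


if __name__ == "__main__":
    idx = crawl_and_index("http://a.example/")
    print(search(idx, "Miami", "dolphins"))
    print(search(idx, "team"))
```

Here the word "team" is found only through the keyword meta-tag of the first page, which illustrates why some indexers parse meta-tags in addition to visible text.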
A common problem for these search engines is that they are, by necessity, automated. As such, the vagaries of human language may result in search results that are not always relevant to the query. For example, searching upon the keywords of “Miami” and “dolphins” may return web resources relevant to both a professional football team based in Florida, as well as aquatic mammals on display within the Miami locale. Further, automated search engines generally are poorly constructed to translate the context of web resources into a form searchable by keywords. For example, a user searching for information regarding a consumer product is likely to receive web resources related to an individual consumer's experience with the product in addition to web resources which enable one to purchase the product. Finally, the relevance of any given web resource returned in response to a search engine query may be based upon a multitude of different factors, such as the number of web pages which refer to a given web resource, the number of times a given keyword appears within the text of a web resource, whether a person or corporation has paid the provider of the search engine to receive more favorable treatment, etc. Therefore significant effort may be required of the user in order to obtain relevant and preferred information via a search engine.
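To make the relevance factors concrete, the following sketch combines two of them, keyword frequency and the number of referring pages, into a single score. The corpus, the link counts, the weight `link_weight`, and the scoring formula are all hypothetical; actual search engines use proprietary algorithms that are not described here.

```python
from collections import Counter

# Hypothetical toy corpus and inbound-link counts, for illustration only.
DOCS = {
    "team-page": "miami dolphins football schedule dolphins tickets",
    "aquarium-page": "miami aquarium dolphins show",
    "review-page": "my experience buying dolphins tickets in miami",
}
INBOUND_LINKS = {"team-page": 3, "aquarium-page": 1, "review-page": 0}


def score(doc_id, keywords, link_weight=0.5):
    """Keyword frequency plus a weighted count of referring pages."""
    counts = Counter(DOCS[doc_id].split())
    if any(counts[k] == 0 for k in keywords):
        return 0.0  # the page must contain every keyword
    term_score = sum(counts[k] for k in keywords)
    return term_score + link_weight * INBOUND_LINKS.get(doc_id, 0)


def rank(keywords, link_weight=0.5):
    """Return matching pages, highest score first (ties broken by name)."""
    keywords = [k.lower() for k in keywords]
    scored = [(score(d, keywords, link_weight), d) for d in DOCS]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [d for s, d in ordered if s > 0]


if __name__ == "__main__":
    print(rank(["Miami", "dolphins"]))
```

All three hypothetical pages match both keywords, although one concerns a football team, one an aquarium, and one a consumer's experience; the score orders them but does not capture the context the user intended.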
Furthermore, the Internet has voluminous resources and information sources available to it, yet the ability of an individual user to communicate or interact with a web resource generally is limited to that which the creator of the web resource allows. A user is limited in his ability to share a web resource with, or direct to it, persons whom he knows or with whom he shares a common interest; generally, he may either post a reference to the web resource on another web resource accessed by those persons, or pass the URL to specific users or computers by direct communication, such as by electronic mail.
Many of the computers used today are capable of multi-tasking, and further provide a variety of user interfaces for controlling various and multiple application programs or system functions simultaneously operating in the computer environment. Personal Computers (“PC”) are particularly commonplace, operating with an operating system (“OS”) capable of multi-tasking such as Microsoft Windows™, Apple Computer's MacOS™, or LINUX™. Smaller computing platforms such as hand-held computers, personal digital assistants (“PDA”), and advanced wireless telephones may run operating systems capable of multi-tasking as well.
Users often wish to copy or transfer information or “content” from one program or system function within an OS environment to another. Using “copy and paste” functions of the application programs and the operating system, the user may select information from a source program (e.g., a Web Browser receiving and displaying information received over the Internet), and “paste” it into the destination program (e.g., a text editing program or document creation program). The copy and paste process is described more fully in U.S. patent application Ser. No. 12/192,391, hereby incorporated by reference, in its entirety, including figures, into the present patent application.
With respect to the accessing of information through a network, for example the Internet, it is a problem in the present state of the art that people who publish content (text, images, audio, etc.) accessible within a network can easily have their content copied without their knowledge or authorization. The very functionality of copy and paste within an OS makes this easy in the digital world. Industry observers sometimes refer to this as ‘atomization’ of content.
Tools exist to help content publishers find when their content has been copied and posted on other websites or blogs. However, no tools exist to help content owners learn who is using simple cut and paste functions on a PC to copy data from their website into products such as e-mail, Microsoft Word™, PowerPoint™ or other programs or system functions. It is currently impossible for publishers to monitor this cutting and pasting process because they have no ability to include attribution with the copied content. With monitoring and tracking, it is possible that publishers of content may be better able to monetize the copying and usage of their published content.