All of the publications, patents and patent applications cited within this application are herein incorporated by reference in their entirety to the same extent as if the disclosure of each individual publication, patent application or patent was specifically and individually indicated to be incorporated by reference in its entirety.
In recent years the popularity of computers, and of the communication networks established between these computers, has increased dramatically. Such communications networks allow computer users, whether in a business, government or personal setting, to communicate with each other, either through a centralized communication point, through a plurality of distributed and redundant communication points, or directly. This allows exchange of information between the computers on the communication network, using a common communication protocol between them. It is common for corporations or businesses to establish a common communications network between their computers, otherwise referred to as an “intranet”, to which unauthorized persons and/or computers have limited or no access. It is common for intranets to be protected by security systems, such as firewalls, which prevent access by unauthorized users of the communications network, the computers communicating through it, and the information contained within these computers.
The term “Internet” has been adopted to describe the publicly available network which has nearly worldwide coverage, and to which most personal computers have access. The pervasive nature of the Internet, combined with the lower cost and increased performance of personal computers, has led to it being a popular source of information. Systems are available which provide an individual with the ability to search for information or resources within the Internet. By way of non-limiting example, systems exist which allow a user to search for information stored on other Internet computers (e.g., servers), thus providing generalized access to these resources. Unfortunately, when an individual is searching for specific information, the resource on the Internet may not provide the specific information desired by the individual, or else it may provide certain information in an undesired context. The individual may then continue searching, or else use an alternate system to perform the required searching activities. In general, these searching systems provide minimal ability for a user to provide feedback as to the success of the search, or ways for the user to refine future searches. Generally, the user establishes a series of search terms to initiate a search, and upon failure of the search results to provide the user with what he is looking for, the user modifies or adds further search terms in an effort to increase the chance of success on the next search. Alternatively, the user may switch to an alternate search system and attempt to obtain a successful search result using that second system.
Computers communicate within a network using a common set of standards for exchanging data. One common example is the Transmission Control Protocol/Internet Protocol (“TCP/IP”) suite. To initiate communications within the communication network, a user (referred to also as a client) may contact another computer on the network (e.g., a server) and request information or a resource. This is facilitated by various software and hardware systems generally available. A user can access resources within the Internet by being directed through software (e.g., by clicking a hyperlink), by entering a Uniform Resource Locator (“URL”), etc.
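By way of non-limiting illustration, the following minimal Python sketch shows how a client might break a URL into the parts it needs before contacting a server. The function name `describe_url`, the example address, and the default-port table are assumptions made for this illustration only and are not part of any system described herein.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the pieces a client needs to contact a server."""
    parts = urlsplit(url)
    # Assumed defaults for illustration: the well-known ports for HTTP and HTTPS.
    default_ports = {"http": 80, "https": 443}
    return {
        "scheme": parts.scheme,        # protocol used to reach the resource
        "host": parts.hostname,        # server to contact
        "port": parts.port or default_ports.get(parts.scheme),
        "path": parts.path or "/",     # resource requested on that server
        "query": parts.query,          # optional parameters sent with the request
    }


if __name__ == "__main__":
    print(describe_url("http://www.example.com/docs/index.html?lang=en"))
```

The host and port identify the server to which a TCP/IP connection would be opened, while the path and query identify the resource requested from that server.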
A popular protocol for organizing and sharing information on the Internet via the client/server model is known as the HyperText Transfer Protocol (“HTTP”); the body of resources shared through it is more commonly referred to in a general sense as the World Wide Web (the “web”). Generally, the web links information by associating items of interest through the use of HyperText Markup Language (“HTML”) files, which reside on servers and usually are transferred to clients via HTTP. A user of the web may traverse it by receiving and viewing an HTML file (or just an image, video, etc.), which may contain within it information or embedded images, but which also may contain information on how to acquire further resources from the web, by, for example, incorporating URLs within the file. This information may be displayed to a user as a combination of text and media (for example images, sound, video) and generally is referred to as a “page” or “web page.” Generally, the user uses a client, called a web browser, to interact with the web and the various files found on it (e.g., HTML, audio and video files, etc.). The browser may be implemented through execution of a program operating on a computer, such as a personal computer, cellular telephone or other mobile device.
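By way of non-limiting illustration, the following minimal Python sketch shows how the URLs incorporated within an HTML file might be recovered so that further resources can be acquired. The class `LinkExtractor`, the function `extract_links`, and the example addresses are hypothetical names chosen for this illustration; only hyperlink and image references are handled.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the URLs referenced by hyperlinks and images in an HTML file."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("href"):
            # Resolve relative references against the URL of the page itself.
            self.links.append(urljoin(self.base_url, attr_map["href"]))
        elif tag == "img" and attr_map.get("src"):
            self.links.append(urljoin(self.base_url, attr_map["src"]))


def extract_links(html: str, base_url: str) -> list:
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    page = '<p><a href="/about.html">About</a> <img src="logo.png"></p>'
    print(extract_links(page, "http://example.com/dir/page.html"))
```

A browser performs a comparable resolution step when a user clicks a hyperlink or when an embedded image is fetched for display.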
No central authority exists for cataloguing the hundreds of millions of network resources, such as HTML pages, files or media available within an intranet or the Internet. In general though, there are two approaches taken for finding information or resources of interest within a network: 1) a directory hierarchy and 2) a search engine.
Within a directory hierarchy a web page may be analyzed and categorized, allowing users to scan through various categories, and associated subcategories, to identify resources of interest. Alternatively, a search engine may provide a dataset of terms and phrases (keywords) upon which a user may query, and may return a listing of web resources associated with the keywords. Many such search engines are known in the art, with examples including, but not limited to, Google®, Yahoo® and Alta Vista®.
A search engine generally includes two main parts: an index searcher and an index generator. An index searcher may include a database of indexing keywords of web pages and logic for searching the database. An index generator may include a “spider” for gathering web pages and an “indexer” for generating an index into those pages. Typically, a search engine works by sending out the spider to fetch web pages (by, for example, following the various links that exist on an initial set of web pages). The indexer may then read these pages and create an index based on the words contained in each page. Search engines typically use a proprietary algorithm to create their indices such that, ideally, only meaningful results are returned for each query.
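By way of non-limiting illustration, the spider, indexer and index searcher described above can be sketched in Python as follows. This is a simplified sketch under stated assumptions: the “web” is a hypothetical in-memory dictionary named `PAGES`, the function names are illustrative, and the index is a plain word-to-page mapping rather than any proprietary algorithm used by an actual search engine.

```python
import re
from collections import defaultdict, deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "http://a.example/": ("network search engines index pages",
                          ["http://b.example/"]),
    "http://b.example/": ("a spider gathers pages by following links",
                          ["http://a.example/", "http://c.example/"]),
    "http://c.example/": ("an indexer builds an index of words", []),
}


def crawl(seed_urls, pages):
    """Spider: fetch pages breadth-first by following links from the seeds."""
    seen, queue, fetched = set(seed_urls), deque(seed_urls), {}
    while queue:
        url = queue.popleft()
        if url not in pages:
            continue  # unreachable resource; skip it
        text, links = pages[url]
        fetched[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return fetched


def build_index(fetched):
    """Indexer: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in fetched.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Index searcher: return the pages containing every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))


if __name__ == "__main__":
    index = build_index(crawl(["http://a.example/"], PAGES))
    print(search(index, "index words"))
```

In this sketch a query matches only pages containing all of its words; an actual search engine would typically also rank the matching pages.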
Provided with a page by a spider, an indexer may parse the document and insert selected keywords into the database with references back to the original location of the source page. How this is accomplished depends on the indexer. Some indexers index the titles of the web pages or just the first few paragraphs. Some parse the entire contents and index all words. Some parse available meta-tags or other special hidden tags. Meta-tags are special HTML tags that are meant to provide information about a web page. Unlike normal HTML tags, meta-tags do not affect how the page is displayed. Instead, they provide information such as who created the page, how often it is updated, what the page is about, and which keywords represent the page's content. Many search engines use this information when building their indices.
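By way of non-limiting illustration, the following minimal Python sketch shows how an indexer might read meta-tags from a page. The class `MetaTagParser` and the functions `extract_meta` and `meta_keywords` are hypothetical names chosen for this illustration, and the sketch assumes meta-tags of the common `name`/`content` form.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from the meta-tags of an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        if name and content is not None:
            self.meta[name.lower()] = content


def extract_meta(html: str) -> dict:
    parser = MetaTagParser()
    parser.feed(html)
    parser.close()
    return parser.meta


def meta_keywords(html: str) -> list:
    """Return the comma-separated keywords meta-tag as a list of clean terms."""
    raw = extract_meta(html).get("keywords", "")
    return [term.strip().lower() for term in raw.split(",") if term.strip()]


if __name__ == "__main__":
    page = ('<html><head><meta name="author" content="Jane Doe">'
            '<meta name="keywords" content="Search, Index , spider">'
            '</head><body>Visible text</body></html>')
    print(extract_meta(page)["author"], meta_keywords(page))
```

The keywords recovered in this way could be inserted into the index alongside, or instead of, words taken from the visible text of the page.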
A common problem for publishers of web pages or creators of network resources is that there is a benefit to keeping users within a given web site, or within a collection of web sites under common ownership; generally this is driven by the acquiring of revenue through advertising presented in conjunction with the content present in the network resource. Therefore it is desirable for a web publisher to attempt to deter a user from leaving a particular web site, or collection of web sites; and instead direct the user to a resource within the given web site or a collection of web sites.
A further issue for those providing content within a network, or those who are reviewing content available on a network, is using network content to derive information on trends within a region, culture, geographical location, country or the world in general. The addition of content, or changes to the search terms used by a population, may represent a change in thought, or increased interest in a population on certain issues, information or opinions. This is highly relevant and valuable information and there are advantages for parties who are able to quickly identify trends or changes to a population's interests, thoughts or opinions.
Many of the computers used today are capable of multi-tasking, and further provide a variety of user interfaces for controlling various and multiple application programs or system functions simultaneously operating in the computer environment. Personal Computers (“PC”) are particularly commonplace, operating with an operating system (“OS”) capable of multi-tasking such as Microsoft Windows™, Apple Computer's MacOS™, or LINUX™. Smaller computing platforms such as hand-held computers, personal digital assistants (“PDA”), and advanced wireless telephones may run operating systems capable of multi-tasking as well.
Users often wish to copy or transfer information or “content” from one program or system function within an OS environment to another. Using “copy and paste” functions of the application programs and the operating system, the user may select information from a source program (e.g., a Web Browser receiving and displaying information received over an Internet), and “paste” it into the destination program (e.g., a text editing program or document creation program). The copy and paste process is described more fully in U.S. patent application Ser. No. 12/192,391 (20080300859), incorporated by reference, in its entirety, including figures, into the present patent application.
There is a significant interest for those parties making content available on a network, such as an Internet, to provide opportunities for persons accessing a network resource to purchase goods or services as a follow-on action. It is a reasonable assumption that parties accessing a network resource with content relating to a particular topic will be amenable to purchasing goods or services directly or indirectly related to that topic. Therefore advertising is often displayed in association with a network resource generally made available to the public, the advertising displayed selected based upon the content of the network resource, the referral link of the accessing party, the past history of accessing network resources of the party (using, by way of non-limiting example, “cookies” as are known in the art) and combinations thereof, as currently known in the art. This has been further refined in the current art wherein individual words or phrases within the content of the network resource are identified to the party accessing the network resource as differentiated from the majority of the text, so as to entice the party accessing the network resource to “click” or otherwise elect to be transferred from the network resource to another.
It is commonplace that a user is directed to a “landing page” as it is known in the art, or a network resource that presents content which is a logical extension of the advertisement, differentiated word or phrase, or search engine search result. Such landing pages may be static, in that the information presented is the same for all users until modified by a human or automated means; or dynamic, in that the landing page is generated through automated means immediately preceding or contemporaneous with a user accessing the landing page. Dynamic web pages may utilize the referral link driving the user to the landing page, past history of network resource access of the user, geographic location, computer system information, or any other information obtainable on the user in order to generate the landing page. See for example U.S. patent applications #20100042635, #20080027812, #20040044566, #20080040389, #20080091526 and U.S. Pat. Nos. 7,281,042 and 7,523,087; which are herein incorporated by reference, in their entirety.
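By way of non-limiting illustration, the following minimal Python sketch shows one way a dynamic landing page might be generated from the referral link. The function names, the query parameter `q`, and the page wording are assumptions made for this illustration and are not taken from any of the references cited above.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit


def referral_terms(referrer: str, param: str = "q") -> list:
    """Extract search terms from a referral link (parameter name is assumed)."""
    values = parse_qs(urlsplit(referrer).query).get(param, [])
    return values[0].split() if values else []


def render_landing_page(referrer: str) -> str:
    """Generate the page at access time, tailored to the referral link."""
    terms = referral_terms(referrer)
    if terms:
        heading = "Results related to: " + " ".join(terms)
    else:
        heading = "Welcome"  # fallback when no referral terms are available
    # Escape user-influenced text before placing it in the page.
    return "<html><body><h1>{}</h1></body></html>".format(escape(heading))


if __name__ == "__main__":
    print(render_landing_page("http://search.example/find?q=hiking+boots"))
```

A static landing page, by contrast, would return the same content regardless of the referral link.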
With respect to the accessing of information through a network, for example an Internet, it is a problem in the present state of the art that people who publish content (text, images, audio, etc.) accessible within a network can easily have their content copied without their knowledge or authorization. The very functionality of the copy and paste functions within an OS makes this easy in the digital world. Industry observers sometimes refer to this as ‘atomization’ of content.
Tools exist to help content publishers find when their content has been copied and posted on other websites or blogs; however, no tools exist to help content owners learn who is using simple cut and paste functions to copy data from their website, within their PC, into products such as e-mail, Microsoft Word™, PowerPoint™ or other programs or system functions. It is currently impossible for publishers to monitor this cutting and pasting process because they have no ability to include attribution with the copied content. With monitoring and tracking, it is possible that publishers of content may be better able to monetize the copying and usage of their published content.