The present invention relates to advertisements on pages of a World Wide Web site. Specifically, it is directed to a method for dynamically assigning advertisements to web pages according to self-learned user information.
The internet is a network of networks and gateways that use the TCP/IP suite of protocols. A client is a computer accessed by a user or viewer which issues commands to another computer called a server. The server performs a task associated with the client's command. The World Wide Web (WWW or Web) is the internet's application which displays information on the internet in a user-friendly graphical user interface format called a Web page. A Web server typically supports one or more clients. The Web allows users (at a client computer) who seek information on the internet to switch from server to server and database to database by viewing objects (images or text) and clicking (with a pointing device or keystroke) on corresponding highlighted words or phrases of interest (hyperlinks).
The Web can be considered as the internet with all of its resources addressed or identified by Uniform Resource Locators (URLs); it displays the information corresponding to URLs and provides a point-and-click interface to other URLs. A URL can be thought of as a Web document version of an e-mail address. The host portion of a URL identifies a server, which in turn corresponds to an Internet Protocol (IP) address.
An internet browser or Web browser is a graphical interface tool that runs internet protocols and displays results on the user's screen. The browser can act as an internet tour guide, complete with pictorial desktops, directories and search tools used when a user "surfs the net."
With the recent explosion of web-related sites and services, the internet has become a great opportunity for web-related advertising. For site administrators who maintain a large number of pages, it has become almost necessary to use some process of optimizing placement of the advertisements and scheduling of the web pages.
Typically, advertisements occur as objects in the form of (text or inline) links on Web pages. Advertisers measure the effectiveness of contracting with a particular Web server (web site) by analyzing the number of times a viewer clicks ("hits") on an advertisement. The cost of such a contract may be either directly or indirectly dependent upon this click rate. Consequently, from the point of view of the site administrator, it is desirable to maximize the total number of hits to the advertisements on the server.
Each page on a server may have a certain number of predefined slots of standard sizes, each containing an object (inline image or text) that links to the actual advertisement page. The object may be an advertisement.
An advertisement is exposed when a page which contains the slot with the advertisement is served to a client accessing the page. Since a page may typically contain more than one slot, more than one advertisement may be exposed at a single time. This exposure of an advertisement is also called an impression.
An advertisement is clicked when a client decides to choose (with a pointing device or keystroke) the link corresponding to an exposed advertisement. Thus, the number of clicks for an advertisement is always a certain fraction of the number of exposures. Since the advertisement agencies measure the effectiveness of an advertisement by the number of clicks that an ad receives, and since the sum of the total number of exposures received over all slots by a site is defined by the traffic to that site, it is advantageous for a server to assign a particular advertisement to a slot in a manner such that the advertisement's click/exposure ratio is maximized.
Presently, placement of advertisements on web pages is executed by using different variations of static assignment. Static assignment produces web pages with advertisements which generally do not change unless and until the site administrator adjusts them according to some historic information. This static method of placing advertisements does not take into consideration various real-time characteristics of the web which can improve placement optimization.
For instance, certain web pages in a web site are more likely than others to be accessed by users. Without knowing which pages are the present "hot spots," static assignment is likely to result in underexposure of certain advertisements and overexposure of others. Furthermore, some advertisements are much more likely than others to be accessed at a certain time of the day. Static assignment does not take this time-dependency into consideration. Finally, static assignment provides no way of taking into consideration the individual tendencies of a particular user to choose a specific advertisement based on appearance, size, shape or location on a page.
It is a primary object of the present invention to provide a method for dynamically assigning advertisements to appropriate slots on appropriate web pages based on a characteristic of the requesting client or user, depending upon self-learning data obtained from historical user behavior.
A further object of the present invention is to provide a method of scheduling advertisements which exploits popular web pages or "hot spots."
A further object of the present invention is to provide a method of placing advertisements which avoids the problem of over-exposure and under-exposure of advertisements.
A further object of the present invention is to provide a method of placing advertisements which takes into consideration the time of day.
The present invention overcomes the prior art limitations by providing a method for placing advertisements on web pages which is capable of making placement decisions "on the fly" depending on characteristics of the client or user from whom the request is received.
According to the present invention, the method for dynamically placing objects in slots on a web page in response to a current client request for the web page includes the steps of classifying users into two or more user groups based on at least one user characteristic, accumulating self-learning data based on user click behavior for each user group, matching the current request with a corresponding user group and scheduling real-time selection of the slots for the objects on the web page based on at least the self-learning data of the corresponding user group. The objects are preferably advertisements.
Preferably, a group click/exposure ratio is accumulated for each user group and the scheduling of real-time selection of the slots for the objects on the web page is based on the group click/exposure ratio for the corresponding group.
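The accumulation of a group click/exposure ratio can be illustrated with a minimal sketch. The class name `GroupStats`, its method names, and the rule of picking the highest-ratio advertisement are assumptions made for illustration only; the method itself does not prescribe any particular data structure.

```python
from collections import defaultdict


class GroupStats:
    """Accumulates exposures and clicks per (user group, advertisement) pair.

    Hypothetical helper: all names here are illustrative assumptions.
    """

    def __init__(self):
        self.exposures = defaultdict(int)
        self.clicks = defaultdict(int)

    def record_exposure(self, group, ad):
        # Called whenever a page carrying `ad` is served to a user of `group`.
        self.exposures[(group, ad)] += 1

    def record_click(self, group, ad):
        # Called whenever a user of `group` clicks on `ad`.
        self.clicks[(group, ad)] += 1

    def ratio(self, group, ad):
        """Click/exposure ratio; 0.0 if the ad was never exposed to the group."""
        shown = self.exposures.get((group, ad), 0)
        if shown == 0:
            return 0.0
        return self.clicks.get((group, ad), 0) / shown

    def best_ad(self, group, ads):
        """Return the candidate ad with the highest ratio for this group."""
        return max(ads, key=lambda ad: self.ratio(group, ad))
```

A real scheduler would combine this ratio with contract and exposure requirements rather than always serving the single best advertisement.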
The method preferably includes the step of generating probabilistic assignment data for each user group based on a contract requirement of the objects wherein the scheduling is further based on the assignment data.
Preferably, the method also includes the step of generating probabilistic assignment data for each user group based on an exposure requirement of the object for a period of time.
It is preferable that the method further includes the step of generating probabilistic assignment data for each user group based on a popularity characteristic of the web page, such as a number of clicks on the web page for a period of time.
Because processing resources are limited in a web server, the classifying step of the present invention preferably includes the steps of probabilistically choosing a fraction of previous requests and classifying users of the fraction of previous requests into two or more user groups. In web sites which have a registration procedure (for example the New York Times), explicit demographic information is available, and this may be used in order to perform advertisement placement. Thus, according to a preferred embodiment of the invention, classification of users is executed based on user demographic information.
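One way to probabilistically choose a fraction of previous requests is independent sampling, sketched below. The function name `sample_requests` and its parameters are hypothetical, and keeping each request independently with a fixed probability is only one possible sampling scheme.

```python
import random


def sample_requests(requests, fraction, rng=None):
    """Keep each request independently with probability `fraction`.

    Illustrative assumption: only the sampled requests are passed on to the
    (more expensive) user classification step.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    return [request for request in requests if rng.random() < fraction]
```

With `fraction=0.1`, roughly one request in ten is classified, which bounds the processing cost on the server at the price of slower learning.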
According to a second and third embodiment of the present invention, classification of users is executed based on user click behavior, such as user click/exposure ratios and user path traversal patterns. For example, an advertisement about job listings is likely to be very relevant to a person accessing the web page from the .edu domain. Similarly, for a car dealer in the Massachusetts area, it is more relevant that the advertisement be accessed by clients within that area. Such similar behavior on the web is utilized by these methods of user classification.
Classifying is preferably executed by using an efficient multi-dimensional clustering algorithm to classify users into user groups and is also preferably based on the particular time of the day. By taking into consideration the time of day of the request, the placement's effectiveness is maximized. Indeed, the same page should contain different advertisements depending upon the time of the day.
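The text does not name a specific clustering algorithm. As a stand-in, the sketch below uses plain k-means (Lloyd's algorithm) on hypothetical two-dimensional feature vectors consisting of a user's click/exposure ratio and the time of day scaled to the range 0 to 1. The function names, the feature choice, and the deterministic initialization are all assumptions.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cluster_users(points, k, iterations=10):
    """Group feature vectors into k clusters with plain k-means."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Deterministic start (a simplification): k evenly spaced points.
    centers = [tuple(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each user joins the nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical users: (click/exposure ratio, hour of day / 24).
users = [(0.01, 0.30), (0.02, 0.35), (0.015, 0.40),
         (0.20, 0.85), (0.22, 0.90), (0.25, 0.80)]
labels, centers = cluster_users(users, k=2)
```

In practice the features would need to be scaled so that no single dimension dominates the distance.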
Preferably the method of the present invention minimizes repetitive exposure of objects to users by further basing the scheduling on exposures of objects on previous web pages requested by a same user so that repetitive exposure of the same object is controlled.
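Control of repetitive exposure can be sketched as a filter over the candidate objects. The function name `filter_repeats`, the per-user exposure counts, the `max_repeats` cap, and the fallback to the full candidate list are all illustrative assumptions; the text only requires that repetition be controlled.

```python
def filter_repeats(candidates, seen_counts, max_repeats):
    """Drop ads already exposed to this user `max_repeats` times or more.

    `seen_counts` maps ad -> number of exposures on pages previously
    requested by the same user. If every candidate is exhausted, fall back
    to the full list so that the slot is never left empty (an assumption).
    """
    fresh = [ad for ad in candidates if seen_counts.get(ad, 0) < max_repeats]
    return fresh or list(candidates)
```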
The size, appearance and/or position of said objects are preferably varied during scheduling and the variation is preferably traversal path dependent.
Since the method of the present invention is a self-learning method, classification of the users preferably includes the steps of collecting user characteristic data, such as click/exposure ratio data, based on previous object assignments, analyzing the user characteristic data so that new user characteristic data is discovered and classifying the users into two or more user groups based on the new user characteristic data. Preferably, a fraction of the previous assignments are randomly made to provide unbiased learning.
It is also preferable for the classification of users to be further based on a sensitivity of the corresponding user group to variations in size and location of the slots on the web page. Preferably, the method of the present invention further includes the step of collecting statistics representing the impact of different slot sizes and locations on click/exposure ratios for the user groups.
To improve user classification efficiency, the method preferably includes the step of classifying objects and web pages into classes. This may be done by assigning keywords to classes of objects and web pages and classifying objects and web pages into classes, or by selecting a set of popular web pages with high click rates, classifying objects with similar click/exposure ratios on the set of popular web pages into object groups, classifying web pages experiencing similar click/exposure ratios for objects in each of the object groups into page groups, adding web pages to the set, and repeating the classifying steps.
It is preferable in the method of the present invention to randomly schedule a fraction of the objects to web pages so that self-learning is improved. It is also preferable to schedule on a web page different numbers of slots with differing sizes and locations on the web page to improve self-learning.
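Randomly scheduling a fraction of the objects can be sketched as follows. The function name `choose_ad`, the `explore_fraction` parameter, and the use of a highest-ratio rule for the non-random remainder are assumptions made for illustration.

```python
import random


def choose_ad(ads, ratio_of, explore_fraction, rng):
    """Pick an ad for a slot.

    With probability `explore_fraction` the ad is chosen uniformly at random,
    so that click statistics are also gathered for ads the current data does
    not favor. Otherwise the ad with the highest known ratio is chosen.
    `ratio_of` is a hypothetical callable mapping an ad to its ratio.
    """
    if rng.random() < explore_fraction:
        return rng.choice(ads)
    return max(ads, key=ratio_of)


# Example with made-up ratios.
ratios = {"jobs": 0.05, "cars": 0.02, "books": 0.01}
pick = choose_ad(list(ratios), ratios.get, 0.1, random.Random(0))
```

Without the random fraction, an advertisement that performed poorly early on might never be exposed again, and its ratio would never be corrected.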
It is preferable to generate the assignments of the objects by: assigning each of the objects to an object node in group O; assigning each of the web pages to a page node in group P; providing an arc between a page node and at least one of the object nodes as a function of a classification of the objects and the web pages; assigning an object node flow requirement to each object node based on periodic contract requirements of a corresponding object; assigning a page node flow requirement to each page node based on an expected popularity of a corresponding web page; introducing a flow supply node having an arc to each object node, said flow supply node providing a supply flow; assigning a flow weight to each arc between the page nodes and the object nodes based on a function of group click/exposure ratios resulting from placing the corresponding objects on the corresponding web pages; and assigning a flow to each arc between the object nodes and the page nodes with a probabilistic assignment method so that the supply flow flows through the arcs to the page nodes, in-flow equals out-flow for each object node, and in-flow is less than the node flow requirement for each page node, wherein a total return of the assignment is maximized.
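The constraints of this flow network can be made concrete with a small sketch that checks a candidate flow and computes its total return. It does not solve the optimization itself. The function names, the dictionary layout, the reading of the page constraint as "at most", and the conversion of flows into per-page probabilities are assumptions, not details given in the text.

```python
def is_feasible(flow, ad_requirement, page_capacity, tol=1e-9):
    """Check a candidate flow given as {(ad, page): amount}.

    Assumptions: the supply node delivers exactly `ad_requirement[ad]` to
    each object node, so the flow leaving that node must equal it; the flow
    into each page node must not exceed `page_capacity[page]`. Every ad and
    page used in `flow` must appear in the two dictionaries.
    """
    out_of_ad = {ad: 0.0 for ad in ad_requirement}
    into_page = {page: 0.0 for page in page_capacity}
    for (ad, page), amount in flow.items():
        if amount < 0:
            return False
        out_of_ad[ad] += amount
        into_page[page] += amount
    ads_ok = all(abs(out_of_ad[a] - ad_requirement[a]) <= tol for a in ad_requirement)
    pages_ok = all(into_page[p] <= page_capacity[p] + tol for p in page_capacity)
    return ads_ok and pages_ok


def total_return(flow, weight):
    """Sum of flow times arc weight (for example, a click/exposure ratio)."""
    return sum(amount * weight[arc] for arc, amount in flow.items())


def page_probabilities(flow):
    """Turn flows into the probability of showing each ad on each page."""
    totals = {}
    for (ad, page), amount in flow.items():
        totals[page] = totals.get(page, 0.0) + amount
    return {
        (ad, page): amount / totals[page]
        for (ad, page), amount in flow.items()
        if totals[page] > 0
    }
```

A solver for the underlying problem (a weighted transportation or minimum-cost flow problem) would search over feasible flows for the one with the largest total return.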
Finally, it is preferable to include the steps of maintaining cumulated statistics on exposures for the corresponding user group within a required range of tolerance as a result of the probabilistic assignment data and choosing objects for the slots based on a function of the cumulated statistics and the range of tolerance. Preferably, the objects are chosen for the slots based on a largest outstanding requirement as dictated by the difference between the cumulated statistics and a targeted rate for the corresponding user group.
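Choosing the object with the largest outstanding requirement can be sketched as a comparison between targeted rates and cumulated exposure shares for one user group. The function name `pick_most_behind` and the representation of targets as shares summing to one are assumptions; the range of tolerance mentioned above is omitted for brevity.

```python
def pick_most_behind(targets, shown):
    """Return the ad whose exposure share lags its target the most.

    `targets` maps ad -> targeted share of exposures for one user group
    (hypothetical values derived from the probabilistic assignment data).
    `shown` maps ad -> cumulated exposures so far for that group.
    """
    total = sum(shown.get(ad, 0) for ad in targets)

    def deficit(ad):
        actual = shown.get(ad, 0) / total if total else 0.0
        return targets[ad] - actual

    return max(targets, key=deficit)


# Simulate 100 slot decisions and watch the shares approach the targets.
targets = {"a": 0.5, "b": 0.3, "c": 0.2}
shown = {}
for _ in range(100):
    ad = pick_most_behind(targets, shown)
    shown[ad] = shown.get(ad, 0) + 1
```

Because the choice is driven by the gap between target and cumulated share, an advertisement that has fallen behind is served first until it catches up.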