Online information is typically displayed to a user either in response to a keyword search conducted by the user or as recommendations of possible interest to the user. Conventionally, what is displayed from one or more categories of information, and how well it aligns with the user's interest, is determined based on statistical user click action information. For example, user click action information with respect to information obtained through keyword searches and/or information displayed for other reasons may be recorded, and statistical analysis may be performed on such information.
Pieces of information may refer to content such as audio files, language, text, graphic images, animation, and/or other types of media, while an information category may be thought of as a common attribute possessed by multiple pieces of information. For example, a common attribute possessed by the text information “tops,” “dresses,” “shorts,” and “pants” can be “clothing,” and so the information category of such information may be “clothing.” In another example, a common attribute possessed by images that depict mountains, images that depict oceans, and other such image information can be “scenery,” and so the information category of such information may be “scenery.”
Generally, in the search field (e.g., of a search engine), entered keywords may be viewed as related to information categories, and the pieces of information obtained based on keyword searches may be information that is actually included in the information categories that match the keyword(s) of the search. For example, “Hilton Hotel,” “Shangri-la Hotel,” and “Grand Hotel Beijing,” which are at least some pieces of information organized under the information category of “hotels,” may be obtained based on a search for the keyword “hotel” or “hotels.”
Determining the correlation between pieces of information and the information categories under which they are organized plays an important role in determining keyword search hit rates, determining information ranking/display accuracy rates, and adjusting which pieces of information are to be displayed for information categories (e.g., a piece of information that has a low correlation with the information category may be deleted from that category).
As mentioned above, the typical statistical analysis used to determine the correlation between pieces of information and information categories is generally based on user click action information. However, such statistical analysis overlooks the distinction between pieces of information that were displayed and selected (e.g., by one or more users) and pieces of information that were displayed but not selected. Ignoring this distinction could lower the accuracy of the determined correlations.
As a result of the low accuracy of the determined correlations, keyword search hit rates, information ranking/display accuracy rates, and decisions about which pieces of information to adjust are also less accurate, which leads to a waste of resources.
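By way of illustration only, the distinction between information that was displayed and selected and information that was displayed but not selected can be sketched as follows. The display log, the function names `click_counts` and `selection_rates`, and the use of a simple selected-to-displayed ratio are all assumptions made for this example; they are not taken from any particular system and are not presented as the technique of any specific embodiment.

```python
from collections import Counter

# Hypothetical display log: one (category, item, was_selected) record per
# time an item was displayed. All values are illustrative assumptions.
display_log = (
    [("hotels", "Hilton Hotel", True)] * 3       # displayed and selected
    + [("hotels", "Hilton Hotel", False)] * 7    # displayed but not selected
    + [("hotels", "Shangri-la Hotel", True)] * 2 # displayed and selected
)


def click_counts(log):
    """Click-only statistic: counts selections and ignores unselected displays."""
    counts = Counter()
    for category, item, was_selected in log:
        if was_selected:
            counts[(category, item)] += 1
    return counts


def selection_rates(log):
    """Ratio of selections to displays for each (category, item) pair."""
    shown = Counter()
    selected = Counter()
    for category, item, was_selected in log:
        shown[(category, item)] += 1
        if was_selected:
            selected[(category, item)] += 1
    return {key: selected[key] / shown[key] for key in shown}


if __name__ == "__main__":
    print(click_counts(display_log))
    print(selection_rates(display_log))
```

In this assumed log, the click-only count ranks “Hilton Hotel” above “Shangri-la Hotel” (three selections versus two), whereas the ratio that also accounts for unselected displays ranks them in the opposite order, since “Hilton Hotel” was selected in only three of its ten displays.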