Field of Invention
The present invention relates to the field of information processing, and more specifically, to a method, apparatus and system for processing webpage data.
Description of Related Art
With the development of network technologies and the growth of network resources, more and more users choose to acquire information through networks. However, different links may point to webpages having duplicate or near-duplicate contents. As a result, a user may repeatedly browse substantially the same information, which wastes time and degrades the user experience.
For example, a webpage A and a webpage B may each have several news-related links thereon. Suppose that a link in webpage A and a link in webpage B point to duplicate or near-duplicate contents but have different titles. If a user has browsed the link in webpage A, the user may find, only after clicking on the link in webpage B, that he/she has already browsed the content. Because it is difficult for the user to determine beforehand, from the links alone, whether the two links point to similar content, such links to duplicate contents may waste the user's time and degrade the user experience. Furthermore, displaying duplicate contents multiple times on devices with limited power, such as mobile devices, is not only inefficient but also a waste of system resources.