A large-scale website system usually stores and retrieves data through a distributed cache structure. For example, the Taobao website (i.e., taobao.com) stores and retrieves images uploaded by users through a distributed cache structure. Normally, a distributed cache system includes a source data server, multiple cache servers that communicate with the source data server, and a dispatcher. In response to a user's request, the dispatcher generally applies a consistent hashing algorithm to the received request to determine the cache server from which the user may obtain the data. If the cache server so determined has the data, it returns the data to the dispatcher. If it does not have the data, it requests the data from the source data server, stores the data locally, and returns the data to the dispatcher. The dispatcher then returns the data to the user.
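The lookup flow described above can be illustrated with a minimal sketch. It is not the implementation of any particular website; the class names (SourceServer, CacheServer, Dispatcher), the use of SHA-256 as the hash function, and the number of virtual nodes per server are all assumptions chosen for illustration.

```python
import bisect
import hashlib


class SourceServer:
    """Holds the authoritative copy of every object (hypothetical model)."""

    def __init__(self, data):
        self.data = dict(data)

    def get(self, key):
        return self.data[key]


class CacheServer:
    """Serves from its local store and fills it from the source on a miss."""

    def __init__(self, name, source):
        self.name = name
        self.source = source
        self.store = {}

    def get(self, key):
        if key not in self.store:
            # Cache miss: fetch from the source data server and keep a copy.
            self.store[key] = self.source.get(key)
        return self.store[key]


class Dispatcher:
    """Maps each request key to one cache server on a consistent-hash ring."""

    def __init__(self, servers, replicas=100):
        self.servers = {s.name: s for s in servers}
        # Each server is placed on the ring at `replicas` virtual points.
        self.ring = sorted(
            (self._hash(f"{name}#{i}"), name)
            for name in self.servers
            for i in range(replicas)
        )
        self.points = [point for point, _ in self.ring]

    @staticmethod
    def _hash(text):
        return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)

    def locate(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect_right(self.points, self._hash(key)) % len(self.ring)
        return self.servers[self.ring[idx][1]]

    def fetch(self, key):
        return self.locate(key).get(key)


source = SourceServer({"img/1.jpg": b"pixels"})
caches = [CacheServer(f"cache-{i}", source) for i in range(4)]
dispatcher = Dispatcher(caches)

image = dispatcher.fetch("img/1.jpg")               # miss, then filled from source
holders = [c.name for c in caches if "img/1.jpg" in c.store]
```

After the single fetch, holders contains exactly one server name, because the same key always hashes to the same cache server and no other cache server is asked for it.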
In an existing distributed cache system such as the image system of the Taobao website, when a seller uploads a malicious image (e.g., an infringing image or an illegal image), the image is first uploaded to a source image server. When a network user accesses the malicious image through a link on the Taobao website, the user first accesses the image cache server that is determined by the consistent hashing algorithm. If that image cache server does not have the malicious image, it obtains the malicious image from the source image server and stores it. When the malicious image is detected, the system needs to remove it. Existing technology performs the removal operation for that malicious image in the source image server and in all of the image cache servers of the distributed cache system.
While studying the existing technology, the inventors of this application noted that its removal operation greatly increases the burden on the servers and wastes server resources, because every server must perform the removal operation even though not all of the cache servers hold the malicious data. This is especially true for a system that includes a large number of cache servers. Since the removal operation is needlessly performed in cache servers that have no malicious data, the overall performance of the distributed cache system is reduced.
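The cost of the broadcast removal described above can be made concrete with a small sketch. The Server class, the broadcast_remove function, the image names, and the choice of eight cache servers are hypothetical and serve only to count how many servers receive a removal request versus how many actually hold a copy.

```python
class Server:
    """A source or cache server modelled as a set of stored image keys."""

    def __init__(self, name, keys=()):
        self.name = name
        self.keys = set(keys)
        self.removal_ops = 0  # removal requests this server has handled

    def remove(self, key):
        """Handle a removal request; return True if a copy was deleted."""
        self.removal_ops += 1
        if key in self.keys:
            self.keys.remove(key)
            return True
        return False


def broadcast_remove(source, caches, key):
    """Send a removal request to the source server and every cache server."""
    servers = [source, *caches]
    deleted_on = [s.name for s in servers if s.remove(key)]
    return len(servers), deleted_on


source = Server("source", {"bad.jpg", "ok.jpg"})
caches = [Server(f"cache-{i}") for i in range(8)]
caches[3].keys.add("bad.jpg")  # assumed: only this cache server holds a copy

requests_sent, deleted_on = broadcast_remove(source, caches, "bad.jpg")
print(requests_sent, deleted_on)  # 9 ['source', 'cache-3']
```

In this run nine servers process a removal request, but only two of them held the image; the other seven requests do no useful work, and that number grows with the number of cache servers.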