US 12,169,477 B2
Unreliable edge
Lailong Luo, Changsha (CN); Geyao Cheng, Changsha (CN); Deke Guo, Changsha (CN); Junxu Xia, Changsha (CN); and Bowen Sun, Changsha (CN)
Assigned to NATIONAL UNIVERSITY OF DEFENSE TECHNOLOGY, Changsha (CN)
Filed by NATIONAL UNIVERSITY OF DEFENSE TECHNOLOGY, Changsha (CN)
Filed on Apr. 24, 2023, as Appl. No. 18/138,144.
Claims priority of application No. 202211255400.8 (CN), filed on Oct. 13, 2022.
Prior Publication US 2024/0126722 A1, Apr. 18, 2024
Int. Cl. G06F 16/174 (2019.01); G06F 16/16 (2019.01); G06F 16/17 (2019.01); G06F 16/172 (2019.01); G06F 16/182 (2019.01)
CPC G06F 16/1748 (2019.01) [G06F 16/172 (2019.01)]
18 Claims
1. A method for deduplication caching using an unreliable edge resource, comprising the following steps:
acquiring a total storage capacity of all edge servers;
searching for candidate cache files by a similarity-based hierarchical clustering (SHC) method, and acquiring file clusters of all the candidate cache files after clustering, wherein the candidate cache files each comprise a deduplicated data chunk; and
based on the file clusters and a reliability of all of the edge servers, selecting, by a heuristic algorithm, a file cluster from the file clusters to cache to an edge server until a size of cached content reaches the total storage capacity,
wherein the searching for the candidate cache files by the SHC method, and the acquiring of the file clusters of all the candidate cache files after clustering comprises:
determining, by a hierarchical clustering method based on a Jaccard index, whether a sorting index of two files after clustering is greater than sorting indexes of the two files before clustering, in each iteration of an iterative clustering process;
if yes, merging the two files into a new cluster;
determining a heat rate of the new cluster, and recalculating a file availability based on a chunk location in the new cluster; and
acquiring the file clusters after all iterative clustering is completed.
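
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not define the formulas it uses, so the following are assumptions made only for illustration: every chunk has unit size, the "heat rate" of a cluster is the sum of its files' request counts, the "sorting index" is heat divided by deduplicated size, and the heuristic places clusters in descending index order on the most reliable server that still has room. The file-availability recalculation is omitted because the claim gives no formula for it. The names `Cluster`, `shc`, and `greedy_fill` are hypothetical.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Cluster:
    files: frozenset   # names of the files grouped in this cluster
    chunks: frozenset  # unique chunk ids, i.e. the deduplicated content (non-empty)
    heat: float        # ASSUMPTION: summed request popularity of the files

    @property
    def size(self):
        return len(self.chunks)  # ASSUMPTION: every chunk has unit size

    @property
    def index(self):
        return self.heat / self.size  # ASSUMPTION: "sorting index" = heat per unit of storage


def jaccard(a, b):
    """Jaccard index of the chunk sets of two clusters."""
    return len(a.chunks & b.chunks) / len(a.chunks | b.chunks)


def find_merge(clusters):
    """Return (a, b, merged) for the most similar pair whose merge raises the index."""
    pairs = sorted(combinations(clusters, 2), key=lambda p: jaccard(*p), reverse=True)
    for a, b in pairs:
        if jaccard(a, b) == 0:
            return None  # remaining pairs share no chunks, so merging saves nothing
        merged = Cluster(a.files | b.files, a.chunks | b.chunks, a.heat + b.heat)
        if merged.index > max(a.index, b.index):
            return a, b, merged
    return None


def shc(files):
    """Iteratively merge clusters until no merge improves the sorting index."""
    clusters = list(files)
    while True:
        hit = find_merge(clusters)
        if hit is None:
            return clusters
        a, b, merged = hit
        clusters = [c for c in clusters if c is not a and c is not b] + [merged]


def greedy_fill(clusters, servers):
    """servers maps a server name to (capacity, reliability); returns files -> server."""
    free = {name: cap for name, (cap, _) in servers.items()}
    by_reliability = sorted(servers, key=lambda s: servers[s][1], reverse=True)
    placement = {}
    for c in sorted(clusters, key=lambda c: c.index, reverse=True):
        for s in by_reliability:
            if free[s] >= c.size:  # first (most reliable) server with enough room
                free[s] -= c.size
                placement[c.files] = s
                break
    return placement


if __name__ == "__main__":
    files = [
        Cluster(frozenset({"a"}), frozenset({1, 2, 3}), 3.0),
        Cluster(frozenset({"b"}), frozenset({2, 3, 4}), 3.0),
        Cluster(frozenset({"c"}), frozenset({7, 8}), 1.0),
    ]
    clusters = shc(files)
    print(greedy_fill(clusters, {"s1": (4, 0.9), "s2": (2, 0.6)}))
```

In this example, files `a` and `b` share two of their four distinct chunks, so merging them stores four chunks instead of six and raises the assumed index from 1.0 to 1.5; file `c` shares no chunks with either and stays on its own. A different definition of the sorting index or of the heuristic would change which merges and placements occur.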