The process of cloning computer images into a central location creates a full copy of an image of a computing device and uploads that copy to a central server to provide various services, such as full image backup, disaster recovery, and physical-to-virtual (P2V) migration projects. Centralization is therefore likely the first step in an operating system (OS), hardware, P2V, and/or virtual desktop infrastructure (VDI) migration project, since a full clone of each computing device is made before the migration starts. As a full system backup solution, centralization also benefits end users, who can access their centralized data and applications from any device and restore a full system image.
However, full cloning of computer images into a central location is a challenging task in many current Information Technology (IT) operations. One reason is the complexity of managing a large number of different desktop images that may exist on a set of computing devices. Another reason is that many large enterprises are dispersed over multiple geographic locations. The use of Local Area Networks (LANs) that are connected over one or more Wide Area Networks (WANs) with variable bandwidths and latencies is a serious barrier to providing efficient cloning of computer images without sacrificing the end user experience. Further, centralization is often applied to a large number of computing devices, which poses significant performance and scalability challenges. In particular, traditional systems require a significant amount of time and of disk, network, and central processing unit (CPU) resources to complete a “day zero” centralization, with much of that time wasted on scanning and uploading files from each computing device. This bottleneck makes it hard for IT to complete migration projects on time and results in inferior service to users during the centralization process.
Traditional disk cloning creates a full clone of an image of an offline or online computing device and stores it on locally attached storage or a LAN network share. However, these solutions are not designed for efficient mass centralization, especially when the computing devices are partially connected over WAN links (e.g., across multiple branches). In addition, full image cloning software usually uses the file-system interfaces to enumerate all files in arbitrary order, read their entire content, and store the files on local or remote storage devices. Further, a common optimization technique in image cloning is to skip files whose content is identical to files already in the central store, based on a checksum of their data (e.g., MD5). This technique requires reading the entire content of each file before deciding whether to clone it, so it reduces network traffic at the price of additional scanning of those files.
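The trade-off of the checksum-based optimization described above can be illustrated with a minimal sketch. It is not the implementation of any particular cloning product. The names md5_of_file, clone_with_checksum_skip, and demo are hypothetical, and the central store is modeled, as an assumption, by an in-memory dictionary keyed by MD5 digest, with the upload itself only simulated by counting bytes.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return (hex MD5 digest, bytes read); the whole file must be read."""
    digest = hashlib.md5()
    total = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
            total += len(chunk)
    return digest.hexdigest(), total


def clone_with_checksum_skip(root, central_store):
    """Walk root and 'upload' only files whose digest is not yet stored.

    central_store is a hypothetical dict (digest -> size) standing in for
    the index of the central server; no real network transfer happens here.
    """
    stats = {"bytes_scanned": 0, "bytes_uploaded": 0, "files_skipped": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Every file is read in full, even if it ends up being skipped.
            digest, size = md5_of_file(path)
            stats["bytes_scanned"] += size
            if digest in central_store:
                stats["files_skipped"] += 1
                continue
            central_store[digest] = size  # simulated upload
            stats["bytes_uploaded"] += size
    return stats


def demo():
    """Clone three small files, two of which have identical content."""
    with tempfile.TemporaryDirectory() as root:
        samples = [
            ("a.bin", b"x" * 1000),
            ("b.bin", b"x" * 1000),  # duplicate content of a.bin
            ("c.bin", b"y" * 500),
        ]
        for name, data in samples:
            with open(os.path.join(root, name), "wb") as f:
                f.write(data)
        return clone_with_checksum_skip(root, {})


if __name__ == "__main__":
    print(demo())
```

In this sketch the duplicate file is never uploaded, yet its bytes still count toward bytes_scanned, which is the scanning cost the passage above attributes to this technique.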