1. Field of the Invention
The present invention generally relates to a method and system for backing up data while extending the life of local hard drives and reducing power consumption, heat and noise. Specifically, the present invention provides a system and method for reversed backup operation for keeping local hard drives in a stand-by mode (that is, not spinning) thereby extending the life of local hard drives and reducing power consumption, heat and noise produced by the local drives.
2. Related Art
Hard disks are mechanical devices. (A hard disk or hard disk drive (HDD), commonly referred to as a hard drive, hard disk or fixed disk drive, is a non-volatile storage device which stores digitally encoded data on rapidly rotating platters with magnetic surfaces. For more information about HDDs, see http://www.techweb.com/encyclopedia/defineterm.jhtml?term=harddisk.) When their disks are spinning, HDDs consume power, generate noise, and are sensitive to shocks and impacts. Hard drives account for 5% to 31% of the power consumed by desktop and notebook computers. This not only increases overall IT power consumption and reduces the battery life of portable computers but also increases the amount of heat generated by the system. Excessive heat can cause discomfort to the user and increases the power consumed by the fans that remove heat from the computer and by the air conditioners that cool the ambient air in the work environment. Office computers consume about 1% of the electricity produced in the United States, with an additional 1.2% consumed by data centers. Given the expected growth in the number of computers, these fractions are expected to increase significantly in the future. The situation is especially critical in major cities, where electricity prices are the highest and increases in power supply may not be possible at all. In these cases, the only way for businesses to expand is to lower their existing power consumption.
Even low-intensity office noise (e.g., from the spinning of HDDs in workstations or laptops) reduces productivity, motivation, and the ability to learn, and can even pose health risks. Noise levels of average desktop hard drives rotating at 5,400-7,200 RPM are typically between 30 and 50 dBA. High-performance hard disks with higher rotational speeds generate more noise. Sounds related to disk head seeks distract users even more because of their irregular nature. (To read and write to the surface of the disks, the drive uses a small electro-magnet assembly, referred to as a head, located on the end of an actuator arm. There is one head for each platter surface on the spindle. The disks are spun at a very high speed to allow the head to move quickly over the surface of the disk. Towards the other end of the actuator arm is a pivot point, and at the end is a voice coil, which moves the head.) While one may argue that it is possible to manufacture computers with better sound insulation, there is a need for a system and method which reduces the level of office noise with existing hardware.
Furthermore, spinning hard disk drives are fragile and sensitive to shocks. (Workstations and, especially, laptops are susceptible to physical impact, for example as a result of being dropped.) The relative speed of the disk heads moving over the disk platters depends on the rotational speed of the disk and can exceed 150 miles per hour for 15 KRPM drives. In case of an abrupt impact or vibration, the disk heads can touch and damage the platter surfaces.
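By way of illustration only, the relative speed quoted above can be estimated from the rotational speed and the radius at which the head sits. The sketch below assumes a head at the outer edge of a platter with a 1.75 inch radius (a full 3.5 inch platter); this radius is an assumption made for the example and is not stated in the text, and the function name `head_speed_mph` is hypothetical. Under that assumption the result for 15,000 RPM is roughly 156 miles per hour.

```python
import math

METERS_PER_INCH = 0.0254
MPH_PER_METER_PER_SECOND = 3600 / 1609.344


def head_speed_mph(rpm, radius_in):
    """Linear speed of the platter surface under the head, in miles per hour.

    radius_in is the distance of the head from the spindle axis in inches.
    """
    circumference_m = 2 * math.pi * radius_in * METERS_PER_INCH
    revolutions_per_second = rpm / 60
    return circumference_m * revolutions_per_second * MPH_PER_METER_PER_SECOND


if __name__ == "__main__":
    # 1.75 in is an assumed outer radius, not a value taken from the text.
    for rpm in (5400, 7200, 15000):
        print(rpm, "RPM:", round(head_speed_mph(rpm, 1.75)), "mph")
```

The speed scales linearly with both the rotational speed and the radius, so drives with smaller platters see proportionally lower relative speeds at the same RPM.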
Gyroscopic effects exacerbate the problem for mobile devices. Even minor damage to a disk platter can start a chain reaction of collisions among the particles scratched from the platters, the disk heads, and the platters themselves. Therefore, even a single collision of the disk drive head with the platters can result in the rapid loss of the drive and of the data.
Non-rotating disks in the stand-by mode consume an order of magnitude less power and generate less heat than busy disks and at least three times less than idle rotating disks. For the purposes of this application, the term “stand-by” refers to non-spinning or non-rotating disks. Further, non-rotating, or non-spinning, disks are silent. They are also typically four to five times less sensitive to shocks and, thus, are more reliable. Therefore, it is desirable to minimize the time the disks are spinning.
Most solutions to the “rotating hard disk” problem that exist today target either portable devices, servers, or RAID controllers. “One media” solutions try to solve the problem with only one storage device. Unfortunately, it is impossible to predict the future (including disk access patterns), and delaying writes increases the risk of losing data. Data read mispredictions frequently increase power consumption compared to systems where the disk drives spin all the time. Also, when a read request misses the caches, the user must wait several seconds for the disk drive to spin up. Frequently spinning up the disk also significantly increases hard disk wear. For example, desktop hard disks can sustain only about 50,000 total spin-up procedures before they fail. Flash memory consumes little power and is fast. However, its capacity is small and it can sustain only a limited number of writes. Hybrid drives contain built-in nonvolatile flash memory. This flash memory increases the amount of memory available for caching and allows some amount of written data to be stored persistently without spinning up the disk drive. Therefore, hybrid drives allow more data to be prefetched and data writes to the disk to be delayed without sacrificing data reliability. Unfortunately, hybrid drives only partially solve the above problems. Read request mispredictions and large volumes of writes still require accessing the disk. Also, hybrid drives are hardware solutions that (1) require replacement of the existing hard drives, (2) are hard or impossible to upgrade if the flash memory wears out, and (3) operate on data blocks and thus have no access to file-system-level meta-information. (Meta-information is necessary for data prefetching optimizations.) Experimental driver-level solutions that work with a hard disk drive and a flash drive as two independent devices are free from the first and the second problems.
Disk-less servers, workstations, and thin clients use remote storage instead of local hard drives. Remote storage systems usually consist of many hard disks. Such systems can distribute the data across the disks according to its popularity, use multi-speed disks or disks with different characteristics, and can dedicate some disks to write and read caches. This frequently makes it possible to keep a significant percentage of the disks powered off. Unfortunately, disk-less clients require permanent and high-quality network connectivity. Therefore, this technique is not suitable for most mobile systems. Also, disk-less systems are less common and thus harder for users and administrators to configure and support; as a result, systems using hard drives remain the most prevalent.
Hard disks in most notebook and server systems are kept in the idle mode even when no read or write requests are being served. Systems that do put their hard disks into the stand-by (non-spinning) mode frequently add more problems than they solve.
Completely diskless clients are inconvenient for users and administrators. Diskless desktop solutions have high latency and become unusable in case of network problems. Also, such systems have different administration processes, which are not confined to the machines themselves. That is the reason why diskless desktops and servers have seen limited adoption. With the recent increase in the capacity of flash memory, it is expected that flash memory may replace the system disks. However, the capacities available today, and at least in the near future, are still much smaller than users need, and flash memory is still more expensive.
Solutions that combine multiple (possibly different) disks were shown to be more effective for server-type workloads. Unfortunately, servers and desktops have only one (system) disk. A combination of flash memory and hard disks partially solves the problem but can still result in a shortened disk lifetime and long access latencies in the case of flash memory read misses. Previous attempts to augment the disk and flash with network connectivity for storing the data were shown to improve performance and prolong battery life on mobile systems. However, they can shorten disk lifetimes and increase power consumption on the server and, as a result, overall at the enterprise scale.
Data reliability and availability are usually the most important requirements for storage systems. Traditional power optimization solutions frequently contradict these requirements and decrease user and administration convenience. For example, frequent spin-up and spin-down operations increase data reliability and availability but significantly decrease the lifetime, and thus the reliability, of the hard disk drives. As a result, these features are usually disabled or configured with time-outs of about an hour on most servers and desktops. Notebook hard disks can survive about an order of magnitude more spin-up operations but will still wear out within a year if only the break-even balance of power is considered. Similarly, diskless clients degrade performance and become nonoperational in case of network infrastructure problems.
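By way of illustration only, the wear figures above can be turned into a simple spin-up budget. The sketch below uses the approximately 50,000 spin-ups quoted earlier for desktop disks and ten times that figure for notebook disks. The five-year service life and the 60-second interval between spin-ups are assumptions chosen for the example, not values from the text, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def spinups_per_day(total_spinups, service_life_years):
    """Daily spin-up allowance that spreads the rated total over the service life."""
    return total_spinups / (service_life_years * DAYS_PER_YEAR)


def years_until_worn_out(total_spinups, seconds_between_spinups):
    """Years until the rated total is used up at a fixed spin-up interval."""
    per_day = SECONDS_PER_DAY / seconds_between_spinups
    return total_spinups / (per_day * DAYS_PER_YEAR)


if __name__ == "__main__":
    desktop_rating = 50_000                # figure quoted in the text
    notebook_rating = 10 * desktop_rating  # "an order of magnitude more"

    # Assumed five-year service life for a desktop disk.
    print(round(spinups_per_day(desktop_rating, 5), 1), "spin-ups per day")

    # Assumed aggressive policy: one spin-up every 60 seconds.
    print(round(years_until_worn_out(notebook_rating, 60), 2), "years")
```

Under these assumptions a desktop disk can afford roughly 27 spin-ups per day, while a notebook disk that spins up once a minute exhausts its rating in a little under one year.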
In addition to power consumption, hard disks pose a set of other problems such as noise, fragility, and ease of being stolen or lost. However, servers, desktops, and mobile systems have different disks and different deployment scenarios, which make some of the above problems important in one setting and completely unimportant in another. For example, a disk in a notebook consumes almost no power in the idle state, so optimizing its power consumption not only makes no sense at the enterprise scale but usually has a negligible effect on battery life. A desktop in the enterprise is almost always reliably connected to the fast local network, whereas a notebook can get disconnected at any time.
There is a need for a client file system which provides the following functions:
1. provides run-time data protection (CDP or at least replication) of each hard disk in the enterprise, even when a desktop loses connectivity due to temporary network problems or when a mobile client is away from the network infrastructure, without significantly increasing the cost of required backup storage;
2. spins the local hard disks up for short periods of time and only several times a day;
3. provides data access latency and bandwidth similar to the operation with the local disks at least under typical user workloads; and
4. requires minimal hardware and software modifications in existing infrastructure.
Hard disks fail, inevitably and unexpectedly. People make mistakes and overwrite or delete useful data. Hard disks or whole computers get lost or stolen. Data backup systems try to minimize the consequences of these harmful events. Traditional backup systems create snapshots of a subset of files on a periodic basis. This poses two problems:
1. some important data may be left unprotected due to mistakes in selecting the subset of files (which are usually noticed only when it is already too late); and
2. the most recent (and thus frequently most important) data updates are not protected.
The first problem could be solved by backing up whole hard disks. However, it is usually time consuming and considered prohibitively expensive because of the expensive storage systems used for backups. Also, increasing the amount of backup storage increases the enterprise power consumption.
The second problem is partially solved by run-time data replication. In addition, reverting to an earlier version of a file is frequently desirable. For example, if a user deletes a portion of a document by mistake, he may need to revert to an earlier version of the file to recover the deleted portion. Continuous Data Protection (CDP) preserves a backup copy for every data update on-the-fly. This allows users to roll back any file to any previous state in time. Unfortunately, mobile users are still left unprotected when not connected to a reliable network link.
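By way of illustration only, the roll-back behavior attributed to CDP above can be sketched as a store that keeps every update of a file together with its timestamp and returns the newest copy written at or before a requested time. This is a minimal in-memory sketch and does not describe any particular CDP product; the class name `VersionedFile` and its methods are hypothetical, and updates are assumed to arrive in time order.

```python
import bisect


class VersionedFile:
    """Keeps every update of one file so that any past state can be read back."""

    def __init__(self):
        self._times = []     # update timestamps, kept in ascending order
        self._contents = []  # file content stored by each update

    def write(self, timestamp, content):
        """Record a new version instead of overwriting the previous one."""
        if self._times and timestamp < self._times[-1]:
            raise ValueError("updates are assumed to arrive in time order")
        self._times.append(timestamp)
        self._contents.append(content)

    def read_as_of(self, timestamp):
        """Return the content as it was at the given time, or None if the
        file did not exist yet."""
        index = bisect.bisect_right(self._times, timestamp)
        if index == 0:
            return None
        return self._contents[index - 1]


if __name__ == "__main__":
    doc = VersionedFile()
    doc.write(10, "introduction and body")
    doc.write(20, "introduction")          # user deletes a portion by mistake
    print(doc.read_as_of(15))              # state before the mistaken deletion
```

Because no version is ever overwritten, storage grows with every update; a periodic snapshot system, by contrast, would hold only the states captured at its snapshot times.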
Therefore, there exists a need for a solution that solves at least one of the deficiencies of the related art.