In an existing storage system, a hard disk drive (HDD) and a solid state disk (SSD) are generally used as storage media. Faults that may occur in such storage media include some repairable faults, such as a check error (UNC) and a sector identifier error (IDNF). A repairable fault can generally be repaired by overwriting the faulty area with new data.
In a distributed storage system that keeps backup copies of data, the backup data is distributed across different servers. If a repairable fault such as a UNC or IDNF occurs in a primary server, the primary server requests the secondary server that stores the backup data of the faulty area to send that backup data, and overwrites the faulty area with the received backup data to complete the fault repair. Similarly, if a repairable fault such as a UNC or IDNF occurs in a secondary server, the secondary server sends a request to the corresponding primary server and completes the fault repair using the backup data received from the primary server. However, when the same data is faulty in both the primary server and the secondary server, the faulty area cannot be repaired.
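The repair exchange described above can be illustrated with a minimal Python sketch. It is not taken from any real storage system: the `Server` class, its `peer` link, the `RepairableFault` exception, and the convention that `None` marks a faulty area are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


class RepairableFault(Exception):
    """Stands in for a repairable media error such as UNC or IDNF."""


@dataclass
class Server:
    name: str
    # Maps an area number to its data; None marks a faulty area (assumption).
    blocks: Dict[int, Optional[bytes]] = field(default_factory=dict)
    # The server holding the other copy of the data (hypothetical link).
    peer: Optional["Server"] = field(default=None, repr=False, compare=False)

    def read_local(self, area: int) -> bytes:
        """Read from the local disk, raising on a faulty or missing area."""
        data = self.blocks.get(area)
        if data is None:
            raise RepairableFault(f"{self.name}: fault in area {area}")
        return data

    def repair(self, area: int) -> bool:
        """Request the peer's copy and overwrite the faulty area with it."""
        if self.peer is None:
            return False
        try:
            backup = self.peer.read_local(area)
        except RepairableFault:
            # The same data is faulty on both servers: repair cannot complete.
            return False
        self.blocks[area] = backup
        return True

    def read(self, area: int) -> bytes:
        """Serve a read, attempting a repair if the local copy is faulty."""
        try:
            return self.read_local(area)
        except RepairableFault:
            if self.repair(area):
                return self.read_local(area)
            raise
```

In this sketch the primary and the secondary are symmetric: either one calls `repair` against its `peer`, and the call returns `False` only when both copies of the area are faulty.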
When the repair cannot be completed, the primary server or the secondary server may again receive a request to read data in the faulty area. Each such request causes the server to schedule an operating system (OS) input/output (IO) channel again to access the faulty area on the disk, start the fault repair process again, and then report that the fault repair has failed. These repeated IO scheduling and repair processes waste a large amount of system resources.
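The cost of this repetition can be made concrete with a small, self-contained Python simulation. It is a sketch under stated assumptions only: the `Node` class, its counters, and the fixed "fault on both copies" behavior are hypothetical and exist purely to count how much work each repeated read triggers.

```python
class Node:
    """Hypothetical server whose copy of an area is faulty on both replicas."""

    def __init__(self) -> None:
        self.io_accesses = 0      # times the OS IO channel accessed the disk
        self.repair_attempts = 0  # times a repair process was started

    def disk_read(self, area: int):
        """Access the faulty area on disk; the fault persists, so no data."""
        self.io_accesses += 1
        return None

    def fetch_backup(self, area: int):
        """Start a repair by asking the peer; its copy is faulty too."""
        self.repair_attempts += 1
        return None

    def handle_read(self, area: int) -> str:
        """Serve one read request without remembering earlier failures."""
        if self.disk_read(area) is not None:
            return "ok"
        if self.fetch_backup(area) is not None:
            return "repaired"
        return "repair failed"


node = Node()
results = [node.handle_read(3) for _ in range(5)]
```

Because nothing about the earlier failure is remembered in this sketch, every one of the five read requests pays for a disk access and a repair attempt before returning the same failure.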