Hadoop is a distributed system infrastructure which may take full advantage of a cluster for high-speed computation and storage. Hadoop implements the Hadoop Distributed File System (HDFS). The architecture of the HDFS may include a NameNode and multiple DataNodes, and a file stored in the HDFS may be divided into multiple data blocks which are stored on different DataNodes.
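The division of a file into data blocks spread over DataNodes may be illustrated with a minimal sketch. It is a toy model and not Hadoop code: the function names, the 4-byte block size and the round-robin placement are assumptions chosen only for illustration, and replication is ignored.

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a file's bytes into consecutive blocks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks: List[bytes], datanodes: List[str]) -> Dict[int, str]:
    """Map each block index to a DataNode name, round-robin (an assumed policy)."""
    if not datanodes:
        raise ValueError("at least one DataNode is required")
    return {i: datanodes[i % len(datanodes)] for i in range(len(blocks))}


# A 10-byte "file" with a 4-byte block size yields three blocks.
blocks = split_into_blocks(b"0123456789", block_size=4)
placement = place_blocks(blocks, ["dn1", "dn2", "dn3"])
```

In this sketch the mapping held in `placement` plays the role of the block mapping that the NameNode keeps, while the block contents themselves would reside on the DataNodes.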
In order to prevent the loss of a file caused by an accidental deletion performed by a user, Hadoop provides a Trash in the NameNode. When a user deletes a file, the NameNode changes the directory entry of the file to point to the directory of the Trash; the data blocks of the file are not deleted at this point. When the user finds that the deletion was a mistake and intends to recover the file, the NameNode may move the name of the file from the directory of the Trash back to the original directory to recover the file. Thereafter, when a client needs to view the file, the client may read the data blocks corresponding to the file from the relevant DataNodes based on the metadata of the file and the relevant block mapping fed back by the NameNode.
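The Trash mechanism described above amounts to renaming an entry in the namespace rather than touching any data block. A minimal sketch follows, assuming an in-memory dictionary as the namespace; the class `ToyNameNode`, its methods and the `/.Trash` path are hypothetical and do not correspond to the Hadoop API.

```python
class ToyNameNode:
    """In-memory stand-in for the NameNode namespace (not the Hadoop API)."""

    TRASH_DIR = "/.Trash"  # hypothetical Trash location

    def __init__(self):
        self.namespace = {}  # path -> list of block ids

    def create(self, path, block_ids):
        self.namespace[path] = list(block_ids)

    def delete(self, path):
        """Move the entry under the Trash directory; the blocks stay untouched."""
        trash_path = self.TRASH_DIR + path
        self.namespace[trash_path] = self.namespace.pop(path)
        return trash_path

    def restore(self, path):
        """Move the entry from the Trash back to its original path."""
        self.namespace[path] = self.namespace.pop(self.TRASH_DIR + path)


nn = ToyNameNode()
nn.create("/user/alice/report.txt", ["blk_1", "blk_2"])
nn.delete("/user/alice/report.txt")
deleted_view = dict(nn.namespace)  # snapshot taken while the file is in the Trash
nn.restore("/user/alice/report.txt")
```

Because only the path changes, the list of block identifiers is the same before deletion, while in the Trash, and after recovery, which is why the file remains readable once restored.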
In some cases, however, an accidentally deleted file may not be recovered through the Trash in the NameNode. For example, when a user empties the Trash, the NameNode may determine which data blocks stored in the DataNodes belong to the files in the Trash and send the DataNodes a deleting instruction for these data blocks, and the DataNodes may delete the data blocks based on the deleting instruction. As another example, in a case that the metadata in the NameNode is accidentally deleted or damaged before the cluster is started, the memory of the NameNode may not include the metadata when the NameNode starts up. After a DataNode is started, the DataNode may report its data block information to the NameNode. Because the NameNode holds no record of the reported data blocks, the NameNode may send the DataNode a deleting instruction for these data blocks, and the DataNode may then delete them. Because the DataNode deletes the data blocks directly upon receiving the deleting instruction, the deleted data blocks may not be recovered even if the user notices the accidental deletion within a short time. As a result, the data security of the Hadoop system is reduced.
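The second failure scenario may be made concrete with a small simulation. It is a sketch under stated assumptions, not Hadoop code: `blocks_to_delete`, `ToyDataNode` and the block identifiers are hypothetical, and the NameNode metadata is modeled as a plain set of known block ids.

```python
def blocks_to_delete(known_blocks, reported_blocks):
    """Return the reported blocks the NameNode has no record of, sorted."""
    return sorted(set(reported_blocks) - set(known_blocks))


class ToyDataNode:
    """In-memory stand-in for a DataNode (not the Hadoop API)."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # block id -> block contents

    def block_report(self):
        return list(self.blocks)

    def handle_delete(self, block_ids):
        """Delete immediately, with no staging area, as in the scenario above."""
        for block_id in block_ids:
            self.blocks.pop(block_id, None)


dn = ToyDataNode({"blk_1": b"abc", "blk_2": b"def"})
namenode_metadata = set()  # metadata lost or damaged before startup
doomed = blocks_to_delete(namenode_metadata, dn.block_report())
dn.handle_delete(doomed)
```

With empty metadata every reported block is treated as unknown, so the DataNode ends up with no blocks at all, and nothing in this model retains a copy from which they could be restored.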