Enterprises, organizations, government departments, and individuals usually have storage systems for storing various data, such as work documents, emails, or text or multimedia data produced by various other applications. Such a storage system may include a main storage system and/or a backup system. The storage system may include not only one or more storage devices for storing data, but also one or more data processing devices for performing functions such as data replication, de-duplication, and recovery.
In many use cases, it is also desirable to perform a data analytic job on a big dataset in order to derive desired information from the data. Such a data analytic job is performed by big data analysis systems, such as Hadoop systems, which are developed as independent systems. Data stored in the storage system may be used as the analysis objects of the big data analysis systems. A big data analysis system needs a dedicated storage space for storing the data to be analyzed and the intermediate results generated during the data analytic job. Therefore, the target data to be analyzed must be exported from the storage system and then imported into the dedicated storage space of the big data analysis system. Such data import and export consume considerable time and transmission bandwidth across the systems.