User-Generated Content (UGC) is a newer mode of Internet use in which uploading has become as important as downloading, whereas downloading was originally given priority. For example, community networks, video sharing, and blogs are the main application forms of UGC. With the continuous development of the global Internet business, UGC services are on the rise and have attracted widespread attention in the industry.
Because the data is generated by users, a massive user base produces a massive amount of data and, at the same time, a massive volume of read and write operations. How to store this data and how to provide highly concurrent read/write services are problems that must be addressed in the field.
FIG. 1 shows a system architecture diagram of an existing distributed data storage system, which includes a storage identifier (ID) assignment system (also called an ID number release system) 120 and a data storage system 130.
The storage identifier assignment system 120 is responsible for assigning a storage identifier to data to be stored when a storage requester requests storage of the data. The storage identifier assignment system 120 ensures that the storage identifiers are globally unique and gives them a certain degree of randomness within one or more storage identifier segments (that is, within certain number ranges), which ensures, to some extent, load balancing of the data storage system 130.
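As an illustration only, the two properties described above (global uniqueness and randomness across identifier segments) can be sketched as follows. The class name `StorageIdAssigner`, the fixed segment width, and the choice of picking a random segment and then the next free offset in it are all assumptions made for this sketch; the existing system is not described at this level of detail.

```python
import random


class StorageIdAssigner:
    """Hypothetical stand-in for the storage identifier assignment system 120."""

    def __init__(self, num_segments, segment_size, seed=None):
        self.num_segments = num_segments
        self.segment_size = segment_size  # assumed: every segment has the same width
        self._next_offset = [0] * num_segments  # next unused offset in each segment
        self._rng = random.Random(seed)

    def assign(self):
        # Only segments that still have unused identifiers are candidates.
        open_segments = [
            s for s in range(self.num_segments)
            if self._next_offset[s] < self.segment_size
        ]
        if not open_segments:
            raise RuntimeError("identifier space exhausted")
        # Randomness: the segment is chosen at random, which spreads new data
        # over the service processes that own the segments.
        segment = self._rng.choice(open_segments)
        # Uniqueness: each offset inside a segment is handed out exactly once.
        offset = self._next_offset[segment]
        self._next_offset[segment] += 1
        return segment * self.segment_size + offset
```

Because every write must first pass through an object like this one, it is a natural single point of failure, which is one of the drawbacks discussed for the existing system.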
The data storage system 130 is responsible for storing data and providing read/write services, and includes an interface machine process module 131, a plurality of service process modules 132, and a plurality of storage modules 133. The interface machine process module 131 is configured to receive read and write requests containing a storage identifier, distribute them to the corresponding service process module 132, and hide the configuration details of the service process modules 132. Each service process module 132 is responsible for data storage in one or more storage identifier segments; it receives storage requests containing a storage identifier from the interface machine process module 131, provides read/write services for the data, and returns a success response to the storage requester 110 after the data is stored successfully. Each storage module 133 is configured to store, read, and write data according to instructions from the service process module 132.
FIG. 2 is a schematic flow chart of an existing distributed data storage method.
Referring to FIG. 2, when new data is added, the above distributed data storage method includes the following steps:
Step 210: the storage identifier assignment system assigns a unique storage identifier to the data to be stored;
Step 220: the storage requester submits a storage request containing the data, together with the assigned storage identifier, to the interface machine process module;
Step 230: the interface machine process module forwards the storage request to the corresponding service process module according to the storage identifier segment to which the storage identifier belongs;
Step 240: the service process module instructs the storage module to store the data according to the storage identifier, and returns a success response to the storage requester.
Moreover, when data is read, the above distributed data storage method can also include the following steps: the storage requester submits a read request containing a storage identifier to the interface machine process module; the interface machine process module distributes the read request to the corresponding service process module according to the storage identifier segment to which the storage identifier belongs; and the service process module instructs the storage module to return the data identified by the storage identifier to the storage requester.
Further, when data is written, the above distributed data storage method can also include the following steps: the storage requester submits a write request containing a storage identifier and the content to be modified to the interface machine process module; the interface machine process module distributes the write request to the corresponding service process module according to the storage identifier segment to which the storage identifier belongs; and the service process module instructs the storage module to write the modified content.
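The add, read, and write flows described above all rely on the same routing rule: the interface machine process module looks up the segment that contains the storage identifier and forwards the request to the service process module that owns it. A minimal sketch of that rule follows. The class names, the in-memory dictionary used in place of a storage module, the `"OK"` response, and the representation of segments as a sorted list of start values are assumptions for illustration, not details taken from the existing system.

```python
from bisect import bisect_right


class ServiceProcess:
    """Stands in for one service process module 132 and its storage module 133."""

    def __init__(self, name):
        self.name = name
        self.storage = {}  # in-memory dict used in place of a real storage module

    def store(self, storage_id, data):
        self.storage[storage_id] = data
        return "OK"  # success response to the storage requester (step 240)

    def read(self, storage_id):
        return self.storage[storage_id]

    def write(self, storage_id, content):
        if storage_id not in self.storage:
            raise KeyError(storage_id)  # only existing data can be modified
        self.storage[storage_id] = content
        return "OK"


class InterfaceMachine:
    """Stands in for the interface machine process module 131."""

    def __init__(self, segment_starts, services):
        # services[i] owns identifiers from segment_starts[i] up to, but not
        # including, segment_starts[i + 1]; the last segment is open-ended.
        if len(segment_starts) != len(services):
            raise ValueError("one service is required per segment")
        if list(segment_starts) != sorted(segment_starts):
            raise ValueError("segment_starts must be sorted")
        self.segment_starts = list(segment_starts)
        self.services = list(services)

    def route(self, storage_id):
        # Find the last segment whose start is <= storage_id (step 230).
        index = bisect_right(self.segment_starts, storage_id) - 1
        if index < 0:
            raise KeyError(f"no segment covers identifier {storage_id}")
        return self.services[index]

    def store(self, storage_id, data):
        return self.route(storage_id).store(storage_id, data)

    def read(self, storage_id):
        return self.route(storage_id).read(storage_id)

    def write(self, storage_id, content):
        return self.route(storage_id).write(storage_id, content)
```

In this sketch the requester only ever talks to `InterfaceMachine`, so the mapping of segments to service processes stays hidden from it. The storage identifier passed to `store` is assumed to have been obtained beforehand from the storage identifier assignment system (step 210).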
The above distributed data storage system has the following disadvantages:
1. The coupling is high. The data storage system depends on the storage identifier assignment system. First, the storage identifier assignment system needs to ensure both the uniqueness and the randomness of the storage identifiers; once the randomness is broken, a process run by a service process module may be overwhelmed by a sudden increase in write requests. Second, when a single point of failure occurs in the storage identifier assignment system, no storage request in the whole distributed data storage system can be completed.
2. The design is complex. The two systems are equally important, so both need a variety of disaster recovery designs in order to guarantee normal external service.
3. The coupling and the design complexity directly increase the operation and maintenance costs.
4. There is a single point of failure for new requests. When a process run by a service process module hangs, new requests for the storage identifier segment for which that service process module is responsible will fail.
5. The bandwidth cost is increased. Every time new data is added, a storage identifier must be obtained before the actual storage can be performed, so there is one more interaction than with direct storage, and the bandwidth cost is doubled.
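Disadvantage 4 can be made concrete with a small sketch: because each storage identifier segment is owned by exactly one service process, a request whose identifier falls in the segment of a failed process cannot be served, while requests for other segments still succeed. The names `ServiceDown`, `SEGMENT_SIZE`, and `store`, and the representation of a service process as a dictionary with an `alive` flag, are hypothetical and chosen only for this illustration.

```python
SEGMENT_SIZE = 1000  # assumed width of each storage identifier segment


class ServiceDown(Exception):
    """Raised when the service process that owns a segment is unavailable."""


def store(services, storage_id, data):
    # The segment is fixed by the identifier, so there is no alternative
    # service process to fall back to when the owner is down.
    segment = storage_id // SEGMENT_SIZE
    service = services[segment]
    if not service["alive"]:
        raise ServiceDown(f"segment {segment} is unavailable")
    service["data"][storage_id] = data
    return "OK"


# Two service processes; the second one has hung.
services = [
    {"alive": True, "data": {}},
    {"alive": False, "data": {}},
]
```

With this setup, `store(services, 10, b"a")` succeeds, whereas `store(services, 1500, b"b")` raises `ServiceDown`, since identifier 1500 can only be handled by the failed process.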
Thus, a simple, efficient, low-cost storage service model is needed to solve the above technical problems and to provide stable, highly concurrent mass data storage and read/write services for users. Such a storage service model would bring significant change to the technical field.