One of the fundamental aspects of most computer-based systems is that they store and process data. This data can be stored locally or retrieved across a network.
A content delivery network or content distribution network (CDN) is a system of computers containing copies of data, placed at various points in a network so as to maximize bandwidth for access to the data from clients throughout the network. To perform this function, a content distribution network has an associated storage system, which is responsible for storing the content objects that the CDN delivers or distributes. This system usually has two components: a cache component, which is a distributed system of servers close to the end users that the system serves; and one or more repositories holding all the content objects for which the CDN is responsible.
The cache component consists of a set of servers, located in cache nodes, whose individual storage space is limited. The cache servers store a subset of the content objects in the content distribution system and serve them to the end users. There is usually a cache decision logic that decides which content objects are stored on a particular cache server, based on the goals of the CDN such as distribution cost, bandwidth, latency, and end-user satisfaction.
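As an illustration only, the following sketch shows one very simple form that a cache decision logic could take: a greedy, popularity-first selection under a storage limit. The text does not specify any particular algorithm, so the selection rule, the `ContentObject` fields, and the function name `select_objects_for_cache` are all assumptions made for this example. A real CDN would also weigh cost, latency, and other goals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentObject:
    # Hypothetical record for one content object; field names are assumptions.
    object_id: str
    size_bytes: int
    requests_per_hour: float  # observed or predicted popularity at this cache node


def select_objects_for_cache(catalog, capacity_bytes):
    """Pick objects for one cache server, most popular first.

    Greedy rule (an assumption, not a standard): walk the catalog in
    descending popularity and keep every object that still fits in the
    remaining storage space.
    """
    chosen = []
    free_bytes = capacity_bytes
    for obj in sorted(catalog, key=lambda o: o.requests_per_hour, reverse=True):
        if obj.size_bytes <= free_bytes:
            chosen.append(obj.object_id)
            free_bytes -= obj.size_bytes
    return chosen


if __name__ == "__main__":
    catalog = [
        ContentObject("a", 60, 100.0),
        ContentObject("b", 50, 80.0),
        ContentObject("c", 40, 10.0),
        ContentObject("d", 30, 5.0),
    ]
    # With 100 bytes of space, "a" fits, "b" does not, "c" fills the rest.
    print(select_objects_for_cache(catalog, capacity_bytes=100))
```

Objects that are not selected remain available only from the repository, which is what makes the limited size of each cache server significant for the rest of the design.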
The repository component acts as the main storage system for the CDN. All the content objects for which the CDN is responsible are stored here, and copies are moved to the individual cache servers based on the cache decision logic.
An important job of the main repository is to act as the server for long tail content objects. These are objects that rank low in popularity and are therefore not stored in the system of edge caches, but for which there are still client requests. Although they are not stored in the caches, the CDN can still honor requests for these objects by redirecting the clients to the main repository. The repository thus acts as a long tail server for the long tail content in the CDN.
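The request handling described above can be sketched roughly as follows. This is a minimal illustration, assuming the edge cache and the repository can each be modeled as a dictionary from object identifier to content; the function name `route_request` and the returned labels are hypothetical and not part of any real CDN interface.

```python
def route_request(object_id, edge_cache, repository):
    """Return (source, content) for a client request.

    Popular objects are served from the edge cache. Long tail objects,
    which are absent from the cache, are served from the main repository.
    """
    if object_id in edge_cache:
        return ("edge-cache", edge_cache[object_id])
    if object_id in repository:
        # Cache miss: redirect the request to the main repository.
        return ("repository", repository[object_id])
    raise KeyError(f"unknown content object: {object_id}")


if __name__ == "__main__":
    repository = {"popular.mp4": b"P", "rare.mp4": b"R"}  # holds every object
    edge_cache = {"popular.mp4": b"P"}                    # holds a subset only
    print(route_request("popular.mp4", edge_cache, repository)[0])
    print(route_request("rare.mp4", edge_cache, repository)[0])
```

In this model every cache miss reaches the repository, so the repository's availability and capacity bound what the CDN can serve from the long tail.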
Architecturally, the easiest way to create this main repository is as a centralized storage system located at the highest level (or core) of the CDN architecture. This is also how most repositories are deployed in practice.
One problem with this centralized design is that it creates a single point of failure. Since all the long tail content is in one place, an outage of the central server, or a break in the link connecting the repository to the rest of the system, affects the whole CDN. The impact is profound: clients lose access to all long tail content, and the CDN can no longer replicate content effectively.
A further problem is bandwidth and connectivity. Even though the number of requests for each individual long tail content object is low, the total number of requests for objects in the long tail is quite high. This translates into high connectivity and bandwidth requirements at the central repository. Content replication also draws on the central repository, which adds to this bandwidth demand. All these requests have to pass through the core network, adding to the bottleneck at the core. Another problem is the storage requirement at the central site: long tail content comprises a large number of objects that must be kept for a long time, which implies high storage requirements at the main repository.
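To make the aggregate-demand point concrete, the sketch below estimates what fraction of all requests falls on objects outside the edge caches. It assumes that object popularity follows a Zipf distribution, which is a common modeling assumption but is not stated in the text; the catalog size, the number of cached objects, and the function name `long_tail_request_share` are likewise illustrative choices.

```python
def long_tail_request_share(catalog_size, cached_objects, zipf_exponent=1.0):
    """Fraction of requests that target objects NOT held in the edge caches.

    Assumes (for illustration) that the object at popularity rank r receives
    requests in proportion to 1 / r**zipf_exponent, and that the caches hold
    exactly the `cached_objects` most popular objects.
    """
    weights = [1.0 / rank ** zipf_exponent for rank in range(1, catalog_size + 1)]
    total = sum(weights)
    tail = sum(weights[cached_objects:])  # ranks beyond the cached head
    return tail / total


if __name__ == "__main__":
    share = long_tail_request_share(catalog_size=100_000, cached_objects=1_000)
    print(f"long tail share of requests: {share:.1%}")
```

Under these assumptions, with the top 1,000 of 100,000 objects cached, roughly 38% of all requests still go to the long tail, even though each tail object individually receives less than a thousandth of the request rate of the most popular object. All of that traffic converges on the central repository.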