Modern data centers often include storage systems, storage controllers, mass storage devices, and other devices for managing, storing, and providing access to data. These data centers often provide data services to geographically distributed users. The users often have widely varying storage and access requirements. Many users work at core sites or in facilities with significant computing and network resources. At the same time, other users at edge or remote locations may have limited access to computing resources and/or network connections. Remote and edge locations may have unreliable, slow, or intermittent network connections. In some cases, network access may only be available through relatively expensive wireless means and/or may need to be used sparingly for budgetary reasons. Network connectivity may also be intermittent for the increasing number of employees who work from home offices and mobile locations.
In some cases, dedicated storage equipment is implemented at edge locations in order to minimize the negative impacts of network outages and latencies. However, implementing dedicated storage devices at remote or edge locations may not be feasible due to equipment costs, support costs, lack of sufficient or reliable power, the number of locations, security issues, and/or availability of physical space. These issues often present even bigger challenges for mobile employees. Transporting and setting up additional dedicated storage equipment at each work location would be infeasible in many cases.
For example, a radiologist may work from home or another remote location. The radiologist may also provide services to several geographically distributed medical facilities. The radiologist and the medical facilities need shared and reliable access to medical images and other related data. However, this access must also be carefully controlled for reasons of privacy and regulatory compliance. In many cases, each access to a medical image or other data requires sending a request to the core storage location and receiving the data over a network connection. A slow or interrupted network connection can have significant impacts on the radiologist's productivity, the effectiveness of other related medical service providers, and/or the timeliness of care.
In remote sensing applications, computing devices are often installed at remote locations to gather data. Network connectivity at these locations may be minimal and the environment may not be suitable for installation of supplemental storage and processing equipment. Implementing dedicated storage hardware at these remote locations may not be feasible for cost, environmental, or other reasons.
In some cases, a dedicated storage device, such as a cloud gateway, is installed at the remote location in order to facilitate data access. However, these devices only provide access to a dedicated namespace of data at the core storage location and do so at the cost of additional hardware. A namespace is a logical grouping of identifiers for files or data stored in a data storage system. In many cases, a namespace may be shared across multiple systems or users. Datasets in dedicated namespaces are not easily available for access and/or modification by multiple users. Shared namespaces are typically stored in centralized locations in order to provide data access for multiple users. Some solutions cache currently or recently accessed files at the remote location, making them available regardless of network connectivity. However, currently or recently accessed files are typically only a small subset of an entire shared namespace of data. A user may need to access larger or alternate subsets of the data during periods when a network connection is unavailable or has insufficient bandwidth to provide effective real-time access. In addition, dedicated hardware devices like cloud gateways often impose other limitations, including additional power, space, mounting, thermal, air filtration, and/or security requirements. Furthermore, these dedicated hardware devices cannot be easily or quickly scaled to meet changing needs.
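The limitation of caching only recently accessed files can be illustrated with a minimal sketch. The class name RecentFileCache, the function offline_coverage, the file paths, and the capacity value are all hypothetical and chosen only for illustration; they do not describe any particular gateway product or the approach of any specific system. The sketch assumes a simple least-recently-used eviction policy.

```python
from collections import OrderedDict


class RecentFileCache:
    """Hypothetical edge-side cache holding only the most recently accessed files."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()  # path -> file contents, oldest first

    def record_access(self, path, data):
        # Remove and re-insert so the path moves to the most-recent end.
        self._entries.pop(path, None)
        self._entries[path] = data
        # Evict the least recently accessed entries beyond capacity.
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)

    def read_offline(self, path):
        # Returns None if the file was never cached or has been evicted.
        return self._entries.get(path)


def offline_coverage(namespace_paths, cache):
    """Fraction of the shared namespace readable without a network connection."""
    paths = list(namespace_paths)
    if not paths:
        return 0.0
    available = sum(1 for p in paths if cache.read_offline(p) is not None)
    return available / len(paths)


# Illustrative shared namespace of 100 files (paths are made up).
namespace = [f"/studies/scan_{i:03d}.dcm" for i in range(100)]

cache = RecentFileCache(capacity=5)
for path in namespace[:8]:          # the user opens the first eight files
    cache.record_access(path, b"image-bytes")

coverage = offline_coverage(namespace, cache)
print(f"Offline-readable share of namespace: {coverage:.0%}")
```

In this sketch, only the five most recently opened files remain readable once the connection drops, while the rest of the namespace, including files the user opened earlier, is unavailable.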
In addition to the connectivity issues described above, centralized data access may be challenging due to the evolving nature of computing and storage systems. While an organization may ideally prefer to have all of its data managed within a single framework and/or file system, the evolution of technology often means that the data is spread across multiple systems. It is desirable to provide users with simplified access to this data while still maintaining proper access control. All of these issues present challenges to providing users, particularly users at edge or remote locations, simplified and reliable access to shared data across multiple systems. These challenges are likely to continue due to the combination of increasingly distributed workforces, data-centric work content, a continuing move towards centralized data management, and constantly evolving data systems.