Data centers and cloud infrastructures are starting to deploy “Data Lake” architectures that are predominately based on distributed file systems. One example of a distributed file system is the Hadoop Distributed File System (“HDFS”). Distributed file systems are often highly scalable, can operate on low cost hardware, and support analytic algorithms. They lack, however, the content services found on more traditional systems.
There is a need, therefore, for a system, method, and process for providing content services on distributed file systems.