The present invention relates to data processing systems, and more specifically to large-scale, long-running data transfer to data storage systems.
Large Internet companies such as Yahoo!, Inc. continuously generate, process, and transfer enormous amounts of data. This includes user data and web page data, ranging from web searches to social relationships to geo-location data, as well as system data such as performance metrics. Deriving useful information from this large volume of raw data supports a variety of service objectives, including presenting relevant contextual information, identifying trends in user behavior, and offering better-targeted services.
Mechanisms for handling large amounts of data more efficiently would be beneficial.