The management of telecommunications networks, such as cellular telecommunications networks, can require the processing of very large data sets. For example, the storage of traffic flow data records (FDRs) associated with customers of a telecommunications network may require multiple petabytes of storage. To store such a large data set economically, distributed storage and data processing techniques that are designed for very large data sets, and that run on clusters of commodity computing and storage devices, may be used. The underlying storage architecture of some existing distributed storage and data processing techniques may be based on a write-once, read-many (WORM) model, in which data, once written, can be read repeatedly but is not modified in place.
It may be desirable to efficiently process and update large data sets that are based on FDRs. For example, value-added services and summary records derived from the FDRs may be provided to the telecommunications provider or to another entity. Updating such large data sets on a per-record basis, however, may not be natively compatible with the WORM model.
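The mismatch between per-record updates and a WORM model can be illustrated with a minimal sketch. The sketch is not drawn from any particular storage system: the `WormStore` class, the `update_record` helper, the file names, and the record layout (an identifier paired with a byte count) are all hypothetical and are assumed here only for illustration. Under these assumptions, changing a single record requires copying every record of the file into a new file.

```python
class WormStore:
    """Toy write-once, read-many store (hypothetical, for illustration only)."""

    def __init__(self):
        self._files = {}

    def write(self, name, records):
        # A file may be written exactly once; later writes are rejected.
        if name in self._files:
            raise PermissionError(f"{name} already written; WORM files are immutable")
        self._files[name] = tuple(records)

    def read(self, name):
        # A file may be read any number of times.
        return self._files[name]


def update_record(store, src, dst, record_id, new_byte_count):
    """Emulate a per-record update by rewriting the whole file under a new name.

    Returns the number of records that had to be copied.
    """
    rewritten = [
        (rid, new_byte_count if rid == record_id else byte_count)
        for rid, byte_count in store.read(src)
    ]
    store.write(dst, rewritten)
    return len(rewritten)


# Assumed example data: 1000 records, each (record id, byte count).
store = WormStore()
store.write("fdr-v1", [(i, 100) for i in range(1000)])

# Changing one record copies all 1000 records into a new file.
copied = update_record(store, "fdr-v1", "fdr-v2", record_id=42, new_byte_count=250)
```

In this sketch the original file is left untouched and the cost of the update grows with the size of the file rather than with the size of the change, which suggests why per-record updates may be impractical for data sets of the scale described above.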