The invention relates generally to computer software, and more particularly, to the processing and storing of rich text data as legacy data records in a data storage system.
Large business operations often rely on legacy back-end computer systems to store data and provide common functions to different front-end systems. Furthermore, these operations may use applications that access data in the legacy back-end systems to provide continuous computing services to users while the organizations are not ready to migrate to modern data storage systems. As a result, rich text data, such as that commonly found in Web-based applications, may continue to be stored in legacy databases and processed by legacy data-handling applications.
Legacy back-end systems generally use simple data formats such as sequential records of 80 plain characters each. This format dates from the days when data was entered into computers using punched cards, each of which was wide enough for 80 punched characters. A common feature of legacy data storage systems is that a quantity of text generally has to be spread across multiple fixed-width records. Modern data, however, is much richer and may contain multilingual text as well as various fonts, styles, and colors for emphasis and expression. These data characteristics do not translate directly to plain text.
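By way of illustration only, the fixed-width constraint described above may be sketched as follows. The sketch is not taken from any particular legacy system: the function name `to_fixed_width_records`, the use of space padding for the final record, and the treatment of "plain characters" as ASCII are all assumptions made for the example.

```python
from typing import List

RECORD_WIDTH = 80  # width of one record, per the 80-column format described above


def to_fixed_width_records(text: str, width: int = RECORD_WIDTH) -> List[str]:
    """Split plain text into fixed-width records (hypothetical helper)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Assumption: "plain characters" is modeled here as ASCII only.
    if not text.isascii():
        raise ValueError("text contains characters that are not plain ASCII")
    # Cut the text into width-sized slices; the last slice is padded with
    # spaces (an assumed convention) so that every record has the same length.
    return [text[i:i + width].ljust(width) for i in range(0, len(text), width)]


if __name__ == "__main__":
    records = to_fixed_width_records("x" * 200)
    print(len(records))                # number of records needed for 200 characters
    print({len(r) for r in records})   # every record has the same width
```

Under these assumptions, a 200-character text occupies three records, and text containing a character outside the plain set (for example an accented letter) is rejected outright, which reflects why rich or multilingual text does not map directly onto such records.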