1. Field of the Invention
The present invention generally relates to a form generation system including an information processing apparatus configured to generate print data and a printing apparatus configured to receive the print data, which are in communication with each other via a network. The present invention also generally relates to a print management system, a printing apparatus, an information processing apparatus, and a method for controlling the systems and apparatuses.
2. Description of the Related Art
In recent years, the market has increasingly demanded high data security. To provide data security, certain conventional systems store and manage print data and print log information in association with each other, so that the path of a leak can be traced when confidential information is printed by an unauthorized user for an improper purpose or by an authorized user by mistake. Here, examples of the “print log information” may include a user name, a client personal computer (PC) name, a print document name, and a date and time of printing.
In such a conventional system, if an information leak is detected, the system searches its storage area for print data similar to the leaked text data or image data, and then refers to the print log information associated with the print data that the search finds to be highly similar to the leaked information. Such a system may be referred to as a “document management system”.
The above-described system may include a document management client unit and a document management server unit. The document management client unit operates on a printer while the document management server unit operates on a general-purpose PC or a server apparatus (a server-dedicated computer). In such a system, the document management client unit and the document management server unit may be in communication with each other via a network.
The document management client unit acquires print data for a document to be printed that has been sent from a client PC to the printer and sends the acquired print data to the document management server unit together with the print log information.
Then, the document management server unit divides the received print data into pages and divides each page into text areas and image areas to generate search data for each area. After that, the document management server unit mutually associates and integrates the print data for one page, text area information, image area information, text area search data, and image area search data to generate storage data for each page.
Furthermore, the document management server unit mutually associates and integrates the original print data and the page-by-page storage data to generate storage data for each item of print data. Then, the document management server unit stores the generated storage data in a storage unit.
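The storage data described above can be pictured with a minimal sketch. The class names, the form-feed page separator, and the placeholder area extraction (one text area per page, with lowercased text as its search data) are illustrative assumptions and are not taken from any actual document management system.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Area:
    kind: str                          # "text" or "image"
    bbox: Tuple[int, int, int, int]    # hypothetical (x, y, width, height)
    search_data: str                   # hypothetical search key for this area


@dataclass
class PageStorage:
    page_number: int
    page_data: bytes                   # print data for one page
    areas: List[Area] = field(default_factory=list)


@dataclass
class JobStorage:
    original_print_data: bytes
    log: dict                          # print log information
    pages: List[PageStorage] = field(default_factory=list)


def build_job_storage(print_data: bytes, log: dict,
                      page_sep: bytes = b"\f") -> JobStorage:
    """Split print data into pages and integrate page-by-page storage data."""
    job = JobStorage(original_print_data=print_data, log=log)
    for number, page in enumerate(print_data.split(page_sep), start=1):
        # Placeholder for real layout analysis: treat the whole page as a
        # single text area and use its lowercased text as search data.
        text = page.decode("utf-8", errors="replace")
        area = Area(kind="text", bbox=(0, 0, 0, 0), search_data=text.lower())
        job.pages.append(PageStorage(number, page, [area]))
    return job
```

In this sketch, each stored job keeps both the original print data and a full copy of every page, which is one reason the stored volume grows quickly.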
Japanese Patent Application Laid-Open No. 2006-081119 discusses a method for the case of reprinting a print document that has already been printed and whose print data already exists in a document management system. Instead of storing the same print data again, the method records, in the print log information, link information that is linked with the stored print data.
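The idea of recording a link rather than a second copy can be sketched as follows. How the cited publication detects that print data is already stored is not stated here; the sketch assumes, purely for illustration, that a SHA-256 digest of the whole print data serves as both the detection key and the link. The class and field names are hypothetical.

```python
import hashlib


class PrintStore:
    """Toy store that keeps one copy of each distinct item of print data."""

    def __init__(self):
        self.data_by_key = {}   # storage key -> print data
        self.logs = []          # print log entries, one per print request

    def record(self, print_data: bytes, log: dict) -> dict:
        # Assumption: identical print data is recognized by its digest.
        key = hashlib.sha256(print_data).hexdigest()
        entry = dict(log)
        if key in self.data_by_key:
            # Reprint: do not store the data again.
            entry["stored_new_data"] = False
        else:
            self.data_by_key[key] = print_data
            entry["stored_new_data"] = True
        # Link information pointing at the stored print data.
        entry["link"] = key
        self.logs.append(entry)
        return entry
```

Every print request still produces its own log entry, so traceability is preserved while the print data itself is stored once.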
Moreover, a company that operates nationwide or worldwide usually establishes several tens to several thousands of business sites throughout Japan or the world. A PC is provided for each employee, while one printer is shared by a few to several tens of employees working at each such business site.
Under such circumstances, the market has desired a method for preventing information leakage by introducing the document management system described above. The document management system may be installed at the headquarters or the home office and typically includes one document management server that stores and manages print data sent from the PC of each individual employee to the printer installed at the business site. In this case, an estimated amount of print data to be stored on the document management server is several hundreds of gigabytes per day (= several thousands of employees × several pages per day, at several hundreds of kilobytes per page).
An ordinary storage device may not be able to entirely and routinely store such a large amount of data. Accordingly, it may be necessary to introduce a storage device that can utilize storage modules provided in the document management system in a decentralized manner by using a network function. However, high costs may be required to manufacture such a storage device. Thus, an environment that can reduce the amount of data to be stored has been desired by the market.
Meanwhile, various types of application software, such as general spreadsheet application software, have been widely used. When printing a plurality of copies of a print document, such software may send the same print data once for each of the designated number of copies. Application software like this may use the data storage area of a document management server inefficiently, or even consume it excessively, by causing the document management server to store the same data repeatedly.
While a need remains for improved document management, the installation of a server at each business site is generally avoided or even prohibited to reduce initial costs and operation costs that may arise at each such site (i.e., a division-serverless system).
Moreover, in the case of using the document management server unit, the load of receiving data and storing the received data may be very high. Accordingly, there remains a need in the market for a system or a method for reducing the amount of processing to a minimum. Japanese Patent Application Laid-Open No. 2006-081119 discusses one example of an improved method.
Furthermore, a method under development collates a hash value for the entire print document and a hash value for each page to identify an updated page of the print document as a difference, and stores only the updated page as print data together with print log information.
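The hash collation can be illustrated with a minimal sketch. The choice of SHA-256, the derivation of the document hash from the page hashes, and the function names are assumptions made for illustration; the method itself is not specified at that level of detail here.

```python
import hashlib
from typing import List


def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def document_digest(pages: List[bytes]) -> str:
    # Assumption: the document hash is derived from the per-page hashes,
    # which keeps page boundaries unambiguous.
    return digest("".join(digest(p) for p in pages).encode("ascii"))


def updated_pages(stored_pages: List[bytes],
                  new_pages: List[bytes]) -> List[int]:
    """Return 1-based numbers of pages in new_pages that must be stored."""
    # Step 1: collate the whole-document hash; identical means nothing to store.
    if document_digest(stored_pages) == document_digest(new_pages):
        return []
    # Step 2: collate page by page; new or changed pages are the difference.
    # Pages removed from the end of the document are not reported here.
    changed = []
    for i, page in enumerate(new_pages):
        if i >= len(stored_pages) or digest(page) != digest(stored_pages[i]):
            changed.append(i + 1)
    return changed
```

For example, if only the second of three pages is edited, only page 2 is returned and stored along with the print log information.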
However, the above-described conventional method or the method under development may not be capable of sufficiently reducing operation costs (e.g., may not be able to sufficiently reduce the data storage area consumption amount) in the following case.
That is, when a document is printed in which only a specific field of a page (a “company name” field, for example) differs from that of another similar document, either the document is registered as a different document or the page containing the different content is stored as an updated page on the document management server. Accordingly, in certain circumstances, the consumption of storage capacity may not be adequately reduced.
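The limitation can be seen in a small sketch. The page content, the company names, and the use of SHA-256 as the page hash are hypothetical and chosen only to illustrate the point.

```python
import hashlib


def page_digest(page: bytes) -> str:
    return hashlib.sha256(page).hexdigest()


# Hypothetical page layout: identical apart from the "company name" field.
TEMPLATE = ("INVOICE\nCompany name: {company}\n"
            + "Standard terms and conditions.\n" * 50)


def compare_pages(company_a: str, company_b: str) -> dict:
    page_a = TEMPLATE.format(company=company_a).encode("utf-8")
    page_b = TEMPLATE.format(company=company_b).encode("utf-8")
    return {
        # A page-level hash sees the two pages as entirely different.
        "same_hash": page_digest(page_a) == page_digest(page_b),
        # Bytes stored when page_b is saved as an updated page.
        "bytes_stored": len(page_b),
        # Bytes that actually differ in content (the field value itself).
        "field_bytes": len(company_b.encode("utf-8")),
    }


result = compare_pages("Alpha Corp.", "Beta Inc.")
```

Because the page hashes differ, the whole second page is stored even though only the short field value is new; in this sketch `bytes_stored` is far larger than `field_bytes`.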
Similar issues may also arise in a form printing system that, at the time of printing, dynamically inserts data into a predetermined template (form) that includes only ruled lines or a fixed graphic (a logo or the like).