1. Field of the Invention
The present invention relates to a print management system formed by connecting an information processing apparatus that generates print data to a printing apparatus that receives the print data, to the printing apparatus and the information processing apparatus and control methods thereof, and to a program.
2. Description of the Related Art
There exists a system that stores print data and print log information (e.g., user name) in association with each other and manages them in order to track printed confidential information (Japanese Patent Laid-Open No. 08-147446). Upon detection of disclosure of confidential information, this system searches for print data similar to the disclosed words or image and allows browsing of the print log information for print data with high similarity. This system is called a Job Archive System and is hereinafter abbreviated as a JA.
This system includes a JA client unit (JA Agent) that runs on a printer and a JA server unit that runs on a normal PC or a server computer. The JA client unit and JA server unit are connected via a network.
The JA client unit intercepts print data that a client PC has sent to the printer for printing, before the data is actually printed on paper, and transmits the data and print log information to the JA server unit. The JA server unit segments the print data into pages and then segments each page into text regions and image regions, thereby generating search data for each region. The JA server unit integrates the print data of one page, text region information, image region information, text region search data, and image region search data in association to generate storage data for each page. The JA server unit also integrates the original print data and the storage data of each page in association to generate storage data for each set of print data and saves it in the storage device.
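The association of data described above can be pictured with a minimal sketch. The class names, the form-feed page separator, the "IMG:" marker for image regions, and the way search data is derived (lower-cased text, or a hash for images) are all assumptions made only for illustration; they are not taken from the JA described here.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Region:
    kind: str         # "text" or "image"
    content: bytes    # region information (here simply the raw bytes)
    search_data: str  # search data generated for this region


@dataclass
class PageStorage:
    page_data: bytes       # print data of one page
    regions: List[Region]  # text and image regions with their search data


@dataclass
class JobStorage:
    original_print_data: bytes
    print_log: Dict[str, str]  # e.g. {"user": "..."}
    pages: List[PageStorage]


def make_search_data(kind: str, content: bytes) -> str:
    # Assumption: text is searched by lower-cased words, images by a hash.
    if kind == "text":
        return content.decode("utf-8", errors="replace").lower()
    return hashlib.sha256(content).hexdigest()


def split_regions(page: bytes) -> List[Region]:
    # Assumption: each non-empty line is one region; "IMG:" marks an image.
    regions = []
    for line in page.split(b"\n"):
        if not line:
            continue
        kind = "image" if line.startswith(b"IMG:") else "text"
        regions.append(Region(kind, line, make_search_data(kind, line)))
    return regions


def build_job_storage(print_data: bytes, print_log: Dict[str, str]) -> JobStorage:
    # Assumption: pages are separated by a form feed character.
    pages = [p for p in print_data.split(b"\f") if p]
    page_storage = [PageStorage(p, split_regions(p)) for p in pages]
    return JobStorage(print_data, print_log, page_storage)


job = build_job_storage(b"Hello World\nIMG:abc\fSecond page", {"user": "alice"})
```

In this sketch, each JobStorage keeps the original print data together with the per-page storage data, which mirrors the two levels of association described above.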
Nationwide or worldwide enterprises have several tens to several thousands of bases, including branch offices and business offices, where each worker holds his/her own PC, and one printer is installed for several to several tens of workers. There are market needs for such enterprises to prevent disclosure of confidential information by introducing a JA and storing print data sent from individual PCs to a base printer in a set of JA servers installed in the headquarters or head office. Print data to be stored in the JA server is estimated to be several hundred GB/day (= several thousand persons × several tens of pages/day/person × several hundred KB/page).
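The estimate above can be checked with a short calculation. The concrete figures below (5,000 persons, 60 pages per person per day, 800 KB per page) are hypothetical values chosen only to fall within the stated ranges, and 1 GB is taken as 1,000,000 KB.

```python
def daily_volume_gb(persons: int, pages_per_person: int, kb_per_page: int) -> float:
    """Return the estimated print data volume per day in GB (1 GB = 1,000,000 KB)."""
    total_kb = persons * pages_per_person * kb_per_page
    return total_kb / 1_000_000


# Hypothetical example values within the ranges given in the text.
estimate = daily_volume_gb(persons=5000, pages_per_person=60, kb_per_page=800)
print(f"{estimate:.0f} GB/day")  # prints "240 GB/day"
```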
Typical storage devices are not equipped to store such an enormous quantity of data. It is usually necessary to purchase a storage device capable of distributing storage modules by using a network function. However, such a device is very expensive.
In a multiple-copy print mode, some software applications transmit identical print data as many times as the designated number of copies, causing the JA server to store the same data multiple times. Storing the same data multiple times wastes storage space and is therefore inefficient.
One method of solving the storage problem would be to install a base server at each base. This, however, would not be cost effective. Additionally, since the JA server unit is put under heavy load upon receiving and storing data, it is necessary to minimize new processes.