This invention relates in general to automated processing of multiple items of data and, more particularly, to a method and apparatus for facilitating scalability during such automated data processing.
There are a variety of situations in which automated processing of a number of data items is desirable. One specific example of such an application is product catalogs. Product catalogs, whether in the form of a paper catalog or an Internet “Web” site, frequently have numerous pictures which each depict a respective one of the various items that are available for sale. Many years ago, these pictures were prepared using optical negatives and photographs. Currently, however, the trend is to maintain and process these pictures in the form of computer files containing digital images.
A given paper or on-line catalog will usually include products from a variety of different manufacturers, and it is common for each manufacturer to provide its own digital images. There will typically be variation between the form of images provided by different manufacturers, for example in terms of characteristics such as the size, shape, resolution, tint, and so forth. It is even possible that the images from a single given manufacturer may have different forms. Accordingly, in order for the images throughout a catalog to have a generally similar appearance, the various images from various sources need to be processed to adjust characteristics such as size, shape, resolution, and/or tint, so as to bring them into general conformity with each other.
A further consideration is that a manufacturer's images do not represent a static situation, because manufacturers are constantly adding new products with new images, discontinuing existing products and associated images, and providing updated images for existing products. Moreover, there may be other reasons for adjusting images. For example, with respect to a paper or on-line catalog intended for use during the Christmas season, there may be a desire to put a festive frame around each image, such as a frame of holly leaves and berries. Moreover, stylistic changes in the images are often desirable.
The traditional approach for carrying out these various types of image processing tasks has involved manual adjustments effected on an image-by-image basis, through use of image processing software requiring extensive operator interaction. However, this is extremely time-consuming and expensive. Many organizations currently employ a number of graphic artists to do this work, at great expense.
A less common approach has been the preparation of a hard-coded software routine to process images, written in line-by-line source code. However, these routines are time-consuming and expensive to generate, are likely to include errors or “bugs”, and have little flexibility because they cannot be modified quickly and cheaply. Moreover, they can only be prepared and executed by a skilled programmer, rather than by a graphic artist who is skilled in image processing but has limited computer skills. It is difficult to find persons who have both artistic and computer skills, and they command large salaries.
Thus, while these traditional approaches have been generally adequate for their intended purposes, they have not been satisfactory in all respects. In this regard, to the extent that preexisting approaches have involved automation, there has been no meaningful provision for scalability that would facilitate increased efficiency. For example, a given hard-coded software routine could be used with various different data sets, some of which may be larger than others. In the case of a relatively large data set, the processing time from start to completion can be significant, but there is no meaningful provision for avoiding this inefficiency. In fact, in a dedicated software routine which is hard-coded, it can be virtually cost-prohibitive to add sophisticated techniques to increase efficiency.
Another related situation is where a given computer and hard-coded software routine are needed for processing several different sets of data at approximately the same point in time. One existing approach is to set priorities and to process each of the data sets in a sequential manner, but this means that the data sets which are given a lower priority may experience substantial delay before they are processed. As noted above, preexisting software routines have no meaningful capability to deal with such inefficiency.
From the foregoing, it may be appreciated that a need has arisen for a method and apparatus for facilitating scalability during automated data processing, in a manner which is efficient and cost-effective. According to the present invention, a method and apparatus are provided to address this need.
In particular, one form of the present invention involves providing a computer system having a plurality of processors, including first and second processors, and executing in the computer system on one of the processors thereof a first procedure which selectively launches execution in the computer system of respective project definitions in response to respective requests for execution thereof. Each of the project definitions includes: a plurality of function portions which each correspond to one of a plurality of predetermined function definitions that are different, and which each define at least one input port and at least one output port that are functionally related according to the corresponding function definition; a further portion which includes a source portion identifying a data source and defining an output port through which data from the data source can be produced, and which includes a destination portion identifying a data destination and defining an input port through which data can be supplied to the data destination; and binding information which includes binding portions that each associate a respective input port with one of the output ports. 
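By way of illustration only, the structure of a project definition described above might be modeled as in the following minimal sketch. All class, field, and port names here (such as `ProjectDefinition`, `FunctionPortion`, `Resize`, and `incoming_images`) are hypothetical and are not taken from the invention itself; the sketch merely shows function portions with input and output ports, a source portion, a destination portion, and binding portions that each associate an input port with an output port.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Port:
    owner: str  # name of the portion that defines the port
    name: str   # name of the port within that portion


@dataclass
class FunctionPortion:
    name: str
    function_definition: str  # the predetermined function definition it corresponds to
    input_ports: List[str]
    output_ports: List[str]


@dataclass
class SourcePortion:
    name: str
    data_source: str
    output_port: str = "out"


@dataclass
class DestinationPortion:
    name: str
    data_destination: str
    input_port: str = "in"


@dataclass
class Binding:
    input_port: Port   # the input port being fed
    output_port: Port  # the output port that feeds it


@dataclass
class ProjectDefinition:
    functions: List[FunctionPortion]
    source: SourcePortion
    destination: DestinationPortion
    bindings: List[Binding] = field(default_factory=list)

    def unbound_inputs(self) -> List[Port]:
        """Return the input ports that no binding associates with an output port."""
        inputs = [Port(f.name, p) for f in self.functions for p in f.input_ports]
        inputs.append(Port(self.destination.name, self.destination.input_port))
        bound = {b.input_port for b in self.bindings}
        return [p for p in inputs if p not in bound]


# Hypothetical example: source -> resize -> tint -> destination.
project = ProjectDefinition(
    functions=[
        FunctionPortion("resize", "Resize", ["in"], ["out"]),
        FunctionPortion("tint", "AdjustTint", ["in"], ["out"]),
    ],
    source=SourcePortion("src", "incoming_images"),
    destination=DestinationPortion("dst", "catalog_images"),
    bindings=[
        Binding(Port("resize", "in"), Port("src", "out")),
        Binding(Port("tint", "in"), Port("resize", "out")),
        Binding(Port("dst", "in"), Port("tint", "out")),
    ],
)
```

In this hypothetical example every input port is bound, so `project.unbound_inputs()` returns an empty list; removing any one binding would cause the corresponding input port to be reported.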
This form of the invention further involves: providing second and third procedures which can be respectively executed by the first and second processors and which can each effect execution of the project definitions; causing the first procedure to respond to a request for execution of a first of the project definitions by launching execution in the first processor of the first project definition by the second procedure; and causing the first procedure to respond to a request for execution of a second of the project definitions during execution of the first project definition in the first processor by evaluating at least one of the second project definition and a current operational characteristic of the first processor, by selecting one of the first and second processors in dependence on a result of the evaluation, and by then launching execution in the selected one of the first and second processors of the second project definition by a respective one of the second and third procedures.
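The selection between the first and second processors might, purely as an illustrative sketch, proceed as follows. The sketch assumes that the evaluated operational characteristic is the number of project definitions currently executing on the first processor, and that a limit of one is applied; both are assumptions made for illustration, as are the names `Processor`, `select_processor`, and `launch`. For simplicity the same evaluation is applied to every request, so that a request arriving while the first processor is idle is launched there.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    name: str
    active_projects: int = 0  # assumed "current operational characteristic"


def select_processor(first: Processor, second: Processor, max_active: int = 1) -> Processor:
    """Choose the first processor unless it has reached the assumed load limit."""
    if first.active_projects < max_active:
        return first
    return second


def launch(project_name: str, first: Processor, second: Processor) -> str:
    """Stand in for the first procedure: select a processor and record the launch.

    A real system would hand the project definition to the procedure running
    on the chosen processor; here only the bookkeeping is shown.
    """
    chosen = select_processor(first, second)
    chosen.active_projects += 1
    return chosen.name


cpu0 = Processor("cpu0")
cpu1 = Processor("cpu1")
first_choice = launch("project_a", cpu0, cpu1)   # first processor is idle
second_choice = launch("project_b", cpu0, cpu1)  # first processor is now busy
```

Under these assumptions the first request is launched on `cpu0` and the second, arriving while the first is still executing, is launched on `cpu1`.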
Another form of the present invention involves: providing a computer system having a plurality of processors; executing in the computer system a first procedure which selectively launches execution by at least one second procedure in the computer system of project definitions in response to respective requests for execution thereof, each of the project definitions obtaining data from a data source, processing the data according to a project definition, and placing the data in a data destination; and causing the first procedure to respond to a request for execution of a selected one of the project definitions by evaluating a predetermined characteristic of the selected project definition, the first procedure being responsive to a determination that the characteristic does not exceed a threshold for launching execution in the computer system of one instance of the selected project definition to process all of the data from the corresponding data source, and being responsive to a determination that the characteristic exceeds the threshold for launching execution in the computer system of separate first and second instances of the selected project definition to respectively process mutually exclusive first and second portions of the data from the corresponding data source.
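The threshold-based behavior of this form might be sketched as follows. The sketch assumes that the predetermined characteristic is the number of data items in the data source and that the data is divided into two halves; the source text does not specify either choice, and the function name `plan_instances` is hypothetical.

```python
from typing import List, Sequence


def plan_instances(items: Sequence[str], threshold: int) -> List[List[str]]:
    """Return the portion of the data that each launched instance would process.

    At or below the threshold, one instance processes all of the data.
    Above the threshold, two instances process mutually exclusive portions.
    """
    if len(items) <= threshold:
        return [list(items)]
    middle = len(items) // 2
    return [list(items[:middle]), list(items[middle:])]


small_plan = plan_instances(["a.jpg", "b.jpg", "c.jpg"], threshold=5)
large_plan = plan_instances(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"], threshold=3)
```

Here `small_plan` contains a single portion holding all three items, whereas `large_plan` contains two portions that share no items and together cover the whole data source.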