The present exemplary embodiments relate generally to search optimization. They find particular application in conjunction with document processing and/or image processing, and will be described with particular reference thereto. However, it is to be appreciated that the present exemplary embodiments are also amenable to other like applications.
Many problems can be solved by formulating an objective function and then optimizing that function. Optimization generally includes searching a set of candidate solutions for one or more candidate solutions having best values of the objective function. Examples of problems that can be solved in this way include, but are not limited to, object localization, object detection, image categorization, invoice parsing, repeated structure finding, and the like.
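By way of illustration only, the formulate-then-search pattern described above can be sketched as an exhaustive search over a small candidate set. The score map, the window size, and the names `best_candidates` and `window_score` are hypothetical and are not taken from the present disclosure; the objective here is simply the sum of scores inside a candidate window, standing in for a toy object localization problem.

```python
from itertools import product

def best_candidates(candidates, objective):
    """Exhaustively search candidates; return the best value and all candidates attaining it."""
    best_value, best = None, []
    for candidate in candidates:
        value = objective(candidate)
        if best_value is None or value > best_value:
            best_value, best = value, [candidate]
        elif value == best_value:
            best.append(candidate)
    return best_value, best

# Hypothetical per-pixel "objectness" scores for a tiny 4x4 image.
score_map = [
    [0, 1, 0, 0],
    [1, 5, 4, 0],
    [0, 3, 6, 1],
    [0, 0, 1, 0],
]
WINDOW = 2  # assumed side length of the square candidate window

def window_score(top_left):
    """Objective function: total score inside the window anchored at top_left."""
    row, col = top_left
    return sum(score_map[row + dr][col + dc]
               for dr in range(WINDOW) for dc in range(WINDOW))

# Candidate solutions: every top-left corner at which the window fits in the image.
candidates = list(product(range(len(score_map) - WINDOW + 1),
                          range(len(score_map[0]) - WINDOW + 1)))

best_value, best_windows = best_candidates(candidates, window_score)
print(best_value, best_windows)
```

Exhaustive enumeration is practical here only because there are nine candidates; the number of candidates grows quickly with image size and with the number of window shapes considered.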
Object localization generally seeks to find a known specific object, or any object of a known object category (e.g., a face or a car), and/or one or more individual parts of that object within an image. Object detection generally seeks to detect whether an image contains a known object. Image categorization generally seeks to assign category labels to images based on their content. Invoice parsing generally seeks to extract individual line items from an invoice. Repeated structure finding generally seeks to find instances of a repeated structure (e.g., a set of fields that form one line item of an invoice or a set of features that form a face) given one or more instances of the structure of interest.
Existing search methods, both general (such as A*) and specific (such as those tailored to a particular model), are often not efficient enough to handle certain problems. This is common both with relatively small problems that need to be solved quickly (e.g., in real time) and with large and/or complex problems. To improve efficiency and reduce failures, several workarounds are available.
A typical workaround includes modifying a problem's objective function in such a way that optimization becomes easier. For example, a problem's objective function may be modified to assume independence between all or some of the variables involved. In this case, the problem might separate into several independent sub-problems, which allows for more efficient optimization. However, a drawback of such modifications and/or simplifications is that in many cases they compromise the quality of the solution. While this may be acceptable in certain situations, such as when the original problem cannot be solved at all using existing methods, it is not optimal.
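The trade-off described above can be illustrated with a minimal sketch under assumed values. The two variables, their per-variable scores (`unary_x`, `unary_y`), and the `coupling` term are hypothetical and chosen only to make the effect visible: dropping the coupling term lets each variable be optimized on its own (six evaluations instead of nine), but the resulting solution scores worse under the original objective.

```python
from itertools import product

# Hypothetical per-variable scores for two variables, each taking values 0, 1, or 2.
unary_x = {0: 3, 1: 2, 2: 0}
unary_y = {0: 3, 1: 1, 2: 0}

def coupling(x, y):
    """Assumed interaction term: penalize both variables taking the same value."""
    return -4 if x == y else 0

def full_objective(x, y):
    """Original objective, including the dependence between x and y."""
    return unary_x[x] + unary_y[y] + coupling(x, y)

# Joint search over all (x, y) pairs: 3 * 3 = 9 evaluations of the full objective.
joint_best = max(product(unary_x, unary_y), key=lambda xy: full_objective(*xy))

# Simplified objective: assume independence (ignore coupling), so the problem
# separates into two sub-problems with 3 + 3 = 6 evaluations in total.
indep_best = (max(unary_x, key=unary_x.get), max(unary_y, key=unary_y.get))

print("joint:", joint_best, full_objective(*joint_best))
print("independent:", indep_best, full_objective(*indep_best))
```

With only two three-valued variables the savings are small, but the gap widens with more variables: the joint search grows multiplicatively with the number of values per variable, while the separated search grows additively.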
Another typical workaround that can be used is artificially restricting the maximal allowable problem size to a limit that existing search algorithms can handle. For example, in a consumer application, a limit may be set on the maximal size of images that can be processed or on the maximal number of fields per invoice that can be located. However, these restrictions may be too limiting for end users.
The present disclosure contemplates new and improved systems and/or methods for remedying these, and other, problems.