The digital camera, smart phone, and tablet revolution has greatly increased the prevalence of document sharing. The use of the built-in “native” image sensing capabilities (e.g., camera function, video function, etc.) of these devices to capture an image of a target object (e.g., document) to be shared is also increasing. Certain object capture operations convert physical content (e.g., paper documents lying on the surface of a desk) to electronically stored objects (e.g., “bitmaps” or “pictures” of the document) that can be uploaded for various purposes (e.g., analysis, storing, sharing, etc.). In many cases, native image processing application programming interfaces (APIs) might be supported by the developer of the operating system of the capture device. Such APIs can be accessed by applications (or “apps”) developed by a third-party provider such as a shared content storage service provider. The native APIs facilitate rapid development of various apps such as apps for document capture and sharing. As an example, certain native image processing features that are available from an operating system developer or third-party provider might return, in response to an API call, the coordinates of all detected polygons in a particular still image or video frame. This information might then be used to determine the boundary of a target document that the user desires to capture and share.
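As a rough illustration of how such polygon coordinates might be used, the following Python sketch selects the largest detected polygon as a candidate document boundary. The input format (a list of polygons, each a list of (x, y) vertex tuples) and the names `polygon_area` and `largest_polygon` are assumptions made for illustration only and do not correspond to any particular native API.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Polygon = List[Point]


def polygon_area(polygon: Polygon) -> float:
    """Return the unsigned area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_polygon(polygons: List[Polygon]) -> Optional[Polygon]:
    """Pick the polygon with the greatest area, or None if none were detected."""
    if not polygons:
        return None
    return max(polygons, key=polygon_area)


# Hypothetical polygons, as a native API call might return them for one frame.
frame_polygons = [
    [(10, 10), (110, 10), (110, 60), (10, 60)],  # small rectangle
    [(0, 0), (300, 0), (300, 400), (0, 400)],    # document-sized rectangle
]
candidate_boundary = largest_polygon(frame_polygons)
```

Choosing by area alone is only one possible heuristic; it treats each frame in isolation and makes no use of the polygons detected in earlier frames.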
Unfortunately, relying merely on the native image processing capabilities of a smart phone, digital camera, or other digital image capture device can present challenges when determining the boundary of a target document that the user desires to capture and share. The image processing capabilities of the foregoing devices and/or of the software that runs on these devices are often deficient, at least in that the native capabilities do not provide enough information about the particular still image or video frame. For example, some image processing approaches merely return a set of polygons that are detected by the native hardware and/or native software. In situations that introduce hand or camera movements, or in the presence of inherent imaging variations (e.g., due to refocusing, lighting or shadow changes, etc.), the set of polygons can change substantially from one captured frame to the next, resulting in instabilities across the successive sets of polygons. Such instabilities can occur, for example, at 30 frames per second (fps), thereby resulting in a temporal inter-frame uncertainty associated with the boundary of the intended target document.
When such an uncertain target document boundary is drawn on the capture screen of the user device, a jitter characteristic is often visually observable, which is an annoyance to the user. In some cases, the jitter can result in a boundary that moves (i.e., jitters) between the boundary of the target document and another area of the capture screen. In other cases, the jitter can result in a boundary that does not stably reflect the correct boundary of the intended target document. As an example of this, the largest detected polygon might correctly define or approximate the boundary of the target document in one frame, while in another frame (e.g., a next frame), the defined or approximated boundary might be formed from a polygon comprising three edges of the target document in combination with some other edge that is present in the frame (e.g., an edge of another document, a shadow, etc.). In such cases, both the visible jitter and the inaccurate target document boundary might detract from the user's experience with the app.
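The jitter described above can be made concrete with a small Python sketch that measures how far the drawn boundary moves between two consecutive frames. The sketch assumes that each boundary is a list of (x, y) corner tuples given in the same order in both frames; the function name `mean_corner_displacement` and the sample coordinates are hypothetical and are used only for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Boundary = List[Point]


def mean_corner_displacement(prev: Boundary, curr: Boundary) -> float:
    """Average distance, in pixels, that corresponding corners moved between frames."""
    if not prev or len(prev) != len(curr):
        raise ValueError("boundaries must be non-empty and have the same number of corners")
    return sum(math.dist(p, c) for p, c in zip(prev, curr)) / len(prev)


# Hypothetical boundaries chosen in two consecutive frames.
frame_a = [(0, 0), (300, 0), (300, 400), (0, 400)]  # follows the document edges
frame_b = [(0, 0), (300, 0), (300, 430), (0, 430)]  # bottom edge replaced by a shadow edge

# Two corners stay put and two move by 30 pixels each.
displacement = mean_corner_displacement(frame_a, frame_b)
```

A displacement that is repeatedly large from one frame to the next, while the document itself has not moved, would correspond to the visible jitter on the capture screen.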
What is needed is a technological solution for reducing the uncertainty in determining the boundary of a target capture object so as to reduce or eliminate jitter and/or boundary inaccuracy. Some of the approaches described in this background section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.