Imaging devices such as video cameras may capture and record still or moving images in digital computer-based files that may be stored in one or more hard drives, servers or other non-transitory computer-readable media. For example, video cameras are frequently provided in financial settings such as banks or casinos, where money changes hands in large amounts or at high rates of speed, or within warehouses, fulfillment centers or other like facilities, where consumer goods commonly arrive or depart in containers of various sizes or shapes, as well as in locations such as airports, stadiums or other dense environments, where the travels of persons or objects, or the flow of traffic on one or more sidewalks, roadways or highways may be observed.
While files that include such imaging data may be individually captured and stored with relative ease, where a large number of cameras are provided in order to monitor various aspects of a particular space, location or facility, the amount of digital storage capacity and computer processing power that is required in order to centrally analyze, index and store such files for any relevant purpose may be overwhelming. Where a facility such as a warehouse or an airport provides a large array of digital cameras for surveillance or monitoring operations, such cameras may capture and store over a petabyte (or a million gigabytes) of video data each day.
For example, many high-level computer vision algorithms, including but not limited to machine learning algorithms provided for reading bar codes, recognizing characters or detecting activities, or image processing algorithms for transforming, combining, measuring or converting images, may consume substantial amounts of computer resources in order to properly evaluate imaging data for their intended purposes. Such vision algorithms, or vision systems employing such algorithms, often require large numbers of files and extensive amounts of visual data, e.g., streams of videos or images, to be transferred to one or more sites or resources for temporary storage, at least until the evaluation of the imaging data is complete. Singly or collectively, such factors may complicate the execution of computer vision and/or machine learning algorithms, and render such algorithms expensive to operate in terms of the network bandwidth, storage capacity or processing power that each algorithm may require.
The performance and success of individual computer vision and machine learning algorithms, especially such algorithms that may be operated in large-scale systems with many individual imaging devices having independent fields of view, may vary considerably due to the limited size and variability of the data sets that are used to train such algorithms, as well as a lack of dimensionality within the algorithms, and the nature and quality of the incoming imaging data to be processed. Presently, the processing of imaging data according to such complex algorithms often fails, frequently or abruptly, and in inexplicable or unpredictable ways. The limited reliability of such computer vision or machine learning algorithms, and of the computer systems on which such algorithms operate, poses a serious problem in large-scale applications in which imaging data is processed in multiple stages. In such applications, a failure of one or more intermediate steps of an algorithm, or of a computer system during the performance of such steps, may subvert an entire process, thereby rendering useless the computer resources consumed prior to such failure (e.g., resources consumed in order to transfer, store or process imaging data).
For example, algorithms that are configured to recognize events or activities within surveillance footage may commonly fail at later stages, in which a lack of contrasting or distinct visual features within the footage may negatively affect the reliable classification of events or activities performed by or associated with one or more people or objects recognized therein. By the time one or more of such algorithms is identified as having failed, however, a substantial amount of computer resources may have already been expended in transferring, temporarily or permanently storing, or subsequently processing the footage to detect one or more aspects of motion or to recognize one or more humans or objects therein.
When conducting surveillance or monitoring operations, video cameras may be aligned and configured to capture imaging data including still or moving images of objects, actions or events within their respective fields of view, and information regarding the captured imaging data or the observed objects, actions or events may be recorded and subjected to further analysis in order to identify aspects, elements or features of the content expressed therein. Such video cameras may be provided alone or in groups, and programmed to recognize when an action or event has occurred, such as when a frame-to-frame analysis of video imagery suggests that a predetermined threshold has been exceeded or that a predetermined condition has been satisfied, or otherwise implies that the action or the event has occurred based on information or data captured by the video cameras. Moreover, information and data captured by such video cameras may be archived in one or more data stores, where the information or data may be further analyzed at a later time, or used or recalled for any purpose.
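By way of illustration only, the frame-to-frame analysis described above may be sketched as follows. The sketch assumes that each frame is a small grayscale array of pixel intensities, that the measure of change is the mean absolute difference between consecutive frames, and that an event is implied when that measure exceeds a predetermined threshold. The names `mean_abs_diff` and `detect_events`, the threshold value and the sample frames are hypothetical and are not drawn from any particular camera or system.

```python
from typing import List, Sequence

# Assumption: a frame is a grid of grayscale intensities in the range 0-255.
Frame = Sequence[Sequence[int]]


def mean_abs_diff(prev: Frame, curr: Frame) -> float:
    """Average absolute per-pixel intensity change between two same-sized frames."""
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += abs(c - p)
            count += 1
    return total / count if count else 0.0


def detect_events(frames: Sequence[Frame], threshold: float) -> List[int]:
    """Return indices of frames whose change from the prior frame exceeds the threshold."""
    events = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > threshold:
            events.append(i)
    return events


if __name__ == "__main__":
    # Hypothetical 2x2 frames: one pixel brightens sharply in the third frame.
    static = [[10, 10], [10, 10]]
    moved = [[10, 200], [10, 10]]
    print(detect_events([static, static, moved, moved], threshold=5.0))  # prints [2]
```

A simple measure of this kind is inexpensive to compute at or near a camera, but it responds to any change in intensity, including changes in lighting, and is therefore only a coarse indication that an action or event may have occurred.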