1. Field of the Invention
The present invention relates to cloud-based systems and methods for automated analytics of inputs from remote, distributed devices for security surveillance.
2. Description of the Prior Art
It is known in the prior art to use mobile devices for security surveillance, as well as to analyze image and video content for surveillance purposes. While the prior art discloses individual aspects of the present invention, very few references, if any, teach the ability to authenticate and analyze captured inputs from un-registered user-devices. The present invention permits remote servers to accept captured inputs from a variety of mobile devices, authenticate metadata from the inputs, and analyze the inputs to provide surveillance information.
The proliferation of wireless, mobile devices having image and video functions is widespread and use of these device-functions continues to increase. Sporting events, social gatherings, dissident events, and emergency situations are typically captured on a multitude of devices operated by differing users. Nowhere in the prior art is there provided a social surveillance or security system that allows for uploading of these captured inputs, authentication of such inputs, and cloud-based analysis of the inputs in order to provide real- or near real-time surveillance of a target environment. Prior art documents teach that camera and video input devices may be equipped with a time-stamp function that embeds a date and time into an image or video for later authentication. Also, it is known in the prior art to provide authentication of users and/or devices through the evaluation of uploaded content, including steganographic techniques such as digital fingerprinting and watermarking, or user-verification techniques such as login or CAPTCHA technologies and biometric scanning.
Notably, most of the prior art security surveillance systems disclose the use of fixed devices, rather than the use of mobile devices. For example, content-based analytics is widely used in CCTV settings and when verifying that digital content has been unaltered or authenticating a content's source (e.g., copyrighted music, images and videos). Additionally, similar technology has been deployed in military and law enforcement units, although these technologies typically require specialized pre-registered devices, as opposed to incorporating distributed, unknown devices.
It is known in the prior art that a video surveillance system can be set up at a location with a local recorder and server in addition to cameras. In recent years, with the development of cloud computing and communication technologies, users need access to their surveillance systems anywhere, anytime from their smart mobile devices. Meanwhile, users want not only basic recording from their surveillance systems, but also more advanced preventive and proactive analytics.
Video surveillance systems typically rely on 2-Dimensional (2D) images and/or videos. If high-definition 3D images and/or videos can be generated for surveillance, the security surveillance system could harvest much better information. Camera manufacturers have developed 3D cameras in order to produce 3D videos. However, their prices are much higher than those of regular 2D cameras. For the existing surveillance systems with 2D cameras, it is a huge expense to upgrade to 3D cameras in order to get 3D surveillance.
Thus there is a need for a cloud-based analytics platform that not only provides users access anywhere, anytime via a network-connected device, but also generates 3D images and/or videos based on regular 2D input data from cameras, especially from mobile devices, and provides 3D analytics.
By way of example, prior art documents include:
U.S. Pat. No. 7,259,778 for “Method and apparatus for placing sensors using 3D models” by inventor Aydin Arpa et al. filed Feb. 13, 2004, describes a method and apparatus for dynamically placing sensors in a 3D model. Specifically, in one embodiment, the method selects a 3D model and a sensor for placement into the 3D model. The method renders the sensor and the 3D model in accordance with sensor parameters associated with the sensor and parameters desired by a user. In addition, the method determines whether an occlusion to the sensor is present.
U.S. Pat. No. 7,675,520 for “System, method and computer program for creating two dimensional (2D) or three dimensional (3D) computer animation from video” by inventor Will Gee et al. filed Dec. 7, 2006, describes a system, method and computer program for creating two dimensional (2D) or three dimensional (3D) computer animation from video. In an exemplary embodiment of the present invention a system, method and computer program product for creating at least a two dimensional or three dimensional (3D) datastream from a video with moving objects is disclosed. In an exemplary embodiment of the present invention, a method of creating animated objects in 2D or 3D from video, may include: receiving video information which may include a plurality of frames of digital video; receiving and adding metadata to the video information, the metadata relating to at least one object in motion in the digital video; and interpreting the metadata and the video information and generating a datastream in at least 2D. In an exemplary embodiment, 2D, 3D or more dimensional data may be used to provide an animation of the event of which the video was made. In an exemplary embodiment, a 2D or 3D gametracker, or play reviewer may be provided allowing animation of motion events captured in the video.
U.S. Pat. No. 7,944,454 for “System and method for user monitoring interface of 3-D video streams from multiple cameras” by inventor Hanning Zhou, et al. filed Sep. 7, 2005, describes a user navigation interface that allows a user to monitor/navigate video streams captured from multiple cameras. It integrates video streams from multiple cameras with the semantic layout into a 3-D immersive environment and renders the video streams in multiple displays on a user navigation interface. It conveys the spatial distribution of the cameras as well as their fields of view and allows a user to navigate freely or switch among preset views. This description is not intended to be a complete description of, or limit the scope of, the invention. Other features, aspects, and objects of the invention can be obtained from a review of the specification, the figures, and the claims.
U.S. Pat. No. 8,284,254 for “Methods and apparatus for a wide area coordinated surveillance system” by inventor John Frederick Romanowich, et al. filed Aug. 11, 2005, describes a coordinated surveillance system. The coordinated surveillance system uses a larger number of fixed low resolution detection smart camera devices and a smaller number of pan/tilt/zoom controllable high resolution tracking smart camera devices. The set of detection cameras provide overall continuous coverage of the surveillance region, while the tracking cameras provide localized high resolution on demand. Each monitor camera device performs initial detection and determines approximate GPS location of a moving target in its field of view. A control system coordinates detection and tracking camera operation. A selected tracking camera is controlled to focus in on, confirm detection, and track a target. Based on a verified detection, a guard station is alerted and compressed camera video is forwarded to the guard station from the camera(s). The guard station can direct a patrol guard to the target using GPS coordinates and a site map.
U.S. Pat. No. 8,721,197 for “Image device, surveillance camera, and mask method of camera screen” by inventor Hiroyuki Miyahara, et al. filed Aug. 10, 2012, describes a microcomputer. In a microcomputer included in an image device, a mask 2D 3D converting section expresses coordinates of a 2-dimensional image plane defined by an imaging element having a rectangular contour in a 3-dimensional coordinate system. The image plane is positioned in the state that a focal length corresponding to a zoom position is adopted as a Z coordinate value of the image plane in the 3-dimensional coordinate system. A mask display position calculating section 165 calculates a 2-dimensional position of a mask on a camera screen by utilizing a similarity of the size of the image plane and the size of the camera screen when a position of a mask on the image plane in the 3-dimensional coordinate system after PAN, TILT rotations and a zooming is converted into the 2-dimensional position of the mask on the camera screen.
U.S. Publication 2013/0141543 for “Intelligent image surveillance system using network camera and method therefor” by inventor Sung Hoon Choi, et al. filed May 23, 2012, describes an intelligent control system. The intelligent control system according to an exemplary embodiment of the present disclosure includes a plurality of network cameras to photograph a surveillance area; an image gate unit to perform image processing of image data, which is input from the plurality of network cameras, according to a specification that is requested by a user; a smart image providing unit to convert a plurality of image streams, which are image processed by the image gate unit, to a single image stream; and an image display unit to generate a three-dimensional (3D) image by segmenting, into a plurality of images, the single image stream that is input from the smart image providing unit and by disposing the segmented images on corresponding positions on a 3D modeling.
U.S. Publication 2014/0192159 for “Camera registration and video integration in 3d geometry model” by inventor Henry Chen, et al. filed Jun. 14, 2011, describes apparatus, systems, and methods to receive a real image or real images of a coverage area of a surveillance camera. Building Information Model (BIM) data associated with the coverage area may be received. A virtual image may be generated using the BIM data. The virtual image may include at least one three-dimensional (3-D) graphics that substantially corresponds to the real image. The virtual image may be mapped with the real image. Then, the surveillance camera may be registered in a BIM coordination system using an outcome of the mapping.
U.S. Publication 2014/0333615 for “Method For Reconstructing 3D Scenes From 2D Images” by inventor Srikumar Ramalingam, et al. filed May 11, 2013, describes a method for reconstructing a three-dimensional (3D) real-world scene from a single two-dimensional (2D) image by identifying junctions satisfying geometric constraints of the scene based on intersecting lines, vanishing points, and vanishing lines that are orthogonal to each other. Possible layouts of the scene are generated by sampling the 2D image according to the junctions. Then, an energy function is maximized to select an optimal layout from the possible layouts. The energy function uses a conditional random field (CRF) model to evaluate the possible layouts.
U.S. Pat. No. 8,559,914 for “Interactive personal surveillance and security (IPSS) system” by inventor Jones filed Jan. 16, 2009, describes an interactive personal surveillance and security (IPSS) system for users carrying wireless communication devices. The system allows users carrying these devices to automatically capture surveillance information, have the information sent to one or more automated and remotely located surveillance (RLS) systems, and establish interactivity for the verification of determining secure or dangerous environments, encounters, logging events, or other encounters or observations. This IPSS is described as enhancing security and surveillance by determining a user's activities, including (a.) the user travel method (car, bus, motorcycle, bike, snow skiing, skate boarding, etc.); (b.) the user motion (walking, running, climbing, falling, standing, lying down, etc.); and (c.) the user location and the time of day or time allowance of an activity. When a user submits uploaded (or directly sent) surveillance information to the public server, the surveillance videos, images and/or audio include one or more of these searchable areas: location, address, date and time, event name or category, and/or name describing the video.
U.S. Pat. No. 8,311,983 for “Correlated media for distributed sources” by inventor Guzik filed Dec. 14, 2009 (related to U.S. Publications 2010/0274816, 2011/0018998, 2013/0027552 and 2013/0039542) discloses method embodiments associating an identifier along with correlating metadata such as date/timestamp and location. The identifier may then be used to associate data assets that are related to a particular incident. The identifier may be used as a group identifier on a web service or equivalent to promote sharing of related data assets. Additional metadata may be provided along with commentary and annotations. The data assets may be further edited and post processed. Correlation can be based on multiple metadata values. For example, multiple still photos might be stored not only with date/time stamp metadata, but also with location metadata, possibly from a global positioning satellite (GPS) stamp. A software tool that collects all stored still photos taken within a window of time, for example during a security or police response to a crime incident, and close to the scene of a crime, may combine the photos of the incident into a sequence of pictures for investigation purposes. Here the correlation is both by time and location, and the presentation is a non-composite simultaneous display of different data assets. Correlating metadata can be based on a set of custom fields. For example, a set of video clips may be tagged with an incident name. Consider three field police officers each in a different city and in a different time zone, recording videos and taking pictures exactly at midnight on New Year's Day 2013. As a default, a group may be identified to include all users with data files with the same Event ID.
A group may also be either a predefined or a self-selecting group, for example a set belonging to a security agency, or a set of all police officers belonging to the homicide division, or even a set of officers seeking to share data regardless of whether they belong to an organized or unorganized group.
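By way of illustration only, the correlation by time and location described in the Guzik reference may be sketched as follows. This is a minimal sketch under stated assumptions, not the method of that reference: the names `Asset`, `haversine_km`, and `correlate`, as well as the use of a great-circle distance and a fixed radius, are hypothetical choices made for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt


@dataclass
class Asset:
    """Hypothetical data asset carrying date/time and GPS metadata."""
    name: str
    taken_at: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    earth_radius_km = 6371.0
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


def correlate(assets, incident_time, incident_lat, incident_lon, window, radius_km):
    """Return assets within the time window and radius, ordered by capture time."""
    hits = [
        a for a in assets
        if abs(a.taken_at - incident_time) <= window
        and haversine_km(a.lat, a.lon, incident_lat, incident_lon) <= radius_km
    ]
    return sorted(hits, key=lambda a: a.taken_at)
```

In this sketch, an asset is kept only when both its timestamp and its location fall near the incident, and the surviving assets are returned in chronological order so that they can be presented as a sequence.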
U.S. Pat. No. 7,379,879 for “Incident reporting system and method” by inventor Sloo filed Feb. 26, 1999, describes a computer-based method of collecting and processing incident reports received from witnesses who observe incidents such as criminal acts and legal violations. The method automates the collection and processing of the incident reports and automatically sends the incident reports to the appropriate authority so that the observed incidents can be acted on in an appropriate manner. For example, a witness may be equipped with a video input system such as a personal surveillance camera and a display. When the witness encounters an incident such as a suspect committing a crime, the video input system would automatically recognize the suspect from the video input and could then display records for the suspect on the witness's hand held readout without revealing the suspect's identity. The witness would not need to know the identity of the suspect to observe the incident relating to the suspect. Such a system may overcome some of the problems associated with publicly revealing personal data.
U.S. Publication 2009/0087161 for “Synthesizing a presentation of a multimedia event” by inventors Roberts, et al. filed Sep. 26, 2008, discloses a media synchronization system that includes a media ingestion module to access a plurality of media clips received from a plurality of client devices, a media analysis module to determine a temporal relation between a first media clip from the plurality of media clips and a second media clip from the plurality of media clips, and a content creation module to align the first media clip and the second media clip based on the temporal relation, and to combine the first media clip and the second media clip to generate the presentation. Each user who submits content may be assigned an identity (ID). Users may upload their movie clips to an ID assignment server, attaching metadata to the clips as they upload them, or later as desired. This metadata may, for example, include the following: Event Name, Subject, Location, Date, Timestamp, Camera ID, and Settings. In some example embodiments, additional processing may be applied as well (e.g., by the recognition server and/or the content analysis sub-module). Examples of such additional processing may include, but are not limited to, the following: Face, instrument, or other image or sound recognition; Image analysis for bulk features like brightness, contrast, color histogram, motion level, edge level, sharpness, etc.; Measurement of (and possible compensation for) camera motion and shake.
U.S. Publication 2012/0282884 for “System and method for the emergency voice and image e-mail transmitter device” by inventor Sun filed May 5, 2011, describes a voice and image e-mail transmitter device with an external camera attachment that is designed for emergency and surveillance purposes. The device converts voice signals and photo images into digital format, which are transmitted to the nearest voice-image message receiving station from where the digital signal strings are parsed and converted into voice, image, or video message files which are attached to an e-mail and delivered to user pre-defined destination e-mail addresses and a 911 rescue team. The e-mail also includes the caller's voice and personal information, photo images of a security threat, device serial number, and a GPS location map of the caller's location. When the PSU device is initially used, the user needs to pre-register personal information, and whenever a digital signal string is transmitted out from the PSU device it will include this personal information plus a time code of the message being sent, the PSU device's unique serial number, and the GPS generated location code, etc., which will all be embedded in the PSU e-mail.
U.S. Publication 2012/0262576 for “Method and system for a network of multiple live video sources” by inventors Sechrist, et al. filed Mar. 15, 2012, discloses a system and a method that operate a network of multiple live video sources. In one embodiment, the system includes (i) a device server for communicating with one or more of the video sources each providing a video stream; (ii) an application server to allow controlled access of the network by qualified web clients; and (iii) a streaming server which, under direction of the application server, routes the video streams from the one or more video sources to the qualified web clients.
Geo-location information and contemporaneous timestamps may be embedded in the video stream together with a signature of the encoder, providing a mechanism for self-authentication of the video stream. A signature that is difficult to falsify (e.g., digitally signed using an identification code embedded in the hardware of the encoder) provides assurance of the trustworthiness of the geo-location information and timestamps, thereby establishing reliable time and space records for the recorded events. In general, data included in the database may be roughly classified into three categories: (i) automatically collected data; (ii) curated data; and (iii) derivative data. Automatically collected data includes, for example, such data as readings from environmental sensors and system operating parameters, which are collected as a matter of course automatically. Curated data are data that are collected from examination of the automatically collected data or from other sources and include, for example, content-based categorization of the video streams. For example, detection of a significant amount of motion at speeds typical of automobiles may suggest that the content is “traffic.” Derivative data includes any data resulting from analysis of the automatically collected data, the curated data, or any combination of such data. For example, the database may maintain a ranking of video sources based on viewership or a surge in viewership over a recent time period. Derivative data may be generated automatically or upon demand.
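By way of illustration only, the self-authentication of geo-location information and timestamps described in the Sechrist reference may be sketched as follows. This is a minimal sketch under stated assumptions, not the mechanism of that reference: a keyed hash (HMAC-SHA256) from the Python standard library stands in for the hardware-backed digital signature, and the names `sign_record`, `verify_record`, and `device_key` are hypothetical. Unlike a true digital signature, an HMAC requires the verifier to hold the same secret key as the encoder.

```python
import hashlib
import hmac
import json


def sign_record(device_key: bytes, record: dict) -> str:
    """Compute a keyed digest over a canonical JSON encoding of the metadata."""
    # Sorting keys makes the encoding independent of dictionary insertion order.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(device_key, payload, hashlib.sha256).hexdigest()


def verify_record(device_key: bytes, record: dict, signature: str) -> bool:
    """Return True only if the metadata matches the signature for this key."""
    expected = sign_record(device_key, record)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signature)
```

In this sketch, any change to the latitude, longitude, or timestamp after signing causes verification to fail, which is the property that allows a receiving server to treat the embedded time and space records as trustworthy.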
None of the prior art provides solutions for cloud-based 3D analytics for a target surveillance area as provided by the present invention.