Ultra-high definition (UHD) image sensors, which have a large image format and small pixel pitch, are becoming commonly available for use in numerous new products and applications. However, conventional video architectures generally do not support bandwidth and timing requirements of UHD sensors. New video architectures that support the bandwidth and timing requirements of UHD sensors have been developed; however, these new video architectures are generally developed from scratch for particular uses without taking advantage of previously available hardware.
Improvements in UHD sensor technologies vastly exceed the bandwidth and transport capabilities of many existing video transport architectures. An extensive infrastructure of existing video hardware that is designed and configured for transporting high definition (HD) video is deployed and installed in equipment throughout the world. This infrastructure generally does not support transport of video data from UHD video cameras to a display or end user.
Existing HD video architectures are generally configured for processing streams of video data that conform to one or more standard formats, such as the Society of Motion Picture and Television Engineers (SMPTE) standards SMPTE 292M and SMPTE 424M. These standards include a 720p high definition television (HDTV) format, in which video data is formatted in frames having 720 horizontal lines and an aspect ratio of 16:9. The SMPTE 292M standard, for example, includes a 720p format with a resolution of 1280×720 pixels.
A common transmission format for HD video data is 720p60, in which the video data in 720p format is transmitted at 60 frames per second. The SMPTE 424M standard includes a 1080p60 transmission format in which data in 1080p format is transmitted at 60 frames per second. The video data in 1080p format is sometimes referred to as “full HD” and has a resolution of 1920×1080 pixels.
A large number of currently deployed image detection systems are built in conformance with HD video standards, such as the commonly used 720p standard. The 1280×720 pixel frames of a 720p standard system include about 0.9 megapixels per frame. In contrast, UHD image sensors generally output image frames in 5k×5k format, which have about 25 million pixels per frame. Therefore, a single 1280×720 frame of a 720p standard system cannot carry the much larger number of pixels generated by a UHD image sensor.
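The pixel-count comparison can be checked with a short calculation. The following is a minimal sketch, assuming the 5k×5k format means 5000×5000 pixels (consistent with the figure of about 25 million pixels); the names `pixels_per_frame` and `frames_needed` are hypothetical and used only for illustration.

```python
import math

# Frame dimensions as (width, height) in pixels.
HD_720P = (1280, 720)
UHD_5K = (5000, 5000)  # assumption: "5k x 5k" taken as 5000 x 5000


def pixels_per_frame(width: int, height: int) -> int:
    """Return the number of pixels in one frame."""
    return width * height


def frames_needed(source: tuple, carrier: tuple) -> int:
    """Return how many carrier frames are needed to hold every pixel of
    one source frame, ignoring any metadata or padding overhead."""
    return math.ceil(pixels_per_frame(*source) / pixels_per_frame(*carrier))


if __name__ == "__main__":
    print(pixels_per_frame(*HD_720P))      # pixels in one 720p frame
    print(pixels_per_frame(*UHD_5K))       # pixels in one 5k x 5k frame
    print(frames_needed(UHD_5K, HD_720P))  # 720p frames per UHD frame
```

Under these assumptions, one UHD frame holds roughly 27 times as many pixels as one 720p frame, so at least 28 full 720p frames would be needed to carry it uncompressed.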
UHD sensors are conventionally used with video architectures that are designed particularly for transporting UHD video data. These new video architectures generally leverage video compression techniques to support UHD bandwidth and timing requirements. Some video architectures that are currently used for transporting UHD video data use parallel encoders or codecs and data compression to transport the UHD video. However, the use of compression makes these video architectures unsuitable for end users who rely on receiving raw sensor data.
The use of legacy hardware for transporting UHD video from next generation cameras is problematic because the legacy hardware generally does not provide sufficient bandwidth. Moreover, replacing existing video architectures with new architectures for transporting UHD video data can be impractical and/or prohibitively expensive for users who have already implemented a large amount of conventional video processing equipment.
Various spatial and temporal video compression techniques have been used to process image data from UHD image sensors for transport over existing HD video architectures. The UHD video data is commonly compressed using compression algorithms that retain enough of the UHD video data to generate visible images and video streams for human viewing, but lose or discard data from the UHD image sensors that may not be needed for human viewable images and video streams.
Other conventional techniques for processing data from UHD sensors generally involve the use of new or proprietary video architectures that have been developed for particular applications of the UHD sensors. These techniques are costly and inefficient because they do not take advantage of widely available HD video architectures that have been deployed throughout the world.
Transporting UHD image data on existing equipment generally involves splitting up the image data into multiple packets or sub-frames. Sorting separate video path packets and stitching together panoramic scenes from multiple frames generally adds processing steps that can prevent real-time display of the image data.
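The splitting and reordering burden described above can be illustrated with a naive tiling scheme. This is a minimal sketch, not the method of any particular system: it assumes a frame is a list of pixel rows and that each sub-frame is tagged with the position of its top-left pixel. The names `split_into_tiles` and `reassemble` are hypothetical.

```python
from typing import Dict, List, Tuple

Frame = List[List[int]]          # a frame as rows of pixel values
TileKey = Tuple[int, int]        # (top, left) position of a tile


def split_into_tiles(frame: Frame, tile_w: int, tile_h: int) -> Dict[TileKey, Frame]:
    """Cut a frame into tiles of at most tile_w x tile_h pixels, keyed by
    the (top, left) position of each tile. Edge tiles may be smaller."""
    height, width = len(frame), len(frame[0])
    tiles: Dict[TileKey, Frame] = {}
    for top in range(0, height, tile_h):
        for left in range(0, width, tile_w):
            tiles[(top, left)] = [
                row[left:left + tile_w] for row in frame[top:top + tile_h]
            ]
    return tiles


def reassemble(tiles: Dict[TileKey, Frame], width: int, height: int) -> Frame:
    """Rebuild the full frame from tiles arriving in any order, using
    each tile's position tag to place its rows."""
    frame = [[0] * width for _ in range(height)]
    for (top, left), tile in tiles.items():
        for dy, row in enumerate(tile):
            frame[top + dy][left:left + len(row)] = row
    return frame
```

Even in this simplified form, the receiver must hold every sub-frame of a given image and place each one by its tag before the image can be displayed, which is where the additional processing steps and latency arise.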
Previous systems and methods for stitching panoramic scenes from multiple frames have involved scene registration and image processing to blend the overlapping image data. Other previously known techniques have involved inertial-based and geo-location-based image registration to fuse imagery together, in which one type of image data is fused on top of the other type of image data.
Various other methods for stitching together panoramic scenes from multiple frames have been performed based on feature registration. These methods generally involve substantial post-processing of image data, which increases latency. Feature registration techniques are not suitable for stitching together panoramic images of scenes that are not feature rich. Also, disadvantageously, many of the existing scene-based registration schemes cannot accurately stitch different or multiple spectral bands.
In many UHD imaging applications, it would be desirable to provide raw data from the UHD sensors to analysts or other users. Other consumers of UHD video data require highly accurate and time-aligned symbology overlaid with image data to meet mission requirements. However, inserting symbology into the video stream prior to transport replaces raw image data and destroys certain information that could be useful to an analyst or other consumer of the raw data. Previous systems and methods involve archival tagging and the use of time-aligned metadata, which does not allow presentation of symbology in near real time. Moreover, methods that rely on time-aligned metadata have been problematic due to asynchronous video and data pipelines, for example.