This disclosure relates to a method and apparatus for authenticating video content that may be intentionally altered during transmission in order to accommodate a variety of access devices, network architectures, and communication protocols. In various embodiments, a transmitting node, a receiving node, or both implement such a process. Various embodiments of the process use video fingerprints and cryptographic digital signatures of video content to authenticate the video content by separately verifying the corresponding video fingerprint and video content. The various embodiments of the process for authenticating video content disclosed herein tolerate a predetermined measure of loss in the video content at the receiving node. For example, the methods and apparatus described herein permit wide access to real-time video from mobile and fixed cameras by government safety organizations, military organizations, news organizations, and the general public.
Public surveillance cameras are installed for the purpose of security and safety in such diverse places as roads, city sidewalks, airports, subways, military bases, schools, and stores. As recently as ten years ago, these video feeds were private, viewable only by a single entity such as the police, military, or a private security company. However, it is increasingly common for public surveillance video to be sent in the clear to enable use by multiple security entities (e.g., police, fire, ambulance, homeland security, etc.) and to enable public access for various uses (e.g., for crowd-sourcing the security task, obtaining information on traffic congestion, etc.). In-the-clear video content is left unencrypted to enable open access, or at least wider access than would be practical with encryption. Thus, there is a need for content authentication to defend against malicious attacks that include source data modification and man-in-the-middle modification. For example, an attacker may intercept video streams and remove incriminating evidence by reordering frames or injecting frames from pre-recorded video. Authentication ensures that video content received at a receiver (i.e., recipient) end (e.g., security control station) is the same as the original video content captured at a video camera or supplied by another source at the sender end. For example, this is pertinent to the security of LTE mobile video, which could be used for public safety and first responder communications.
There are a number of solutions to video content authentication. Generally speaking, they can be classified into three categories: 1) symmetric encryption, 2) digital signatures using asymmetric encryption, and 3) watermarking. However, none of these existing solutions is sufficient for today's need to authenticate video content across a wide range of recipients, where a wide range of devices is used on both the source and receiver (i.e., recipient) ends of video communications.
Symmetric encryption is not sufficient because it requires many different security agencies to distribute and share a single decryption key. In security, this is known as the key management problem. Distributing a key to too many parties inevitably reduces system security. More specifically, symmetric encryption includes fully layered encryption and selective or permutation-based encryption. In fully layered encryption, video content is compressed and then encrypted. This approach usually results in heavy computation and slow speed, which makes it unsuitable for real-time video authentication. Selective and permutation-based encryption selectively encrypts bytes or uses permutation to scramble video content. This type of approach is typically designed for specific video formats, such as H.264 or MPEG. For instance, in MPEG, symmetric encryption is used to select and permute bytes based on relationships among I-frames, P-frames, and B-frames. In general, this approach is not format compliant.
Digital signatures that use asymmetric encryption are commonly used cryptographic methods that are very secure for authenticating data. However, because signature verification operates on a cryptographic hash of the data, the received data must be identical to the source data; otherwise it will not authenticate. The problem with video transmission, especially over wireless channels, is that the original content may be altered by noise in the channel or deliberately resized to suit device capabilities (e.g., the smaller screen of a mobile device). Therefore, even though the data may not have been maliciously altered, the received data may not be exactly the same as the original, in which case it will erroneously fail to authenticate (i.e., false rejection).
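The false rejection described above can be illustrated with a minimal sketch. It assumes a hypothetical eight-pixel grayscale frame and uses a SHA-256 digest as a stand-in for the value that a digital signature would cover; the names frame_digest, original, and received are illustrative and do not come from this disclosure.

```python
import hashlib


def frame_digest(frame: bytes) -> str:
    """Return the SHA-256 digest of the raw frame bytes.

    In a signature scheme, a value like this is what would be signed.
    """
    return hashlib.sha256(frame).hexdigest()


# Hypothetical 8-pixel grayscale frame, and a received copy in which a single
# pixel differs by one intensity level (e.g., from noise or re-encoding).
original = bytes([10, 20, 30, 40, 50, 60, 70, 80])
received = bytes([10, 20, 30, 41, 50, 60, 70, 80])

# Exact-match verification: any difference at all changes the digest.
digests_match = frame_digest(original) == frame_digest(received)
changed_pixels = sum(a != b for a, b in zip(original, received))
```

Under these assumptions, a one-level change in one pixel is enough to make the digests differ, so a signature over the original digest would not verify against the received frame even though the change is visually negligible.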
Digital signatures for video content can be obtained by applying Haar wavelet filters, discrete cosine transforms (DCTs), or wavelet transforms to frames and then generating hash values based on the resulting parameters. An example of an off-the-shelf camera that implements cryptographic security is the Cisco Video Surveillance 2500 Series IP Camera from Cisco Systems, Inc. of San Jose, Calif. This camera includes hardware-based encryption using the Advanced Encryption Standard (AES).
A variant of the asymmetric encryption and digital signatures solution is based on a cryptographic checksum, which provides a digitally signed checksum of whole frames, periodic frames, packets, or periodic packets. The cryptographic checksum solution provides modification detection and message integrity checking. It is able to handle the case of video packet loss during transmission. However, in cases where the video is purposefully altered, for example, for size reduction or transcoding for a 4G mobile device or for HTTP adaptive bitrate streaming, the cryptographic checksum will not match the altered video unless the checksum is reapplied at each modifying node. This is possible in a proprietary network. However, it is non-standard and would entail fairly complex, and potentially insecure, key management to distribute and securely maintain the encryption key(s) at all the nodes.
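The behavior of a per-packet cryptographic checksum can be sketched as follows. This is an illustration under stated assumptions, not the implementation of any particular product: a keyed HMAC-SHA-256 tag stands in for a digitally signed checksum, and KEY, tag, verify, and transcode are hypothetical names. The transcode function simply truncates the payload to represent any intermediate node that changes the bytes without being able to recompute the tag.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key, for illustration only


def tag(packet: bytes) -> bytes:
    """Compute a keyed checksum (HMAC-SHA-256) over one packet."""
    return hmac.new(KEY, packet, hashlib.sha256).digest()


def verify(packet: bytes, packet_tag: bytes) -> bool:
    """Check a packet against its tag using a constant-time comparison."""
    return hmac.compare_digest(tag(packet), packet_tag)


def transcode(packet: bytes) -> bytes:
    """Hypothetical stand-in for size reduction: drop the last byte."""
    return packet[:-1]


packets = [b"frame-0", b"frame-1", b"frame-2", b"frame-3"]
sent = [(p, tag(p)) for p in packets]

# Packet loss: one packet never arrives, but the rest still verify
# because each packet carries its own independent tag.
after_loss = [sent[0], sent[1], sent[3]]
loss_ok = all(verify(p, t) for p, t in after_loss)

# Transcoding: payloads change in transit while the original tags are kept.
after_transcode = [(transcode(p), t) for p, t in sent]
transcode_ok = any(verify(p, t) for p, t in after_transcode)
```

In this sketch the surviving packets verify after loss, while none of the transcoded packets do, which is why the checksum would have to be reapplied at each modifying node.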
Digital watermarking embeds information into video frames to verify authenticity. Watermarking techniques exist for both uncompressed and compressed video (e.g., H.264). Watermarking can avoid the problems with symmetric and asymmetric encryption and thus is a valid solution to the current problem. However, watermarking has its own disadvantages. Since a watermark is embedded into the original video, it necessarily alters that video. The tradeoff for watermarks is the imperceptibility of the embedded watermark versus the ability to extract the watermark from the video to perform authentication. In the current problem, it is undesirable to alter the video and desirable to maximize the success of authentication. Under these circumstances, it is undesirable to embed a watermark in the video.
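The watermarking tradeoff can be made concrete with a deliberately simple sketch. It assumes least-significant-bit (LSB) embedding on a hypothetical eight-pixel frame, which is only one of many watermarking techniques and is not attributed to any cited system; embed_lsb, extract_lsb, and requantize are illustrative names, and requantize is a crude stand-in for lossy re-encoding.

```python
def embed_lsb(pixels, bits):
    """Write one watermark bit into the least significant bit of each pixel."""
    return [(p & ~1) | b for p, b in zip(pixels, bits)]


def extract_lsb(pixels):
    """Read the least significant bit of each pixel back out."""
    return [p & 1 for p in pixels]


def requantize(pixels, step=4):
    """Hypothetical lossy re-encoding: round each pixel down to a multiple of step."""
    return [(p // step) * step for p in pixels]


frame = [12, 57, 200, 131, 90, 33, 254, 77]  # hypothetical grayscale pixels
mark = [1, 0, 1, 1, 0, 1, 0, 0]              # hypothetical watermark bits

marked = embed_lsb(frame, mark)

# Embedding is nearly imperceptible (at most one intensity level per pixel),
# but it does change the video content.
max_distortion = max(abs(a - b) for a, b in zip(frame, marked))
recovered_clean = extract_lsb(marked)

# After coarse requantization the low-order bits are wiped out,
# so the watermark can no longer be extracted.
recovered_after_loss = extract_lsb(requantize(marked))
```

Under these assumptions, the least perceptible embedding is also the most fragile one: the watermark is recovered from the untouched frame but is lost once the frame is requantized.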
Based on the foregoing, it is desirable that a process for authenticating video content allow access to a variety of persons using a variety of user devices across a variety of network architectures and communication protocols while also being able to detect when video content is unexpectedly altered, covertly altered, or altered with deceptive intent. In order to permit such wide access, the process must be able to tolerate video content that has been legitimately and expectedly altered during transmission.
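The requirement above, tolerating expected alteration while rejecting deceptive alteration, can be illustrated with a toy comparison. This sketch is an assumption made for illustration and is not the fingerprinting method of this disclosure: the block-mean fingerprint, the Hamming-distance test, the max_distance threshold, and all sample data are hypothetical.

```python
def fingerprint(pixels, blocks=8):
    """Toy fingerprint: one bit per block, set if the block mean exceeds the overall mean."""
    size = len(pixels) // blocks
    means = [sum(pixels[i * size:(i + 1) * size]) / size for i in range(blocks)]
    overall = sum(means) / blocks
    return [1 if m > overall else 0 for m in means]


def hamming(a, b):
    """Count the positions at which two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def is_authentic(sent_fp, received_pixels, max_distance=1):
    """Accept if the received fingerprint is within a preset distance of the sent one."""
    return hamming(sent_fp, fingerprint(received_pixels)) <= max_distance


# Hypothetical 16-pixel frame with alternating dark and bright regions.
original = [10, 12, 200, 198, 15, 11, 190, 205,
            20, 18, 210, 195, 12, 14, 185, 199]
sent_fp = fingerprint(original)

# Expected alteration: coarse requantization, as lossy re-encoding might cause.
degraded = [(p // 4) * 4 for p in original]

# Deceptive alteration: the bright regions in the first half are replaced.
tampered = [10, 12, 11, 13, 15, 11, 12, 14] + original[8:]

degraded_ok = is_authentic(sent_fp, degraded)
tampered_ok = is_authentic(sent_fp, tampered)
```

With this data the requantized frame yields the same fingerprint and is accepted, while the frame with replaced content differs in two fingerprint bits and is rejected. The threshold plays the role of a predetermined tolerance and would need to be chosen for the actual fingerprint in use.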