In some computer vision contexts, a device compresses raw video to generate a compressed video stream, and the video stream is transmitted to another device or to a cloud computing environment, which decompresses the video stream and performs computer vision, such as object detection and/or recognition, on the decompressed video. Due to network bandwidth constraints, the need to reduce aggregate bandwidth from large numbers of video streams, and other concerns, the video is often compressed to a low bitrate video stream. As a result, the decompressed video includes artifacts (e.g., due to quantization of discrete cosine transform (DCT) coefficients) that distort image features. During object detection and/or recognition, the distorted image features may result in lower detection scores even at locations where objects occur due, in part, to the object detection models having been trained using original high quality images. Such distortion may thereby result in false negatives (missed detections) and/or false positives (spurious detections) during implementation.
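As a rough illustration of how quantization of DCT coefficients may distort image features, the following minimal sketch transforms a single block with an orthonormal DCT, quantizes the coefficients with a uniform step, and measures the reconstruction error. It is not the method of any particular codec: the 8x8 block size, the single uniform quantization step, the sample block containing a sharp vertical edge, and all function names (`dct_1d`, `idct_1d`, `apply_2d`, `roundtrip_rmse`) are assumptions chosen for illustration only.

```python
import math

N = 8  # block size (assumption: 8x8 blocks, as in JPEG-style coding)


def dct_1d(v):
    """Orthonormal DCT-II of a length-N sequence."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        acc = sum(v[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                  for n in range(N))
        out.append(scale * acc)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        acc = 0.0
        for k in range(N):
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            acc += scale * c[k] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
        out.append(acc)
    return out


def transpose(m):
    return [list(row) for row in zip(*m)]


def apply_2d(block, transform):
    """Apply a 1-D transform separably: first to rows, then to columns."""
    rows = [transform(r) for r in block]
    cols = [transform(c) for c in transpose(rows)]
    return transpose(cols)


def roundtrip_rmse(block, step):
    """RMS pixel error after DCT, uniform quantization, and inverse DCT."""
    coeffs = apply_2d(block, dct_1d)
    # Uniform scalar quantization: a larger step discards more detail.
    quantized = [[round(c / step) * step for c in row] for row in coeffs]
    recon = apply_2d(quantized, idct_1d)
    sq_err = sum((recon[i][j] - block[i][j]) ** 2
                 for i in range(N) for j in range(N))
    return math.sqrt(sq_err / (N * N))


# Hypothetical block with a sharp vertical edge (dark left half, bright right half).
edge_block = [[16.0] * 4 + [235.0] * 4 for _ in range(N)]

fine_error = roundtrip_rmse(edge_block, step=1.0)     # mild quantization
coarse_error = roundtrip_rmse(edge_block, step=64.0)  # aggressive, low bitrate style
print(fine_error, coarse_error)
```

In this sketch the coarse step yields a larger reconstruction error than the fine step for the edge block, which is the kind of feature distortion that may lower detection scores in a model trained on high quality images.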
It may be advantageous to improve computer vision (e.g., object detection and/or recognition) in the context of low bitrate video streams. It is with respect to these and other considerations that the present improvements have been needed. Such improvements may become critical as the desire to implement computer vision becomes more widespread.