Remote assessment of distressed patients using telemedicine systems is likely to be a critical and pervasive component of future healthcare systems. Depending upon geography and economics, remote assessment is likely to occur over both wired and wireless networks and, thus, over a broad spectrum of transmission bandwidths ranging from very low to very high. Available transmission bandwidth can place a significant constraint on the overall compression ratio of the video, but for diagnostic purposes (i.e., for clinical or functional acceptability) it is essential that the compression process cause no tangible loss of detail and introduce no noticeable artifacts that could lead to misinterpretation. Accordingly, telemedicine or telehealth systems that can cope with such bandwidth limitations while providing visually acceptable video will provide specialists with significantly more information for assessment, diagnosis, and management than will systems without that capability.
Region-of-interest (ROI) based video processing is useful for achieving an optimal balance in the quality-bandwidth trade-off. Conventional ROI coding typically divides compressed video into two regions, the ROI and the background (BKGRND), and assigns a different level of compression to each region. Thus, for a given total bit-rate (TBR), ROI video coding provides higher quality in the ROI but poorer quality in the BKGRND. Further, in some systems, the BKGRND is dropped altogether. An extension of the ROI concept is the extended region-of-interest (EROI), which is an intermediate region between the ROI and the BKGRND. The EROI enhances the elasticity of an ROI-based video coding scheme by allowing for a more perceptually pleasing degradation in quality away from the ROI.
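By way of illustration only, the trade-off described above can be sketched as a weighted split of the TBR among the ROI, EROI, and BKGRND. The function name, the area fractions, and the priority weights below are hypothetical values chosen for the example; they are not taken from any particular coder, and the proportional rule is only one of many possible ways to divide the bit-rate.

```python
def split_bit_rate(total_kbps, area_fraction, priority):
    """Split a total bit-rate among regions in proportion to area * priority.

    Both dictionaries are keyed by region name. The weighting rule is an
    illustrative assumption, not a rule prescribed by the text.
    """
    weights = {name: area_fraction[name] * priority[name] for name in area_fraction}
    weight_sum = sum(weights.values())
    return {name: total_kbps * w / weight_sum for name, w in weights.items()}


# Hypothetical frame layout: fraction of the frame area covered by each region.
area = {"ROI": 0.2, "EROI": 0.2, "BKGRND": 0.6}
# Hypothetical priorities: a higher value means more bits per unit area.
priority = {"ROI": 8.0, "EROI": 3.0, "BKGRND": 1.0}

rates = split_bit_rate(500.0, area, priority)
for name, kbps in rates.items():
    # Bits per unit area fall off from the ROI through the EROI to the BKGRND.
    print(f"{name}: {kbps:.1f} kbps, {kbps / area[name]:.1f} kbps per unit area")
```

With these example weights, the bit-rate per unit area steps down from the ROI to the EROI to the BKGRND, which is the graded degradation that the EROI is intended to provide.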
When the TBR permits, the ROI can be coded with no loss in quality compared to the original. This is called mathematical losslessness (ML). A step down from ML is perceptual losslessness (PL). PL video coding incorporates human visual system (HVS) factors into a distortion measure to determine the optimum value of a coding parameter that satisfies a given threshold. When this threshold equals the just-noticeable-difference (JND), the result is video that is PL encoded with respect to the original. Because there are many approaches to PL video coding, most commercial video coders incorporate different levels of empirical perceptual tuning.
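As a minimal sketch of the threshold idea described above, the following searches a set of candidate parameter values for the coarsest one whose measured distortion does not exceed a threshold such as the JND. The function names are hypothetical, and the distortion function is a toy placeholder standing in for an HVS-based measure; a real coder would substitute its own perceptual model.

```python
def coarsest_step_within_threshold(steps, distortion_of, threshold):
    """Return the largest step whose distortion is <= threshold, or None.

    Every candidate is checked, so the distortion measure does not need to
    be monotonic in the step size.
    """
    best = None
    for step in sorted(steps):
        if distortion_of(step) <= threshold:
            best = step
    return best


def toy_distortion(step):
    # Placeholder only: distortion grows with the square of the step size.
    return step * step


# A hypothetical threshold of 50 (in the toy measure's units) plays the role of the JND.
chosen = coarsest_step_within_threshold(range(1, 33), toy_distortion, 50)
print(chosen)
```

Choosing the coarsest acceptable value spends the fewest bits while keeping the distortion at or below the threshold.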
However, in a medical application, the definition of PL must be different from the HVS-based definition of PL. The latter is based on masking properties of the HVS, which may not be consistent with the details a medical expert wants to see in a video. That is, an HVS definition of PL applied to a video will add distortion to regions that the HVS considers unimportant, but these might be critical diagnostic regions for the medical expert. Hence, PL as referred to herein is a level of video quality that is visually perfect, i.e., free of any artifacts.
Beyond the PL criterion for video coding is diagnostic losslessness (D-losslessness) or functional losslessness (F-losslessness). In medical or other visual applications, if an expert feels confident making an assessment with the coded video, the coding is said to be diagnostically lossless (DL) or functionally lossless (FL). Therefore, a determination of whether video is DL or FL also involves HVS factors, but in a domain context. Beyond DL or FL quality, a quality level at which the video may contain artifacts, provided they are not distracting or annoying, is termed Best Effort (BE). The hierarchy of ML, PL, DL (or, more generally, FL), and BE explained above is general. However, depending on the application, PL may fall anywhere in the hierarchy. In other words, with PL video coding, the system may be under-designed or over-designed with respect to DL or FL.
In general, conventional ROI systems employ arbitrary choices for the ROI and BKGRND compression ratios and thus are not human-centric. Hence, there is no guarantee of video quality, even within the ROI, and there is no opportunity for user interaction or input. Accordingly, based on the above, there presently exists a need in the art for a rate control method in which, unlike conventional methods, bit allocation is shifted from the frame level to individual regions within the frame, with application domain requirements and network requirements being utilized to determine regional bit allocation.
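Purely to illustrate what region-level bit allocation driven by domain and network requirements could look like, the sketch below grants each region a minimum bit-rate in priority order and caps the total at the network budget. All names and numbers are assumptions made for the example: the per-region minima stand in for application domain requirements (for example, the rate needed for DL quality in the ROI), and the budget stands in for the bandwidth the network can carry. It is not the method described or claimed in this document.

```python
def allocate_by_region(budget_kbps, min_kbps, order):
    """Grant each region its minimum in priority order, within the budget.

    Regions later in `order` receive less than their minimum (possibly zero)
    when the budget runs out. Any surplus goes to the first region in `order`.
    """
    allocation = {}
    remaining = budget_kbps
    for name in order:
        grant = min(min_kbps[name], remaining)
        allocation[name] = grant
        remaining -= grant
    allocation[order[0]] += remaining
    return allocation


# Hypothetical minima (kbps) reflecting the quality level wanted in each region.
minimum = {"ROI": 200, "EROI": 80, "BKGRND": 60}
order = ["ROI", "EROI", "BKGRND"]

constrained = allocate_by_region(300, minimum, order)  # budget below the sum of minima
generous = allocate_by_region(400, minimum, order)     # budget above the sum of minima
print(constrained)
print(generous)
```

Under the constrained budget the BKGRND absorbs the shortfall while the ROI keeps its full minimum, whereas a frame-level allocation would have no mechanism for protecting the ROI in this way.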