This description relates to regions of interest in video frames.
The display capabilities (screen size, density of pixels, and color depth, for example) of devices (ranging from big-screen televisions to flip phones) used to present video vary widely. The variation can affect the viewer's ability to read, for example, a ticker of stock prices or sports scores at the bottom of a news video, which may be legible on a television but blurred or too small to read on a cell phone or personal digital assistant (PDA).
As shown in FIG. 1A, video content for television feeds is produced in formats, such as NTSC, PAL, and HD, that are suitable for television displays 102 that are, for example, larger than 15 inches (in the figure, the frame is shown in its native resolution). As shown in FIG. 1B, by contrast, a display screen 104 of a handheld device is often smaller (on the order of 1.5 inches to 3 inches) and has a lower pixel resolution, which makes the video frame, especially the text, less legible.
As shown in FIGS. 2A and 2B, certain kinds of content are especially illegible on small screens, such as information that is supplemental to the main video content, for example: sports statistics that may be inserted, for example, in a floating rectangle 200 that is superimposed or alpha-blended over a video feed of a sports game; fine print 204, e.g., movie credits or a disclaimer at the end of a commercial (as shown in FIG. 2B); a ticker 202 of characters that moves across the screen to display, for example, news or stock prices; or a phone number or URL to contact for more information, for example, in a commercial (not shown).
As shown in FIG. 2C, in the case of broadcasting digital video to handheld devices 212, the frames of the video are encoded and packaged in a video stream 214 by hardware and/or software at a network head-end 216 and delivered over a limited-bandwidth broadcast channel 218 to a population of the handheld devices (such as cell phones, PDAs, wrist watches, portable game consoles, or portable media players).
Handheld and other devices (and the limited-bandwidth channel) impose limitations on the ability of a viewer to perceive content in the video frames as it was originally intended to be perceived.
The small size of the handheld display makes it hard to discern details in the frame simply because it is hard for the eye (particularly of an older or visually impaired person) to resolve detail in small areas. This limitation arises from the smallness of the area being viewed, regardless of the resolution of the display, and would exist even for a high-resolution (or even infinite-resolution) display, as illustrated in FIG. 3A, which shows a frame on a small, high-resolution display 302.
In addition, detail in a frame becomes blurry when the original pixel data of the frame is re-sampled at a lower pixel density for use on the lower-resolution display. As a result of re-sampling, the viewer simply cannot see detail as well on, e.g., a stock ticker that is displayed at a quarter of the resolution of the original frame (half of the resolution in each dimension), as is typical for video broadcast to mobile devices using emerging mobile broadcast technologies like DVB-H and FLO. The blurriness would exist even if the handheld's display were large, because an enlarged low-resolution image 310 still lacks detail, as illustrated in FIG. 3B.
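By way of illustration only, the loss of detail caused by re-sampling can be sketched as follows. The sketch assumes a grayscale frame held as a list of rows, 2x2 block averaging as the re-sampling filter, and pixel replication for enlargement; the function names `downsample_2x` and `upsample_2x` are hypothetical, and an actual encoder or display may use different filters.

```python
def downsample_2x(frame):
    """Halve the resolution in each dimension by averaging each 2x2 block."""
    height, width = len(frame), len(frame[0])
    return [
        [
            (frame[y][x] + frame[y][x + 1]
             + frame[y + 1][x] + frame[y + 1][x + 1]) / 4
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def upsample_2x(frame):
    """Enlarge a frame by replicating each pixel into a 2x2 block."""
    enlarged = []
    for row in frame:
        wide = [pixel for pixel in row for _ in range(2)]
        enlarged.append(wide)
        enlarged.append(list(wide))
    return enlarged


# One-pixel-wide vertical stripes stand in for thin text strokes (assumed data).
original = [[255 if x % 2 == 0 else 0 for x in range(8)] for _ in range(4)]

small = downsample_2x(original)   # a quarter of the original pixel count
restored = upsample_2x(small)     # same size as the original, detail not recovered

print(len(original) * len(original[0]), len(small) * len(small[0]))  # 32 8
print(small[0])  # [127.5, 127.5, 127.5, 127.5]: the stripes average to flat gray
```

In this sketch the enlarged frame has the original dimensions but every pixel is the same gray, which is consistent with the observation that an enlarged low-resolution image still lacks detail.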
Solutions have been proposed for the limitations of small screens (a term that we sometimes use interchangeably with “displays”).
Supplemental (externally-connected) displays, holographic displays, and eye-mounted displays give the effect of a large-screen, high-resolution display in a small, portable package. For example, as shown in FIG. 4A, a virtual display 402, visible to a wearer of glasses 400, appears to be a full-sized display due to its proximity to the wearer's eye (not shown). However, broadcasters that have a limited-capacity network (a total capacity of 6 Mb/s is common, for example) avoid devoting a large amount of bandwidth to providing high-resolution video for the relatively few viewers who own a high-resolution-capable display device. Instead of providing 3 or 4 high-resolution channels, broadcasters prefer to broadcast 20 or 30 lower-resolution channels. A high-resolution display is unable to provide high resolution for the viewer if the frames that are received are of low resolution, and it effectively becomes a low-resolution display.
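The capacity trade-off can be made concrete with simple arithmetic. The following sketch assumes that the 6 Mb/s total capacity is divided equally among channels and ignores protocol overhead and statistical multiplexing; the helper `per_channel_mbps` is a hypothetical name introduced only for this illustration.

```python
TOTAL_CAPACITY_MBPS = 6.0  # example total network capacity


def per_channel_mbps(total_mbps, channels):
    """Bit rate available to each channel under an equal split (assumption)."""
    return total_mbps / channels


# Channel counts taken from the example: 3 or 4 high-resolution channels
# versus 20 or 30 lower-resolution channels.
for channels in (3, 4, 20, 30):
    rate = per_channel_mbps(TOTAL_CAPACITY_MBPS, channels)
    print(f"{channels} channels -> {rate:.2f} Mb/s each")
```

Under these assumptions each of 3 or 4 channels would receive roughly 1.5 to 2 Mb/s, whereas each of 20 or 30 channels would receive roughly 0.2 to 0.3 Mb/s.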
As shown in FIG. 4B, on a handheld that has pan-and-zoom capability, a user may zoom in on a part 422 of a region of interest 420 to produce an enlarged image 424. If region 420, once enlarged, is larger than a viewable area of the display, the user can pan to view the entire region. However, because each frame carries no hidden or embedded latent information, expanding a region 420 of a low-resolution image produces a blurry image (e.g., image 424).
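A minimal sketch of such pan-and-zoom behavior follows, assuming a frame held as a list of rows, enlargement by pixel replication, and a pan offset that is clamped to the edges of the enlarged image. The names `crop`, `enlarge`, and `pan_view` and the sample pixel values are hypothetical and are not taken from any particular handheld.

```python
def crop(frame, left, top, width, height):
    """Return the rectangular part of the frame selected for zooming."""
    return [row[left:left + width] for row in frame[top:top + height]]


def enlarge(frame, factor):
    """Enlarge by replicating each pixel factor x factor times."""
    enlarged = []
    for row in frame:
        wide = [pixel for pixel in row for _ in range(factor)]
        enlarged.extend(list(wide) for _ in range(factor))
    return enlarged


def pan_view(image, view_w, view_h, pan_x, pan_y):
    """Return the viewable window, clamping the pan offset to the image."""
    max_x = max(0, len(image[0]) - view_w)
    max_y = max(0, len(image) - view_h)
    x = min(max(pan_x, 0), max_x)
    y = min(max(pan_y, 0), max_y)
    return crop(image, x, y, view_w, view_h)


# An 8x6 low-resolution frame whose pixel value encodes its position.
frame = [[10 * y + x for x in range(8)] for y in range(6)]

region = crop(frame, 2, 1, 4, 2)        # part of the region of interest
zoomed = enlarge(region, 3)             # 12x6 image, wider than the 8x6 view
view = pan_view(zoomed, 8, 6, 100, 0)   # pan far right; offset is clamped

print(view[0])  # [13, 13, 14, 14, 14, 15, 15, 15]
```

The enlarged image contains only the pixel values already present in the cropped region, so zooming adds size but no new detail.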