Display interfaces for conventional video media (e.g., broadcast television) predominantly rely on “panel-based” overlay technology or picture-in-picture (PiP) technology to allow a viewer of the video media to interact with elements on the display screen. For example, a viewer may press a button on a remote control that causes an overlay panel to be presented on the display screen while the video media continues to play in the background. In this scenario, the overlay panel may present an electronic programming guide (EPG), television settings, or other similar information to the viewer. PiP-based interfaces, by contrast, place the video media in a small viewport that is typically positioned near a periphery of the display and that is overlaid or composited on top of another video feed or another type of user interface. In either scenario, the user interface is modal, meaning that, at any given time, the viewer is in either a video media viewing mode (with the video media presented in full screen) or a non-viewing mode (with the video media presented in the background or in a PiP viewport).