Traditional forms of media content such as books, video, audio and the like provide users with a narrowly tailored experience. Books generally consist of text and images, audio is limited to sounds, and videos are limited to audio-visual experiences. The content and the user's experience lack dimension, in that the various types of content, such as text and video, are not easily combined. To the extent that content can be combined, it is currently not selectively combined and provided to the user in a manner that is specifically tailored to the user (e.g., the content is not combined and played as a function of the user's actual consumption of the content) without active user input (e.g., without mouse clicks or similar mechanical inputs) through personal computing devices that are portable and convenient to use.
The digital age has allowed for text and multi-media content types to be more easily combined. For example, websites often contain text with links to additional text, video or audio content. However, the transition from the first piece of media content to the additional content and then back to the first requires active user interaction such as mouse clicks. While active input does give the user control over how they consume the content, it does not allow for a continuous or uninterrupted flow of multi-media content in a manner that can enhance the user's overall experience.
There are instances in which content types are combined without requiring active user input, such as websites that include text with audio playing in the background or a movie that has visual and audio components. However, in these instances, the user lacks any control over the experience, as the manner in which the various media types are provided to the user is defined entirely by the producer of the content and not activated by the user. For example, when viewing video content, the user has no input regarding the pace of the video, the direction of the story line or what portions of the story line the user wishes to focus on.
The selective combination of content without active user input is not easily achieved and requires assumptions that detract from the user experience. For example, playing audio for a specific portion of a text being displayed and not another portion cannot be accomplished without the producer making assumptions about when the user will be consuming the specific portion of the text and when the user is not. These assumptions do not provide an experience that is specifically tailored to the user and limit the complexity of the combinations of media types.
Systems and methods for passively obtaining user input are well known, including but not limited to eye tracking technology. Eye tracking technology generally falls into three categories. One type uses an attachment to the eye, such as a special contact lens with an embedded mirror or magnetic field sensor. A second type uses electric potentials measured with electrodes placed around the eyes. The third type uses a non-contact, optical method for measuring eye motion. Optical methods, particularly those based on video recording, are widely used for gaze tracking and are favored for being non-invasive and inexpensive. Light, typically infrared, is reflected from the eye and sensed by a video camera or some other specially designed optical sensor. The information is then analyzed to extract eye rotation from changes in reflections. Video-based eye trackers typically use the corneal reflection and the center of the pupil as features to track over time. However, such eye tracking systems have not been adapted to the context of computing devices that an individual uses to consume media content in an everyday setting, such as a personal computer, tablet computer, e-reader, video-game console, television and the like. To the extent that eye tracking systems have been implemented in personal computing devices, they have not been adapted for tracking a user's focus on a screen displaying a first piece of media content and automatically augmenting the content with additional, related media content of any type, thereby providing a seamless, user-controlled, multi-dimensional experience.
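By way of illustration only, the pupil-center and corneal-reflection features mentioned above can be turned into an on-screen gaze estimate roughly as follows. This is a minimal sketch and not a description of any particular eye tracker: the function names, the two-target calibration and the per-axis linear mapping are all assumptions made for the example, and practical systems generally calibrate against more targets with a richer mapping. The sketch works on the vector from the corneal reflection to the pupil center rather than on the pupil position alone.

```python
def pupil_glint_vector(pupil, glint):
    """Vector from the corneal reflection (glint) to the pupil center, in camera pixels."""
    return (pupil[0] - glint[0], pupil[1] - glint[1])


def fit_linear(v_a, s_a, v_b, s_b):
    """Fit s = gain * v + offset through two calibration pairs on one axis."""
    if v_a == v_b:
        raise ValueError("calibration vectors must differ on this axis")
    gain = (s_b - s_a) / (v_b - v_a)
    return gain, s_a - gain * v_a


def calibrate(vec_a, screen_a, vec_b, screen_b):
    """Build a gaze mapper from two calibration targets (hypothetical procedure).

    vec_a / vec_b are pupil-glint vectors recorded while the user looked at
    the known screen positions screen_a / screen_b (e.g., opposite corners).
    """
    gx, ox = fit_linear(vec_a[0], screen_a[0], vec_b[0], screen_b[0])
    gy, oy = fit_linear(vec_a[1], screen_a[1], vec_b[1], screen_b[1])

    def to_screen(pupil, glint):
        vx, vy = pupil_glint_vector(pupil, glint)
        return (gx * vx + ox, gy * vy + oy)

    return to_screen


# Illustrative numbers only: two targets at opposite corners of a 1000x600 area.
to_screen = calibrate((-10.0, -5.0), (0.0, 0.0), (10.0, 5.0), (1000.0, 600.0))
print(to_screen((100.0, 80.0), (100.0, 80.0)))  # pupil centered on the glint
```

In this sketch the calibration step is what ties camera-space measurements to positions on the display; the resulting screen coordinates are the kind of passive input that the remainder of this section contemplates.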
As such, what is desired is a system to selectively augment electronic media content with a variety of additional types of related content in a manner that is specifically tailored to the user consuming the content and to do so in a manner that does not require active user input such as a mouse click. Furthermore, it is desirable to augment and provide augmented media content to users across a variety of personal computing devices including tablet computers or e-readers, smart-phones, video-game consoles, televisions and the like.
For example, in the context of an e-book, it is desirable to have a tablet computer or e-reader that can present a book to a reader and, as the reader advances through the pages, augment the book by playing video and/or audio vignettes that pertain to the section, page, paragraph, lines and/or words being read by the reader. These augmentations can occur in many sections throughout the book and are caused by passive user input, thereby providing an enhanced user experience without disruption of the user's consumption and through a device that is portable and convenient to use.
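The e-book behavior described above can be sketched, purely for illustration, as a dwell-based trigger: each paragraph on the displayed page is associated with a screen region and a related audio or video vignette, and the vignette is started once the reader's estimated gaze has remained within that region for a short time. Everything named here (the `Region` class, the `dwell_triggers` function, the 0.5 second threshold and the media identifiers) is a hypothetical assumption for the example and is not taken from any existing system.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A rectangular screen area (e.g., one paragraph) tied to a media clip."""
    media_id: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def dwell_triggers(samples, regions, dwell_s=0.5):
    """Return media_ids in the order they would be triggered.

    samples: iterable of (timestamp_seconds, x, y) gaze estimates.
    A region fires at most once, after the gaze has stayed inside it
    continuously for at least dwell_s seconds (an assumed threshold).
    """
    fired = []
    current = None      # region the gaze is currently inside, if any
    entered_at = None   # timestamp at which the gaze entered that region
    for t, x, y in samples:
        hit = next((r for r in regions if r.contains(x, y)), None)
        if hit is not current:
            current, entered_at = hit, t
        if (current is not None and current.media_id not in fired
                and t - entered_at >= dwell_s):
            fired.append(current.media_id)
    return fired


# Illustrative page layout: two paragraphs, each with a related vignette.
page = [Region("audio-1", 0, 0, 100, 50), Region("video-2", 0, 60, 100, 110)]
gaze = [(0.0, 10, 10), (0.3, 20, 12), (0.6, 30, 15),
        (0.7, 10, 70), (0.9, 12, 72), (1.3, 15, 75)]
print(dwell_triggers(gaze, page))  # ['audio-1', 'video-2']
```

The dwell threshold is one possible way to distinguish reading from a passing glance; because the trigger depends only on where the reader is looking, no mouse click or other active input is involved.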
As a further example, in the context of a movie being displayed on a tablet computer, computer, video-game console or television, it is desirable to have a system that can present the movie to the viewer and, as the viewer is watching the movie, passively detect the portions of the movie that the viewer is most interested in or focused on and automatically augment the movie by playing audio or video content that pertains to those portions. Such a system would allow the user consuming the content to passively alter the manner in which the content is being delivered, and even to alter the storyline, merely by focusing on one particular portion as opposed to another.
Furthermore, it is also desirable to have a system that is capable of providing media content, passively detecting the portions of the content that the viewer is focused on and automatically augmenting the content with related advertising media without active user input.
One challenge faced by producers of media content is that the content ages and generally becomes less relevant as time moves on and entirely new content is added to the ever-growing library of books, movies, websites, publications, etc. As such, it is desirable to provide a system that allows producers of media content, such as e-books, to keep their content ‘evergreen’ by producing updated versions with new or changed additional content and making this content easily accessible through the internet. Additionally, it is desirable to provide a system that can be utilized to enhance and update existing content. This may apply, for example, to an updated edition of an existing book, or to any version of an existing book that is enhanced with video/audio content. Similarly, it is desirable to provide a system by which numerous individuals can augment existing content with their own integration of additional content or changes to the original content, much like a producer can remake an existing movie to reflect that producer's interpretation of the original work.
It is with respect to these and other considerations that the disclosure made herein is presented.