A mobile computing device may include an image sensor (e.g., a camera), an audio sensor (e.g., a microphone), or both, to capture media data about people, places, and things that a user of the mobile computing device encounters. Some current solutions require the user to manually initiate the capture of media data. The drawback to these solutions is that the user must first decide to capture media data, and may miss media data related to an “event” because the user must react to the event before capture begins. Other solutions continuously capture media data over a certain time period in the hope of capturing media data related to an event. The drawback to these solutions is that they may require an excessive amount of memory if the time period is long, and they also require the user to manually review and segment the large amount of captured media data.