This invention relates to the long-term automated collection of audio and data, data normalization, visualization, and the ability to find audio both visually and via query in a continuous audio stream that may span years.
Handheld and wearable computing devices can collect large amounts of diverse data. Low-power processing, storage, and radio communications technologies now enable long-term wearable electronic devices in medical and other fields, bringing about devices such as wearable fitness trackers. These devices deal with relatively small data sets and therefore have small storage and computing requirements. Existing devices automatically take photos; the linear progression of such devices is a constant photo stream, which becomes video. Compression and storage technologies have not reached the level of efficiency necessary to enable permanent wearable video recorders of acceptable quality and size.
Existing audio recorders can be used to record a day of audio. By manually copying MP3 or similar audio files from these recorders to a PC, one could create a somewhat continuous audio record. The disadvantages are that the user must manage the storage and organization of large MP3 files and has no method of extracting audio segments relevant to a particular query. For example, if one were to record one year of MP3 files using existing recorders, one would have 8,760 hours of audio. Existing technologies do not provide the ability to intelligently manage and mine 8,760 hours of audio for user-relevant content.
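The hour count above follows from simple arithmetic (365 days of 24 hours each). The sketch below reproduces it and, purely as an illustration, estimates the resulting storage volume. The 64 kbit/s bitrate and the function names are assumptions chosen for the example and are not taken from any particular recorder.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # non-leap year, matching the figure in the text


def hours_of_audio(years: int) -> int:
    """Hours of continuous audio recorded over the given number of years."""
    return years * DAYS_PER_YEAR * HOURS_PER_DAY


def storage_bytes(hours: int, bitrate_kbps: int = 64) -> int:
    """Approximate size in bytes of constant-bitrate audio.

    bitrate_kbps is an assumed value (kilobits per second, 1 kbit = 1000 bits).
    """
    seconds = hours * 3600
    return seconds * bitrate_kbps * 1000 // 8


if __name__ == "__main__":
    hours = hours_of_audio(1)
    print(hours, "hours per year")
    print(storage_bytes(hours) / 1e9, "GB at the assumed bitrate")
```

Even at a modest assumed bitrate, the volume illustrates why manual file management does not scale to a multi-year record.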
One key aspect of wearable device power optimization is the ability to run data collection and analysis processes on the computing devices that are power-optimal for the given process. To provide the smallest and longest-lasting wearable device, it is necessary to maximally offload processing to other, more powerful devices. For example, fitness trackers do not implement interfaces for managing data, posting results to social networks, etc. Those power- and size-intensive functions are offloaded to mobile phones or desktop computers.
As technology progresses to allow for collection of a larger variety of data types and larger data volumes, the primary problem becomes one of data mining. Domain-specific, user-friendly data mining tools can provide new and useful possibilities for examining otherwise monolithic and overwhelming data sets and allow the user to turn that data into actionable information.
Some existing systems address audio recording and geotagging, but their implementations share many disadvantages. Perhaps the most obvious and fatal disadvantage is that they provide minimal data mining capabilities. Another is that the user has to make the decision to start recording. A third is the geotagging mechanism they employ: most systems geotag data by (1) adding metadata to the data and (2) thereby creating a data-to-GPS reference during data collection.
U.S. Pat. No. 6,282,362 considers a geographical position/image capturing system. In addition to the aforementioned disadvantages, the system has other significant disadvantages. It does not consider the possibility of spatial movement during an audio recording. There is a single digital recording unit, which records GPS and video/audio. The user “may also record audio segments during the image capture by activating suitable control switches or buttons”, making audio optional, short-term, secondary data. Furthermore, the system cannot record audio without recording images; it thereby treats audio as secondary data, since the audio is a short clip associated with the user taking a new photo. The system has a single playback unit and is a single user/recorder/viewer system. Digital audio data is automatically geotagged during image capture. In general, the system depends on “hyper-media links” in order to show an image or play audio. The system only allows selection of data from a map and is characterized as “A geographical position/image capturing system”. Lastly, the system embeds metadata in an image/audio in order to prove authenticity.
U.S. Pat. No. 8,923,547 states that “Geotagging is the process of adding geographical information to various media in the form of metadata”. In addition to the aforementioned disadvantages, the system has other significant disadvantages. Geotagged metadata is added to MP3 files. The system creates a fixed image (a JPEG/PNG map) on which location information about the audio is drawn over the map. It considers the possibility of spatial movement during an audio recording and supports showing the path for that recording. However, the system creates a fixed rendering of the entire path of the entire audio recording, and there is no mechanism allowing the user to know exactly where one was at a specific time in the recording. This makes the map rendering of a long-term audio recording unusable to the user, since it will include a myriad of locations which can overlap or repeat at different times over a period of hours or days. The system also does not consider recording audio without recording images, or vice versa.
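To make the missing capability concrete, the following is a minimal sketch of what "knowing where one was at a specific time in the recording" could involve, assuming location samples are kept as a time-ordered list rather than drawn into a fixed image. The `GpsFix` layout and the `position_at` function are hypothetical names introduced for illustration; they describe neither of the cited systems nor any specific implementation.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical sample layout: (unix_time_seconds, latitude, longitude)
GpsFix = Tuple[float, float, float]


def position_at(track: List[GpsFix], t: float) -> Optional[Tuple[float, float]]:
    """Return the most recent (lat, lon) recorded at or before time t.

    Assumes `track` is sorted by timestamp. Returns None when t is
    earlier than the first sample.
    """
    times = [fix[0] for fix in track]
    i = bisect_right(times, t)  # index of the first sample strictly after t
    if i == 0:
        return None
    _, lat, lon = track[i - 1]
    return (lat, lon)


if __name__ == "__main__":
    track = [(0.0, 1.0, 2.0), (60.0, 1.5, 2.5), (120.0, 2.0, 3.0)]
    # Location 90 seconds into the recording: falls back to the 60 s sample.
    print(position_at(track, 90.0))
```

Because the lookup is keyed on time, a location that is revisited at different times over hours or days remains distinguishable, which a single static path rendering cannot offer.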