As society becomes increasingly information-dependent, it is becoming imperative, for both business and personal reasons, to provide and receive information immediately and conveniently, where “convenience” is measured by the various media by which the information may be shared. A prominent implementation of such immediate and convenient information dissemination is electronic mail (hereafter referred to as “e-mail”), which is the transmission of memos and messages, including text memos and messages, over a network. E-mail may be sent to a single recipient or broadcast to multiple users. E-mail messages may be sent to a simulated mailbox in the network mail server or host computer, where they are held until the individual messages are interrogated and deleted. Further, text e-mail memos and messages may include file attachments containing additional text, audio, video, programs, spreadsheets, graphics, etc.
Even though e-mail messaging has become a tremendously popular and common method of communication, e-mail usage still has not surpassed telephone usage, particularly in view of the surging popularity and sophistication of mobile telephone systems. However, telephone voice-messaging systems, associated with both public switched telephone network (PSTN)-based and wireless telephone systems, do not have the ability to provide users with voice-message attachments, which may include, but are not limited to, further audio messages, video, text, programs, spreadsheets, graphics, etc.
Processing of potential attachments to either e-mail or voice-mail has become more sophisticated, including conversion of, for example, video to text and of graphic attachments to their basic components. In the art of video, and in particular video transmission of movies, it is known from U.S. Pat. No. 5,677,739 to provide an audible description of video scenes and action for the sight-impaired. This video-to-audio conversion capability would be especially useful in a voice-mail environment. Moreover, research has been performed at AT&T in the art of breaking displayable objects into their components for representation or audio description. For example, a logo used in a letterhead may be described audibly to a voice-mail user in addition to the letter being read via known text-to-speech conversion. Consequently, there is both a need and an opportunity to provide more meaningful attachments of various media to voice messages.