In recent years a new way of presenting information has become established: the multimedia approach, in which information is presented by combining several media, e.g. written text, audio, and video. However, when using the audio data, finding and addressing specific structures (pages, chapters, etc., corresponding to the equivalent textual representation of the audio data) is either time consuming, complex, or impossible. A solution to these problems is to link text and audio. The concept of linking text and audio is already used by some information providers, but it is not widespread. One reason for this is that building the hyperlinks between the audio data and the corresponding textual representation is a resource-consuming process. This means either a large investment on the producer's side or a limited number of links, which reduces the value for the user. As a result of this limited state of the art, user queries directed at databases containing multimedia material must in most cases be quite general. For example, a user asks: "In which document do the words 'Italian' and 'inflation' occur?" The response to such a query is the complete audio document in which the requested data is enclosed.
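To make the contrast concrete, the following is a minimal sketch of what word-level text-audio links enable. All names here (`DOCS`, `build_index`, `query`) and the sample transcripts are hypothetical illustrations, not part of any described system; the sketch assumes each word of a document's textual representation has been aligned to a start time in the audio, so a query can return time offsets into the audio rather than whole documents.

```python
from collections import defaultdict

# Hypothetical word-level alignment: each document's transcript is a list of
# (word, start_time_in_seconds) pairs, as an aligner or recognizer might emit.
DOCS = {
    "news_01": [("italian", 12.4), ("inflation", 13.1), ("rises", 13.9)],
    "news_02": [("weather", 0.8), ("inflation", 45.2)],
}

def build_index(docs):
    """Map each word to the (doc_id, start_time) positions where it occurs."""
    index = defaultdict(list)
    for doc_id, transcript in docs.items():
        for word, start in transcript:
            index[word].append((doc_id, start))
    return index

def query(index, words):
    """For every document containing ALL query words, return the audio time
    offsets of each word -- instead of returning the whole audio document."""
    per_word_docs = [{d for d, _ in index.get(w, [])} for w in words]
    hits = set.intersection(*per_word_docs) if per_word_docs else set()
    return {
        doc: {w: [t for d, t in index[w] if d == doc] for w in words}
        for doc in hits
    }

index = build_index(DOCS)
print(query(index, ["italian", "inflation"]))
```

With such links in place, the example query above could answer with the offsets 12.4 s and 13.1 s inside `news_01`, so the user can jump directly to the relevant passage instead of receiving the entire recording.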