1. Field of Invention
This invention is generally related to compression and generation of sound, graphics, lighting, camera angles, etc. This invention is more particularly related to the generation of sound based on a starting sound clip having basic sound parameters which are genetically altered to create additional similar, but varied, sounds. The invention is also related to the identification and organization of sound parameters allowing sound clips to be created via genetic programming, and to the provision of sound in entertainment and low bandwidth applications.
2. Discussion of Background
The use of sound enhances audio and visual experiences. Sound is recorded at the source, or generated by musicians and sound effects equipment. Algorithmic devices have also been utilized as compositional tools. However, algorithmically produced music is highly criticized for its lack of variation, making an environment that is detected by listeners as repeated, stoic, or simply unnatural.
Application of genetic algorithms (also referred to as evolutionary algorithms or evolutionary computation) to solve problems related to natural evolution or find a solution from a limited set of parameters has been of great interest in recent years, and has been successfully applied to numerous problems from different domains, including optimization, economics, ecology, and population genetics, for example.
A genetic algorithm is an iterative procedure that operates on a population of individuals, each one represented by a string of symbols or genetic codes that are matched (based on a fitness criterion), and mixed and mutated (based on probabilities), to converge to a solution (the most fit individual or population). However, application of genetic algorithms as a compositional tool has not yet been proven or successfully applied.
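The iterative procedure described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the bit-string encoding, the `TARGET` string, the fitness measure, and all function names are hypothetical and are not taken from the invention.

```python
import random

TARGET = "1111111111"  # hypothetical "ideal" genetic code, for illustration only

def fitness(individual: str) -> int:
    """Fitness criterion: count of positions that match the target string."""
    return sum(a == b for a, b in zip(individual, TARGET))

def crossover(a: str, b: str, rng: random.Random) -> str:
    """Mix two parents by splicing them at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(individual: str, rate: float, rng: random.Random) -> str:
    """Flip each symbol independently with probability `rate`."""
    return "".join(
        ("1" if ch == "0" else "0") if rng.random() < rate else ch
        for ch in individual
    )

def evolve(generations: int = 50, size: int = 20, seed: int = 0) -> str:
    """Run the select / mix / mutate loop and return the fittest individual."""
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in TARGET) for _ in range(size)]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and acts as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]
        # Mixing and mutation: refill the population with altered offspring.
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), 0.05, rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(best, fitness(best))
```

Because the fitter half is carried over unchanged in each generation, the best fitness in the population never decreases from one generation to the next.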
VRML and many other methods and protocols (e.g., ftp, SGML, etc.) have been utilized for transmitting text, graphics, 3D environments, music, sounds, etc., to desktop PCs connected over the Internet, an Intranet, or other network devices.
The Virtual Reality Modeling Language (VRML) is a file format for describing interactive 3D objects and worlds. VRML is designed to be used on the Internet, an Intranet, and other systems. VRML is also intended to be a universal interchange format for integrated 3D graphics and multimedia. VRML may be used in a variety of application areas such as engineering and scientific visualization, multimedia presentations, entertainment and educational titles, web pages, and shared virtual worlds.
VRML is capable of representing static and animated dynamic 3D and multimedia objects with hyperlinks to other media such as text, sounds, movies, and images. VRML browsers, as well as authoring tools for the creation of VRML files, are widely available for many different platforms.
Unfortunately, a two-minute sound file is relatively large and takes considerable bandwidth for downloading. Even with increased bandwidths expected with new communication devices, the amount of data contained in sound and video files, particularly when viewed as a percentage of all communications, is enormous.
For example, a single two-minute 16-bit 22 kHz mono file (one-quarter the size of CD-quality audio, which is 16-bit 44.1 kHz stereo) weighs in at about 5 megabytes uncompressed. Though many compression algorithms for audio exist, the standards are not always cross-platform, and the compression is often either not at a very high ratio (4:1 is common) or leaves the audio with undesirable, audible artifacts.
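The file size quoted above follows from simple arithmetic: duration times sample rate times bytes per sample times channel count. The short Python check below assumes that "22 kHz" means the conventional 22,050 Hz rate (half the CD rate); the function name is illustrative only.

```python
def pcm_size_bytes(seconds: float, sample_rate_hz: int, bits: int, channels: int) -> int:
    """Size of uncompressed PCM audio in bytes."""
    return int(seconds * sample_rate_hz * (bits // 8) * channels)

mono_22k = pcm_size_bytes(120, 22_050, 16, 1)   # two minutes, 16-bit, mono
cd_stereo = pcm_size_bytes(120, 44_100, 16, 2)  # two minutes, CD quality

print(mono_22k)               # 5292000 bytes, roughly 5 megabytes
print(cd_stereo)              # 21168000 bytes
print(cd_stereo // mono_22k)  # 4, so the mono file is one-quarter the CD size
print(mono_22k // 4)          # 1323000 bytes remain even after 4:1 compression
```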
These artifacts cause further trouble if the sound file is then used in any but the simplest fashion. For example, a simple pitch shift (one of the common VRML audio behaviors) will often expose compression-induced artifacts.
One of the most common problems with Web sound is that it often uses the built-in (low-quality) MIDI sounds on a PC's internal sound card. However, the use of full sound files for applications with bandwidth constraints is prohibitive. In most applications, algorithmically produced sound has not been acceptable for the above stated reasons. Streaming protocols, though showing improvement, do not easily allow interactive audio behaviors, and, as with other of the above stated methods, are of generally lower quality.
The present inventors have realized that genetic or shaped algorithms may be applied to generate auditory and other dynamic behaviors (e.g., lighting or camera viewpoints) within various environments (e.g., Internet, 3D graphical, VRML). For audio, small high quality sound files are recombined and their behavior altered to create a rich non-looping sound environment. Sonic parameters such as pitch, start time, intensity, and apparent spatial location are given initial values which are then changed. Realtime changes in the parameters to create new values may be implemented by a program developed in Java or another programming language.
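As a rough illustration of sonic parameters that start from initial values and are then changed, consider the Python sketch below. The field names, value ranges, and the `vary` function are assumptions made for this example and do not come from the invention; the variation rule is a plain random nudge, chosen only for brevity.

```python
import random
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SonicParams:
    pitch: float       # playback-rate multiplier, 1.0 = original pitch
    start_time: float  # seconds to wait before the clip starts
    intensity: float   # gain, 0.0 (silent) to 1.0 (full)
    azimuth: float     # apparent direction in degrees, -180 to 180

def clamp(value: float, low: float, high: float) -> float:
    """Keep a value inside the closed range [low, high]."""
    return max(low, min(high, value))

def vary(p: SonicParams, rng: random.Random, amount: float = 0.1) -> SonicParams:
    """Return a new parameter set nudged randomly away from `p`."""
    return replace(
        p,
        pitch=clamp(p.pitch * (1 + rng.uniform(-amount, amount)), 0.5, 2.0),
        start_time=max(0.0, p.start_time + rng.uniform(-amount, amount)),
        intensity=clamp(p.intensity + rng.uniform(-amount, amount), 0.0, 1.0),
        azimuth=clamp(p.azimuth + rng.uniform(-amount, amount) * 180, -180.0, 180.0),
    )

rng = random.Random(1)
initial = SonicParams(pitch=1.0, start_time=0.5, intensity=0.8, azimuth=0.0)
print(vary(initial, rng))
```

Each call yields a related but different parameter set for the same small sound file, so repeated playback need not sound identical.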
In one embodiment, natural genetic type systems are utilized as a model for recombining the sound files to produce related but non-repeating sounds. Sound parameters (also referred to as sonic DNA, audio objects, or genes) are combined and/or altered via rules to produce resultant audio objects. The resultant audio objects and behaviors can be used in 3D graphical environments, e.g., to produce a long-running non-looping thunderstorm in a VRML world, or in real environments such as live theatrical productions or the workplace. These audio objects and behaviors can alter themselves in response to changes in these environments, whether systemic or user-initiated. For example, a genetically produced sound environment may reflect the health of computer printers in a workplace; if one printer is down, or has an overlong queue, its signature sound can reflect that information within the sound environment by changing its relative volume, rhythm, or pitch.
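One simple way to combine two sets of sound parameters is to build a child that inherits each gene from one parent or the other. The Python sketch below shows that idea only; the gene names, the values, and the file name `thunder.wav` are hypothetical, and the combination rule is an assumption rather than the rule used by the invention.

```python
import random

def combine(parent_a: dict, parent_b: dict, rng: random.Random) -> dict:
    """Build a child by taking each gene from one parent or the other."""
    return {gene: rng.choice((parent_a[gene], parent_b[gene])) for gene in parent_a}

# Two hypothetical parent audio objects for a thunderstorm environment.
thunder_near = {"clip": "thunder.wav", "pitch": 0.9, "intensity": 1.0, "delay_s": 0.2}
thunder_far = {"clip": "thunder.wav", "pitch": 0.7, "intensity": 0.4, "delay_s": 3.0}

rng = random.Random(7)
child = combine(thunder_near, thunder_far, rng)
print(child)
```

The child shares its sound clip with both parents but may pair, say, the near strike's pitch with the far strike's delay, giving a related sound that neither parent produces on its own.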
The algorithms that create sound may also drive other dynamic processes such as generation of new geometry, lighting, viewpoint, or animation, in conjunction with the sonic behavior. The present invention may also be applied in awareness or entertainment systems in real environments, or for any system where information is delivered sonically over a long period of time.
The present invention is particularly useful in Web-based environments like VRML or Java3D, as it can eliminate the necessity of downloading large sound files. The present invention also produces non-looping sound environments in awareness systems and entertainment systems, interactive encyclopedias, books, games, etc. An auditory awareness system is one which provides to the user/listener certain data within a sonic environment. It serves rather like a set of alarm bells, except that it is designed to be listened to over long periods of time (weeks, years) and it carries multiple layers of data. One good example of a personal auditory awareness system is Audio Aura (Mynatt and Back, 1977).
The present invention includes a device comprising: at least one sound node having data stored therein defining at least a single sound; an alteration mechanism configured to at least one of alter any of the existing sound nodes and produce new sound nodes based on the existing sound nodes; a scripting mechanism configured to direct at least one of alteration and playing of said sound nodes; and a playing mechanism configured to play sounds defined by said sound nodes. The invention includes a method comprising the steps of: receiving a node having a set of parameters; outputting (A) information based on said set of parameters as a current output; altering (B) the parameters of the current output to produce a new related set of parameters; outputting (C) information based on the new related set of parameters as the current output; and repeating said steps of altering (B) and outputting (C). The invention may also be applied in a method of supplying content to a client device, comprising the steps of: sending a first set of parameters to a client device; and sending a genetic algorithm to the client device, the genetic algorithm being configured to create related parameter sets by at least one of altering and combining said first set of parameters.
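The method steps of outputting (A), altering (B), outputting (C), and repeating can be pictured as a generator loop. The Python sketch below is a hypothetical illustration: a node is modeled as a dictionary of numeric parameters, "outputting" is modeled as yielding the parameter set, and the `jitter` alteration rule is an assumption made for the example.

```python
import itertools
import random
from typing import Callable, Dict, Iterator

Params = Dict[str, float]

def evolving_output(node: Params, alter: Callable[[Params], Params]) -> Iterator[Params]:
    """Yield the received parameters, then an endless series of related sets."""
    current = dict(node)
    yield current                  # step (A): output based on the received parameters
    while True:
        current = alter(current)   # step (B): derive a new, related parameter set
        yield current              # step (C): output it as the current output

rng = random.Random(3)

def jitter(p: Params) -> Params:
    """Hypothetical alteration rule: scale every parameter by up to 5 percent."""
    return {name: value * (1 + rng.uniform(-0.05, 0.05)) for name, value in p.items()}

node = {"pitch": 1.0, "intensity": 0.8}
for params in itertools.islice(evolving_output(node, jitter), 4):
    print(params)
```

In this picture a client needs only the initial parameter set and the alteration rule, both of which are small compared with a full sound file.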