The present disclosure relates to editing digital audio data.
Digital audio data can include audio data in different digital audio tracks. Tracks are typically distinct audio files. Tracks can be generated mechanically (e.g., using a distinct microphone as an input source for each track), synthesized (e.g., using a digital synthesizer), or generated as a combination of any number of individual tracks. The audio data can represent, for example, voices (e.g., conversations between people) and other sounds (e.g., noise or music). For example, a particular track can include a foreground conversation between people and a background component that includes sounds occurring naturally in the environment where the people are speaking. The background component can also include sounds added to the environment to provide a specific effect (e.g., sound effects or music).
A track includes one or more channels (e.g., a stereo track can include two channels, left and right). A channel is a stream of audio samples. For example, a channel can be generated by converting an analog input from a microphone into digital samples using an analog-to-digital converter.
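The relationship between tracks, channels, and samples can be illustrated with a minimal Python sketch. It is not part of the disclosure: the helper name `sample_channel`, the 8000 Hz sample rate, the 16-bit depth, and the dictionary layout of the track are all assumptions chosen for illustration, and a sine tone stands in for the analog microphone input.

```python
import math

def sample_channel(freq_hz, duration_s, sample_rate=8000, bit_depth=16):
    """Hypothetical helper: sample a sine tone and quantize it to signed integers.

    This mimics what an analog-to-digital converter does to an analog input:
    it reads the signal at regular intervals and rounds each reading to an
    integer code.
    """
    max_code = 2 ** (bit_depth - 1) - 1  # largest positive code, 32767 at 16 bits
    n_samples = round(duration_s * sample_rate)
    return [
        round(max_code * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_samples)
    ]

# A stereo track modeled (as an assumption) as two named channels,
# each a stream of samples of equal length.
track = {
    "left": sample_channel(440.0, 0.01),
    "right": sample_channel(880.0, 0.01),
}
```

Here each channel holds 80 integer samples (0.01 seconds at 8000 samples per second), and the track simply groups the two channels.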
The audio data for a track can be displayed in various visual representations. For example, an amplitude display shows a representation of audio intensity in the time domain (e.g., a graphical display with time on the x-axis and amplitude on the y-axis). Similarly, a frequency spectrogram shows how the frequency content of the audio data varies over time (e.g., a graphical display with time on the x-axis and frequency on the y-axis). Tracks can be played and analyzed alone or in combination with other tracks. Additionally, the audio data of one or more tracks can be edited. For example, the digital audio data can be adjusted by a user to increase the amplitude of the audio data for a particular track across time (e.g., by increasing the overall intensity of the audio data). In another example, the amplitude of the audio data can be adjusted over a specified frequency range, which is typically referred to as equalization.
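The two edits described above, an overall amplitude change and an amplitude change limited to a frequency range, can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the disclosure: the function names `apply_gain` and `equalize` are hypothetical, and the equalizer uses a naive discrete Fourier transform over the whole signal, which is practical only for short examples.

```python
import cmath
import math

def apply_gain(samples, gain):
    """Scale every sample, changing the overall amplitude across time."""
    return [s * gain for s in samples]

def dft(samples):
    """Naive discrete Fourier transform (O(n^2), for illustration only)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(bins):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(bins)
    return [
        (sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def equalize(samples, sample_rate, low_hz, high_hz, gain):
    """Scale only the frequency content between low_hz and high_hz."""
    n = len(samples)
    bins = dft(samples)
    for k in range(n):
        # Bin k and its mirror bin n - k carry the same frequency,
        # so both are scaled to keep the output real-valued.
        freq = min(k, n - k) * sample_rate / n
        if low_hz <= freq <= high_hz:
            bins[k] *= gain
    return idft(bins)

# Example: a 4 Hz tone mixed with a 16 Hz tone, sampled at 64 Hz for one second.
rate = 64
low_tone = [math.sin(2 * math.pi * 4 * t / rate) for t in range(rate)]
high_tone = [math.sin(2 * math.pi * 16 * t / rate) for t in range(rate)]
mixed = [a + b for a, b in zip(low_tone, high_tone)]

louder = apply_gain(mixed, 2.0)                    # doubles the amplitude everywhere
filtered = equalize(mixed, rate, 10.0, 20.0, 0.0)  # removes the 10-20 Hz range
```

In this example the gain of 0.0 over 10 to 20 Hz removes the 16 Hz tone, so `filtered` matches the 4 Hz tone up to floating-point error. A gain above 1.0 would boost that range instead.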