The qualities we have described so far refer to the frequency content of sound, which is certainly of great importance, but there is more to a complex sound than its frequencies. The way a sound starts and finishes, and how its volume varies, also help to determine what kind of sound we are listening to. Here is an example: these three sounds have the very same frequency content, but we can't say they are identical, can we?
Obviously, the second tone has a much smoother start and end than the first one, and the third one, even if it is smoother than the first, is not as smooth as the second. Again, the frequency content is the same, so what has changed? In one word: the envelope.
The envelope of a sound can be defined as the sound's volume variations in time. There are four stages that can be identified in the envelope of a sound: attack, decay, sustain, release. Let's define these stages and then we'll see how they work:
Attack: the very beginning of a sound, that is, how quickly the sound gets from silence to its loudest peak
Decay: how quickly the volume of the sound is reduced after its loudest peak
Sustain: how long the sound's volume is maintained after the decay
Release: how quickly the sound's volume gets back to zero
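To make the four stages concrete, here is a minimal sketch of a piecewise-linear envelope generator. The function name `adsr_envelope`, its parameters, the default `sustain_level` of 0.7 and the straight-line ramps are all assumptions made for illustration; real instruments and synthesisers often use curved segments.

```python
def adsr_envelope(attack, decay, sustain_time, release,
                  sustain_level=0.7, sample_rate=1000):
    """Return a list of gain values (0.0 to 1.0) tracing an ADSR envelope.

    All durations are in seconds. sustain_level is the gain held during
    the sustain stage (a hypothetical default, chosen for illustration).
    """
    def ramp(start, end, seconds):
        # Straight line from start towards end, excluding the end point.
        n = int(round(seconds * sample_rate))
        return [start + (end - start) * i / n for i in range(n)]

    envelope = []
    envelope += ramp(0.0, 1.0, attack)                            # silence up to the peak
    envelope += ramp(1.0, sustain_level, decay)                   # peak down to the held level
    envelope += ramp(sustain_level, sustain_level, sustain_time)  # held level
    envelope += ramp(sustain_level, 0.0, release)                 # held level back to silence
    envelope.append(0.0)                                          # end exactly at zero
    return envelope


# Example: 10 ms attack, 50 ms decay, 200 ms sustain, 100 ms release.
env = adsr_envelope(attack=0.01, decay=0.05, sustain_time=0.2, release=0.1)
```

Multiplying each sample of a tone by the matching value of `env` would shape its volume over time without changing its frequency content.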
For example: a pipe organ has
a fast attack (pressing a key will activate an air flux through a pipe very quickly)
a very slow decay, if any (as long as the key is held down the volume tends to stay constant)
a long sustain (the sound's volume does not decrease until the key is released)
a fast release (as soon as the key is released the sound ceases)
On the other hand a guitar has:
A fast attack (as soon as we pluck the string we hear the loudest peak)
A fast decay (the volume stabilises quickly to a lower level after the first peak)
A shorter sustain (once the string is plucked the volume decreases progressively)
A long release (which is almost indistinguishable from the sustain)
A violin has:
A slow attack (the bow makes the note start very smoothly)
a very slow decay, if any (once the bow is producing the note there is almost no volume variation)
a long sustain (the volume keeps constant as long as the bow is producing the note)
a fast release (as soon as the bow is detached from the string the sound ceases)
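The three descriptions above can be sketched as lists of (time, level) breakpoints joined by straight lines. Every number below is made up to match the verbal descriptions and is not measured from real instruments; `PRESETS` and `level_at` are hypothetical names.

```python
# Illustrative (time in seconds, level 0..1) breakpoints.
# The values are invented to mirror the descriptions, not measured.
PRESETS = {
    # fast attack, no real decay, long sustain, fast release
    "organ": [(0.00, 0.0), (0.02, 1.0), (1.00, 1.0), (1.05, 0.0)],
    # fast attack, fast decay, level that keeps falling, long release
    "guitar": [(0.00, 0.0), (0.01, 1.0), (0.10, 0.6), (0.60, 0.3), (2.00, 0.0)],
    # slow attack, no real decay, long sustain, fast release
    "violin": [(0.00, 0.0), (0.30, 1.0), (1.00, 1.0), (1.05, 0.0)],
}


def level_at(breakpoints, t):
    """Linearly interpolate the envelope level at time t (in seconds)."""
    if t <= breakpoints[0][0]:
        return breakpoints[0][1]
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return breakpoints[-1][1]  # past the last breakpoint: silence


# Print each instrument's level at a few moments to compare the shapes.
for name, points in PRESETS.items():
    print(name, [round(level_at(points, t), 2) for t in (0.01, 0.1, 0.5, 1.5)])
```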
These can be visualised as follows:
So this would be the envelope of our first tone
This would be the second
And this the third
This is an organ's note waveform and its envelope
This is a guitar's note waveform and its envelope
How does the envelope relate to our perception of the volume? You probably already have an idea at this stage, having listened to these examples, but let's summarise it:
The attack time does not generally relate to how loud the sound is perceived: it contributes more to the 'character' or 'style' with which a sound is played.
The following sounds have very different attacks, but their volume is almost the same.
What makes us hear a sound as 'loud' is more the sustain stage, which is when the volume stabilises to a certain level.
Hence, we can also conclude that:
Brief, momentary peaks of high amplitude (also called 'transients') are not generally perceived as 'loud' but rather as 'punchy', 'vigorous' or 'vivid'.
Longer periods of constant high amplitude levels are on the other hand perceived as loud.
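One rough way to see this difference in numbers is to compare the peak level of a signal with its RMS (root mean square) level, which averages the signal's energy over time. Treating RMS as a stand-in for perceived loudness is a simplifying assumption, and the sample rate, frequency and burst length below are arbitrary illustrative values.

```python
import math


def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def rms(samples):
    """Root mean square level: square root of the average squared sample."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


SAMPLE_RATE = 8000   # samples per second (illustrative)
FREQ = 440.0         # tone frequency in Hz (illustrative)

# One second of a full-amplitude sine tone: a sustained high level.
sustained = [math.sin(2 * math.pi * FREQ * i / SAMPLE_RATE)
             for i in range(SAMPLE_RATE)]

# The same tone, but only the first 10 ms is kept: a brief transient.
burst = int(0.01 * SAMPLE_RATE)
transient = [s if i < burst else 0.0 for i, s in enumerate(sustained)]

print("sustained: peak", round(peak(sustained), 3), "rms", round(rms(sustained), 3))
print("transient: peak", round(peak(transient), 3), "rms", round(rms(transient), 3))
```

Both signals reach almost the same peak, but the RMS level of the transient is far lower, which fits the idea that a short peak adds punch rather than loudness.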
The difference between the loudest and the quietest level of a sound is called its dynamic range. This is a very important quality: as we will see, the dynamic range of individual sounds and of the whole mix is something we should really care about while mixing.
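As a small worked sketch, dynamic range is often expressed in decibels as the ratio between the loudest and the quietest level. The function name `dynamic_range_db`, the `floor` parameter and the input values below are assumptions for illustration; the input is taken to be a list of linear amplitude levels, for example the peak of each note in a passage.

```python
import math


def dynamic_range_db(levels, floor=1e-6):
    """Ratio of the loudest to the quietest level, in decibels.

    levels: linear amplitude levels (hypothetical input, e.g. one per note).
    floor:  smallest level considered, so that silence does not cause a
            division by zero (an assumption of this sketch).
    """
    loudest = max(max(levels), floor)
    quietest = max(min(levels), floor)
    return 20 * math.log10(loudest / quietest)


# Made-up levels: the loudest note is ten times the amplitude of the quietest.
notes = [1.0, 0.5, 0.1]
print(round(dynamic_range_db(notes), 1), "dB")
```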