Disclaimer: I'm by no means a professional mixing/mastering engineer (in fact, quite the opposite ;) ). I do like to research stuff, and will occasionally report on what I find out here. The explanations may contain factual errors, in which case I'd appreciate it if you took the time to report errors or inaccuracies. Given the huge field that audio mixing really is, this post will necessarily skip a lot of details.
Why did you write this post?
I hear many problems in my own recordings, and have decided I should educate myself a bit on the subject.
This is mainly written as a summary of things I read and remembered.
Mixing? Mastering?
First things first: what's the difference between mixing and mastering?
In mixing one tries to bring together different tracks into one recording. In doing so, one can apply a whole range of effects to each track separately before combining them into a complete song. I intend to discuss some of those effects in this post.
In mastering one takes a mixed song and applies effects to the whole mix at once. This would be done e.g. to make all songs on an album share a similar sound and feel.
A Photoshop (or GIMP ;) ) analogy would be that in mixing you combine clipart into a picture, and in mastering you apply effects to the picture as a whole (cropping, changing the colors to sepia, ...).
My current investigation is mostly about mixing audio.
Mixing objectives
In mixing audio one strives to create interest, mood, balance and definition.
These four dimensions can be heavily influenced by cleverly applying effects to each of the tracks.
- Interest: is all about keeping the listener's attention by adding enough variation in the mix. Example: make the chorus sound subtly different from the verse, or make verses with dark lyrics sound darker than verses with lighter lyrics. Variation keeps the interest higher. You can also decide where to put the instruments in an (imaginary) 3D audio scene. If you pay attention to recordings, you can start to hear how certain instruments seem to sound as if they are placed at different spots on a sound stage.
- Mood: how do you make the same music sound darker or lighter? More mellow or more aggressive?
- Balance: make sure each instrument gets the space it needs and make sure that the instruments don't sound like a bunch of aliens that happen to play simultaneously. The different instruments should sound as if they belong together, and none of the instruments should overpower all of the other instruments.
- Definition: make sure enough details can be heard in each of the instruments and voices, while at the same time getting rid of unwanted details like breathing.
Mixing techniques
In order to achieve those objectives, we can apply effects to the individual tracks that must be mixed together.
Some effects operate in the time domain (i.e. they influence how volume changes over time, or how long certain sounds remain audible), whereas other effects mostly influence the frequency domain (they change how a specific sound sounds, e.g. make vocals sound clearer or darker).
Phase
In itself, the phase of a single track is pretty meaningless. Sound is made of waves, and phase says something about at which moment in time the wave reaches its peaks (low and high), and passes through zero. Phase starts to matter when you combine two or more tracks. A phase difference between two tracks is a small delay between the two tracks. Funny things can happen when you mix sounds with different phase. When you mix two tracks, you basically sum them. If you take one track, make a copy of it and apply phase inversion to the copy, then mix both tracks together, you end up with no sound at all: the phase inversion causes both tracks to cancel each other out. (Indeed, when the original reaches its peak, the phase-inverted copy reaches its valley and vice versa. At each moment in time the waves are each other's complement.) Of course this is an extreme example that you would never encounter in practice. But if you record the same source simultaneously with microphones at different distances, a phase difference could be present, and if it is not compensated before mixing, it might lead to unwanted (partial) cancellations of sound. Applying certain effects will also affect the phase of the track. Mixing together tracks with phase differences will result in something that sounds a bit different (usually worse) than what you expected, typically a more metallic, hollow sound. Sometimes this effect is applied on purpose: then it's called flanging. There's another related effect called phasing. Both effects suppress certain frequency components (a phenomenon called comb-filtering takes place). In flanging the frequencies that are suppressed are harmonically related, in phasing they are not. Phasing is usually a little bit more subtle than flanging.
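To make the cancellation story concrete, here is a tiny numpy sketch (the sample rate and the 440 Hz test tone are just made-up example values): summing a track with a phase-inverted copy of itself gives pure silence.

```python
import numpy as np

sr = 44100                            # assumed sample rate in Hz
t = np.arange(sr) / sr                # one second worth of time stamps
track = np.sin(2 * np.pi * 440 * t)   # a 440 Hz sine standing in for a "track"

inverted = -track                     # phase inversion = flipping the sign
mix = track + inverted                # mixing two tracks = summing them

print(np.max(np.abs(mix)))            # prints 0.0: the two tracks cancel completely
```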
Sometimes you can play with phase to achieve interesting effects. One such effect is known as the "Haas" effect. Basically you take a track and hard-pan it to the left speaker. Then you take a copy of that track, hard-pan it to the right speaker, and let it play starting a few milliseconds after the first track. As a result you get a very spacious, open sound. Try it out in your favourite audio tool. Another trick is the out-of-speakers trick, where you keep the tracks time-aligned but invert the phase of one of the tracks. This results in sound that seems to come from all around you. It works best with low-frequency (i.e. low notes) sounds.
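A rough sketch of both tricks (the 15 ms delay is just a typical Haas-range value I picked, not a prescribed one):

```python
import numpy as np

def haas_stereo(mono, sr, delay_ms=15.0):
    """Haas trick: the original hard left, a copy delayed by a few ms hard right."""
    delay = int(sr * delay_ms / 1000.0)
    left = np.concatenate([mono, np.zeros(delay)])
    right = np.concatenate([np.zeros(delay), mono])
    return np.column_stack([left, right])      # stereo signal, shape (samples, 2)

def out_of_speakers(mono):
    """Out-of-speakers trick: keep the channels time-aligned, invert one phase."""
    return np.column_stack([mono, -mono])
```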
Fading
Fading is adjusting the volume of your track, in order to give each instrument an equal chance of being heard. To add interest to a recording, many programs allow automating volume levels, so that variations can occur throughout the song. Beware though: humans are only human, and psychoacoustics dictate that "louder" gives the impression of sounding "better". The result is often that volumes tend to be increased, to the point where they don't make sense anymore, or don't leave enough room for other instruments to be added. If the volume is set too high, clipping can also occur, which results in considerable distortion (typically clicking or crackling sounds) in the end result.
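In its simplest form, volume automation is just multiplying the track by a time-varying gain. A minimal sketch (the linear envelope is my own simplification; real tools interpolate between many breakpoints):

```python
import numpy as np

def apply_fade(track, gain_start, gain_end):
    """Linear volume automation between two gain values."""
    envelope = np.linspace(gain_start, gain_end, len(track))
    return track * envelope

# e.g. fade a track in from silence: apply_fade(track, 0.0, 1.0)
# gains pushing samples beyond full scale (1.0) are what causes clipping
```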
Panning
With panning you can give different instruments a different spot on the virtual sound stage you are creating in your mix. Its effect is to make the sound come more from the left or from the right. When speaking about panning, one often refers to the sound as coming from a different "hour". Hard left would be 7:00, hard right would be 17:00 (i.e. 5 o'clock), and right in front of you would be 12:00.
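There are several pan laws in use; as an illustration, here is a sketch of a constant-power pan (my choice of law, not something prescribed above), mapping a position between -1.0 (hard left) and +1.0 (hard right) to left/right gains:

```python
import numpy as np

def pan(mono, position):
    """Constant-power pan: -1.0 = hard left, 0.0 = centre, +1.0 = hard right."""
    angle = (position + 1.0) * np.pi / 4.0     # map [-1, 1] onto [0, pi/2]
    left = np.cos(angle) * mono
    right = np.sin(angle) * mono
    return np.column_stack([left, right])
```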
Compressors and other dynamic range processors
The word compression has two different meanings. On the one hand it is used to denote the process of representing digital recordings with fewer bytes (like .mp3 is a compressed version of .wav). This is not the meaning that is used in audio mixing.
In audio mixing, compression means something different: it means reducing the dynamic range, i.e. reducing the difference between the loudest and quietest parts of a track. That way the recorded track can blend better with other tracks. Compression is often used on vocals: without it, the quieter parts of the singing risk drowning in the sounds of the other tracks. In this context: apparently in the recording industry there's an ongoing loudness war: by (ab)using compression, recordings are made to sound as loud as possible. The downside is of course that a lot of dynamic range is lost that way and the music becomes less interesting as a result. Different applications of compressors include (a small gain-curve sketch follows the list):
- Compressor: make loud sounds quieter (while keeping a sense of louder and quieter); keep quiet sounds at their original volume
- Limiter: ensures that the volume never exceeds a given maximum. The volume of any sound louder than some threshold is brought back to the threshold; the volume of sounds quieter than the threshold is kept as-is.
- Expander: make quiet sounds quieter; keep louder sounds at their original level
- Upward compressor: make quiet sounds louder; keep loud sounds at their original volume
- Upward expander: make loud sounds even louder; keep quiet sounds at their original volume
- Gate: make all signals with a volume below some threshold a lot quieter (by a fixed amount known as the range; often they are removed completely)
- Ducker: make all signals with a volume above some threshold a lot quieter (by a fixed amount known as the range)
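To get a feel for the differences, here is a toy sketch of the static gain curves of a compressor, limiter and gate, expressed in dB (all thresholds, ratios and ranges are arbitrary example values, and real units also have attack/release behaviour which is ignored here):

```python
import numpy as np

def compressor_gain(level_db, threshold_db=-20.0, ratio=4.0):
    """Above the threshold, output only rises 1 dB per `ratio` dB of input."""
    over = np.maximum(level_db - threshold_db, 0.0)
    return -over * (1.0 - 1.0 / ratio)          # gain reduction in dB

def limiter_gain(level_db, threshold_db=-3.0):
    """A limiter behaves like a compressor with an (effectively) infinite ratio."""
    return -np.maximum(level_db - threshold_db, 0.0)

def gate_gain(level_db, threshold_db=-50.0, range_db=60.0):
    """Below the threshold, attenuate by a fixed amount (the range)."""
    return np.where(level_db < threshold_db, -range_db, 0.0)
```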
Compressors can have some unwanted side effects, producing varying noise/hiss levels (breathing) or sudden noticeable level variations (pumping). In case of extreme compression one also loses dynamic range, to the point of making the music less interesting. When used properly, compression can make sounds denser, warmer and louder. Compression can move sounds forward and backward in the virtual sound stage. To a certain extent it can be used to remove sibilance.
Equalizing
Equalizers can have various effects on your sound. They are not so easy to use effectively. They can influence separation (hearing details from individual instruments), set feelings and moods, make instruments sound different, add depth to the sound, and suppress unwanted content (like constant background noise or humming). To a certain extent they can also compensate for bad recordings, or suppress unwanted sibilance.
Sibilance is the piercing sound produced when recording sharp consonants ("s", "sh", "ch", "t"), which can be quite disturbing in a song.
Equalizers typically work on a part of the frequency spectrum. Depending on the part of the spectrum you operate on, and the kind of operations you do on it, you get wildly different effects from the equalization. In short: equalizing is like the Swiss army knife of audio mixing, and it requires (a lot) more investigation from my side.
A rule of thumb seems to be that equalization should be done after compressing.
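As one concrete example of the "suppress unwanted content" use case, a narrow notch filter can remove mains hum while leaving the rest of the spectrum mostly untouched. A sketch using scipy (the 50 Hz hum frequency and the Q value are assumptions on my part):

```python
from scipy.signal import iirnotch, lfilter

def remove_hum(track, sr, hum_hz=50.0, q=30.0):
    """Notch out a narrow band around the mains hum frequency."""
    b, a = iirnotch(hum_hz, q, fs=sr)
    return lfilter(b, a, track)
```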
Reverb
Ever noticed how the same instrument can sound very different in a different room? One of the factors that determine this difference in sound is the reverb. As you emit sound, it travels through the room and bounces off walls and furniture. Some frequencies will be absorbed by the materials in the room, others less so (this is a kind of natural equalization taking place). After a while, delayed and filtered copies of the sound (reflections from the walls) arrive back at the listener.
Different ways exist to add reverb to a signal. One interesting technique involves using an impulse response of a room. Think: you clap your hands and record the sound with all the echoes this makes. This is more or less the impulse response of the room. Now you can apply this same echo to any other sound using a mathematical operation known as convolution. So if you have the impulse response of a famous concert hall, you can make your piece sound as if it was played in that concert hall. The downside of convolution with an impulse response is its computational requirements, and its inflexibility: there are no parameters you can tweak. For this reason other algorithms have also been developed that allow more flexibility in defining the reverb (e.g. choosing the room size). Convolution with a measured impulse response, on the other hand, tends to give a natural-sounding result almost automatically. Note that in principle you could also apply convolution between any two samples (say: an oboe and a talking voice). The result is cross-synthesis, i.e. the oboe seems to speak the words of the voice.
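Assuming you already have an impulse response loaded as a numpy array, convolution reverb is only a few lines; the wet/dry mix parameter below is my own addition:

```python
import numpy as np
from scipy.signal import fftconvolve

def convolution_reverb(dry, impulse_response, wet_mix=0.3):
    """Convolve the dry track with a room's impulse response and blend it back in."""
    wet = fftconvolve(dry, impulse_response)[:len(dry)]
    wet /= np.max(np.abs(wet)) + 1e-12          # keep the wet signal from clipping
    return (1.0 - wet_mix) * dry + wet_mix * wet
```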
Reverb is used to add depth to a recording. A common reverb applied to all tracks can make them sound more compatible, fit better together in the mix (on the other hand, applying different reverb to different tracks can help to increase the separation between instruments). Reverb can fill up pauses in the sound. It also contributes to the mood.
Delay
Delay delays a signal by a given amount of time. Mixing the original signal with the delayed signal creates an echo, but if the delay is short enough, we won't perceive the mix as two distinct copies of the same sound. In case of very short delays, be careful of phase differences between the original signal and the delayed copy: they can lead to unwanted side effects during mixing. With slightly longer delays you get a doubling effect (the basis for a chorus effect, making the sound less dry). With longer delays still, you create an echo effect. See also the explanation about the "Haas" effect in the section about phase, which is another application of delay.
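A small echo sketch (the delay time, feedback amount and number of repeats are made-up defaults): each repeat is the original track shifted further in time and mixed back in at a lower volume.

```python
import numpy as np

def echo(track, sr, delay_ms=250.0, feedback=0.4, repeats=4):
    """Mix progressively quieter delayed copies back into the track."""
    delay = int(sr * delay_ms / 1000.0)
    out = np.copy(track)
    gain = feedback
    for i in range(1, repeats + 1):
        offset = i * delay
        if offset >= len(track):
            break
        out[offset:] += gain * track[:len(track) - offset]
        gain *= feedback
    return out
```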
Vibrato and tremolo
Two kinds of vibrato exist: frequency vibrato and amplitude vibrato (sometimes called tremolo). In frequency vibrato one rapidly varies the pitch, in tremolo one rapidly varies the volume.
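Tremolo is the easier of the two to sketch: multiply the track by a slow sine wave (the 5 Hz rate and the depth are arbitrary example values). Frequency vibrato needs a time-varying delay or resampling, so it is left out here.

```python
import numpy as np

def tremolo(track, sr, rate_hz=5.0, depth=0.5):
    """Amplitude vibrato: modulate the volume with a low-frequency sine wave."""
    t = np.arange(len(track)) / sr
    lfo = 1.0 - depth * 0.5 * (1.0 + np.sin(2 * np.pi * rate_hz * t))
    return track * lfo
```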
Distortion
Distortion adds to the aggressiveness of the end result. It is also a way of adding imperfections to sound, rendering it less boring (when applied skillfully ;) ). The easiest and least subtle way to add distortion is to clip sounds to a certain limit. Other techniques include applying amplifier simulators and bit reduction techniques.
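Hard clipping and a softer, amplifier-like saturation in a few lines (the tanh curve is just one common choice on my part, not the only option):

```python
import numpy as np

def hard_clip(track, limit=0.3):
    """The crudest distortion: chop off everything beyond the limit."""
    return np.clip(track, -limit, limit)

def soft_clip(track, drive=4.0):
    """A gentler saturation curve, loosely mimicking an overdriven amplifier."""
    return np.tanh(drive * track) / np.tanh(drive)
```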
Pitch shifting, time stretching and harmonizing
Pitch shifting and time stretching are very closely related to each other (at least from a mathematical point of view). If you play back the same recording faster (e.g. by selecting a different sampling rate or by making a tape run faster), it will become shorter, but its pitch will also increase. Sometimes you want to make recordings faster or slower without affecting the pitch. This is not easy to accomplish, and different instruments typically require different algorithms to get a convincing result. Also, when the stretching or pitch shifting is very extreme, sound quality will clearly suffer, even with the best algorithms. Pitch shifters are useful to correct instruments and vocals that are off-tune. They can also be used to turn a piece with one voice into a choir piece with multiple voices singing different pitches simultaneously (harmonizing). The algorithms used typically offer some parameters that allow you to create special effects as a side effect of the pitch shifting or time stretching.
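To illustrate how pitch and speed are coupled: naively resampling a track and playing it back at the original rate changes both at once. A sketch (real pitch shifters that keep the duration constant are far more involved than this):

```python
from scipy.signal import resample

def naive_pitch_shift(track, semitones):
    """Resample the track; played back at the original rate, pitch AND duration change."""
    factor = 2.0 ** (semitones / 12.0)          # e.g. +12 semitones -> twice as fast
    new_length = int(len(track) / factor)
    return resample(track, new_length)
```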
Conclusions
This is more than enough material for this post. As you can see, there's more to mixing audio than just throwing different tracks together. This post only scratched the surface of what mixing is all about. The real difficulty starts when, presented with some recordings, you have to make sense of it all: decide which effects to apply, how to configure their numerous parameters, in what order to apply them, etc., in order to reach a desired end result.
Ideas for future posts (but those may never get written, or at least not in the coming years, since I still have no experience with all these things):
- in-depth discussions of single effects, illustrated with LADSPA or other plugins in some popular tools (Audacity, snd, Ardour, ...?)
- topical discussions, like: how to clean up vocals
Maybe all these tutorials and discussions already exist somewhere, in which case I'd be happy to see some links in the comments section.