MP3-Encoding
MP3 compression uses four primary techniques to save space. In case you’re interested in how it works, here’s a short explanation:
1. Perceptual Coding: A recording contains many components that the human ear cannot easily hear. An example is masking: when one sound is much louder than another, only the louder one is heard. Ever hear a record with pops and scratches? It’s unlikely you can hear those pops during a screaming guitar solo, so the encoder can leave out those quieter sounds. Perceptual coding makes these and other approximations that sound the same to a human ear. All of these are lossy methods, but if done right the finished product will not be perceptibly different.
2. Huffman Encoding: Huffman encoding is one of the techniques used inside ZIP files. It assigns short bit codes to values that occur often and longer codes to values that occur rarely, so the data as a whole takes fewer bits. For example, if the value 0 makes up most of the data, it might be stored as a single bit, while rarer values get codes of two or three bits. This method is lossless.
3. Joint Stereo: Stereo recordings have two tracks. In most stereo recordings, the left and right tracks are similar enough that you can reduce the amount of information you need to store about both. Most implementations are lossless, although there are lossy implementations. You should avoid the lossy forms.
4. Bit Reservoirs: When a passage needs fewer bits than the bitrate allows, the encoder saves the unused bits and spends them later on passages that are harder to encode. This gives fixed bitrate (CBR) files some of the flexibility of VBR encoding, which uses fewer bits to record simple sounds and more bits for complex ones.
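To make the Huffman step in the list above more concrete, here is a minimal Python sketch of generic Huffman coding. It is not the MP3 encoder's actual implementation (real MP3 encoders pick from predefined code tables); the function names `huffman_codes`, `encode` and `decode` and the sample data are our own, chosen only for illustration.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Return a dict mapping each symbol in data to its Huffman bit string."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(data, codes):
    """Concatenate the code of every symbol into one bit string."""
    return "".join(codes[s] for s in data)


def decode(bits, codes):
    """Rebuild the original symbols; works because no code prefixes another."""
    inverse = {c: s for s, c in codes.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Illustrative data: the value 0 is common, the others are rare.
samples = [0, 0, 0, 0, 0, 0, 1, 1, 2, 3]
codes = huffman_codes(samples)
bits = encode(samples, codes)
print(codes, len(bits), "bits")
```

The common value ends up with the shortest code, and decoding returns exactly the original list, which is what makes the method lossless.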
MP3 Encoding Concepts and Suggested Optimal Settings
Bitrate: The bitrate of an MP3 encoded file determines the overall sound quality. It is measured in kilobits per second (kbit/s). The higher the number, the higher the quality of the sound, but also the larger the file. MPEG-1 Layer 3 files can have one of the following values: 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256 and 320 kbit/s.
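The link between bitrate and file size is simple arithmetic, and a short Python sketch can make it concrete. The function name `mp3_size_bytes` and the four-minute song length are our own assumptions for illustration, and the result covers only the audio data, not tags or other extras.

```python
# Bitrates allowed for MPEG-1 Layer 3, in kbit/s (from the list above).
MPEG1_LAYER3_BITRATES = (32, 40, 48, 56, 64, 80, 96, 112,
                         128, 160, 192, 224, 256, 320)


def mp3_size_bytes(bitrate_kbps, duration_seconds):
    """Approximate size of the audio data in a constant-bitrate MP3."""
    if bitrate_kbps not in MPEG1_LAYER3_BITRATES:
        raise ValueError(f"{bitrate_kbps} kbit/s is not an MPEG-1 Layer 3 bitrate")
    # 1 kbit = 1000 bits, and 8 bits make one byte.
    return bitrate_kbps * 1000 * duration_seconds / 8


# Hypothetical four-minute (240 second) song at a few bitrates.
for rate in (64, 128, 192, 320):
    size_mb = mp3_size_bytes(rate, 240) / 1_000_000
    print(f"{rate} kbit/s: about {size_mb:.2f} MB")
```

At 128 kbit/s, a 240-second song works out to 128,000 × 240 / 8 = 3,840,000 bytes, and doubling the bitrate doubles the size.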
Bitrate options: The simplest MP3 format uses just one bitrate for the entire file. This is called Constant Bitrate, or CBR. You can also set the encoder to automatically lower the bitrate for simpler sounds (such as silence) and raise the bitrate when the sounds are more complex. This is called Variable Bitrate or VBR, and can save size while doing a better job of capturing complex sounds.
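As a rough illustration of how VBR can save space, the Python sketch below adds up frame sizes for a CBR stream and a VBR stream. It relies on two commonly cited facts about MPEG-1 Layer 3: each frame holds 1152 samples, and a frame occupies about 144 × bitrate / sample rate bytes (the optional padding byte is ignored here). The per-frame bitrate lists are invented for illustration and do not come from a real encoder.

```python
SAMPLES_PER_FRAME = 1152  # samples in one MPEG-1 Layer 3 frame


def frame_bytes(bitrate_kbps, sample_rate_hz=44100):
    """Frame size in bytes, ignoring the optional padding byte."""
    # 144 = 1152 samples per frame / 8 bits per byte.
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def stream_stats(frame_bitrates_kbps, sample_rate_hz=44100):
    """Return (total_bytes, average_kbps) for a list of per-frame bitrates."""
    total_bytes = sum(frame_bytes(b, sample_rate_hz) for b in frame_bitrates_kbps)
    duration = len(frame_bitrates_kbps) * SAMPLES_PER_FRAME / sample_rate_hz
    average_kbps = total_bytes * 8 / duration / 1000
    return total_bytes, average_kbps


# Hypothetical streams of 100 frames each (about 2.6 seconds at 44.1 kHz).
cbr_frames = [128] * 100             # every frame at 128 kbit/s
vbr_frames = [32] * 50 + [192] * 50  # quiet half low, busy half high

for name, frames in (("CBR", cbr_frames), ("VBR", vbr_frames)):
    total, average = stream_stats(frames)
    print(f"{name}: {total} bytes, average {average:.1f} kbit/s")
```

In this made-up case the VBR stream is smaller overall even though its busy frames get more bits than the CBR stream gives them.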
Sampling Frequency: Computers record sound by sampling the source. Rather than capturing a continuous stream, like a tape, digital recording measures the sound thousands of times per second. This sampling rate is expressed in kilohertz (kHz), where 1 kHz is 1,000 samples per second. The higher the rate, the better the quality. You must play the sound back at the same rate at which you sampled it. The available sample rates for an MPEG-1 Layer 3 file are 32, 44.1 and 48 kHz.
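To see why the playback rate has to match the sampling rate, here is a small Python sketch of what a mismatch does to speed, length and pitch. The function name `playback_effect` and the 240-second song are our own assumptions for illustration, and the sketch assumes the player simply feeds the samples out at the wrong rate without resampling.

```python
import math


def playback_effect(recorded_hz, playback_hz, duration_seconds):
    """Return (speed_factor, new_duration_seconds, pitch_shift_semitones)."""
    # The same samples are played at a different pace.
    speed = playback_hz / recorded_hz
    new_duration = duration_seconds / speed
    # One octave (12 semitones) corresponds to a doubling of speed.
    semitones = 12 * math.log2(speed)
    return speed, new_duration, semitones


# A hypothetical 240-second song sampled at 48 kHz, played at 44.1 kHz.
speed, new_duration, semitones = playback_effect(48000, 44100, 240)
print(f"speed x{speed:.5f}, lasts {new_duration:.1f} s, "
      f"pitch shift {semitones:.2f} semitones")
```

The speed factor here is 44.1 / 48 = 0.91875, so the song runs about 8% slow and sounds roughly one and a half semitones lower.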
Stereo vs. Joint Stereo: There are two options for stereo: the normal stereo setting, and Joint Stereo. A normal stereo MP3 has two tracks, a left and a right track. Joint Stereo saves space by exploiting the similarities between the left and right channels, which reduces the need to capture the same information twice. The most common type, the Mid/Side Joint Stereo method, is a lossless method of compression. The intensity stereo method is lossy, and should only be used when extreme size reductions are necessary. It’s usually not used, and we recommend that you avoid it.
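The Mid/Side idea can be sketched in a few lines of Python. This is a simplified illustration, not what an MP3 encoder does internally: it uses a plain sum and difference so that whole-number samples convert back exactly, whereas real encoders scale the two channels differently. The function names and sample values are our own.

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (sum) and side (difference)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Recover the original left/right samples exactly."""
    # mid + side = 2 * left and mid - side = 2 * right, so both are even.
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


# Made-up samples in which the two channels are nearly identical.
left = [100, -50, 7, 1200]
right = [98, -52, 7, 1195]

mid, side = to_mid_side(left, right)
print("mid: ", mid)
print("side:", side)
print("round trip ok:", from_mid_side(mid, side) == (left, right))
```

When the channels are similar, the side values stay close to zero, so they need far fewer bits than a full second channel, and the original channels can still be rebuilt without loss.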
Table of Optimal Settings
Now that we’ve introduced these concepts, here are our suggestions, with an explanation of each. These were born out of trial and error while we encoded over 400 songs of our own music.
| Setting | Suggested Value | Explanation |
|---|---|---|
| Format | MP3 | While there are other standards, such as MP3 Pro and AAC, none of them is as universally compatible as the MP3 standard. You want everyone to be able to play your files. |
| Bitrate | 128 kbit/s | Anything lower is poor quality. Anything higher takes too long to download. |
| Bitrate Option | CBR | VBR is better, but not all players handle it properly. |
| Sampling Frequency | 44.1 kHz | 44.1 kHz is CD quality, and most players default to it. Playing a 48 kHz file at 44.1 kHz results in the song sounding “slow.” Try playing a 48 kHz file in a player that uses 44.1 kHz so you can attune your ear to it, and catch mistakes in your encoding. |
| Stereo Option | Joint Stereo | Joint stereo saves you space, usually without losing any quality in modern implementations. |

