ÜberStandard
Music (and people) worth listening to.
FLAC, MP3, Vorbis and the Future of Encoding
This article examines the most popular high-quality audio encoders and explains why VBR LAME MP3 with the -V2 option was chosen for the ÜberStandard. It's important to note that each of the encoders addressed has its own set of relative strengths and weaknesses, and that there is no "best" encoder; each may be the "best" in different respects. Rather than try to make a case for the ÜberStandard as being superior in every way to all of the alternatives, the aim of this article is instead to soberly address the reasons why we believe, in the vast majority of applications, it offers the best set of compromises.

Analysis Criteria:

  • Perceptual transparency is the lowest acceptable audio quality level. (If a human judge can consistently tell the difference between the encoded file and the original, then the encode quality is too low.)
  • To prevent biases, guesswork, and the placebo effect from contaminating results, all quality judgments considered must be based on blind tests.
  • Encoders producing small files are preferred over those producing significantly larger files.
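The blind-test criterion can be made concrete with a little arithmetic. The sketch below assumes an ABX-style protocol, in which each trial is a two-way forced choice; the article does not say which protocol its sources used, and the function name is ours. It computes how likely a listener is to reach a given score by guessing alone.

```python
from math import comb


def guessing_p_value(correct: int, trials: int) -> float:
    """Probability of scoring at least `correct` out of `trials` by pure guessing.

    Assumes each trial is an independent 50/50 choice, as in an ABX test.
    """
    if not 0 <= correct <= trials:
        raise ValueError("correct must be between 0 and trials")
    # Count the outcomes with `correct` or more hits, out of 2**trials in total.
    favourable = sum(comb(trials, k) for k in range(correct, trials + 1))
    return favourable / 2 ** trials


if __name__ == "__main__":
    # 12 correct answers out of 16 trials
    print(f"{guessing_p_value(12, 16):.4f}")  # prints 0.0384
```

A small value means the score is unlikely to be luck, so the listener probably can hear a difference. A value near 0.5 or above is what you would expect from someone who cannot.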

Perceptual Transparency

Bitmap original: 171 KB
JPEG at 95% quality: 47.5 KB

In data compression, perceptual transparency is the level at which a human observer (or a panel of them) cannot distinguish between the compressed file and the original. The two images above, of the Scottish Royal Coat of Arms, show an analogous visual example. The one on the left is the bitmap original with no compression applied, and the other is a JPEG file compressed at Photoshop's "Very High" quality. The JPEG is a little over one quarter the size of the bitmap (47.5 KB versus 171 KB), a tremendous space savings, yet without computer assistance and intense magnification it is not possible to tell them apart. In short, this is the aim of the ÜberStandard: to get file sizes as low as possible without degrading perceived quality.

Note: All people are slightly different. There is a tiny minority of individuals (maybe less than 0.01%) within the audiophile community who possess "golden ears" and have what appears to be super-human hearing. Yet even these extraordinary people find it exceedingly difficult to distinguish our Über Music from the uncompressed originals in blind listening tests.

FLAC

The Free Lossless Audio Codec, or FLAC, is different from MP3 in that there is no loss of information: a FLAC file decodes to audio that is bit-for-bit identical to the original. It's actually much more like the ZIP compression algorithm, except that FLAC is tuned exclusively for audio. There are absolutely no audio quality issues with FLAC; because it is lossless, even a computer can't distinguish the decoded audio from the original.

So is FLAC the fulfillment of our dreams?

Hardly... Assuming an average length of ~50 minutes per CD, the average ÜberStandard MP3 CD is ~75 MB, while the average FLAC CD is a whopping ~300 MB, yet in blind comparisons, it is not possible to tell them apart. We believe that the 225 MB difference per CD (on average) is wasted space and decidedly un-Über in all but the rarest of circumstances. FLAC's principal advantage (its lossless algorithm) means that it can be decompressed, edited, then compressed again, and so on with no quality degradation; doing that with a lossy codec like MP3 would result in unacceptable generation loss. This makes FLAC an ideal codec for production engineers who routinely re-compress audio and must avoid generation loss, but the poor file size reduction it achieves (by comparison) makes it a very bad choice as an end-state encoder. In other words, if you're a recording artist or a video editor for a major motion picture studio, then FLAC is a good choice; but if you're ripping music to listen to it rather than to edit it, FLAC is an awful choice.
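The size figures above follow from bitrate multiplied by duration. The sketch below is a rough check, assuming decimal megabytes (10^6 bytes), a 190 kbps average for -V2, and standard CD audio (44.1 kHz, 16-bit, stereo); the function and constant names are ours, and real files also carry tags and container overhead.

```python
def size_mb(bitrate_kbps: float, minutes: float) -> float:
    """Approximate size in MB (10**6 bytes) of audio at a given average bitrate."""
    bits = bitrate_kbps * 1000 * minutes * 60
    return bits / 8 / 1_000_000


# Uncompressed CD audio: 44,100 samples/s * 16 bits * 2 channels = 1411.2 kbps
CD_PCM_KBPS = 44_100 * 16 * 2 / 1000

if __name__ == "__main__":
    mp3 = size_mb(190, 50)          # about 71 MB, close to the ~75 MB quoted
    pcm = size_mb(CD_PCM_KBPS, 50)  # about 529 MB of raw PCM
    print(f"MP3 at 190 kbps: {mp3:.1f} MB")
    print(f"Raw CD audio:    {pcm:.1f} MB")
    print(f"300 MB of FLAC is {300 / pcm:.0%} of the raw size")
```

Under these assumptions, the ~300 MB FLAC figure corresponds to a bit under 60% of the uncompressed audio, while the MP3 is under 15% of it.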

A superficially compelling argument has been made that FLAC is viable for audiophiles because it is future proof. By that, the advocates of FLAC mean that, by virtue of its ability to be decoded and re-encoded without generation loss, users will hypothetically be in a position to take advantage of future advances in lossy encoders. But this argument is specious. Even if breathtaking advances in lossy compression technology DID occur within a human lifetime, the same argument in favor of FLAC would still apply afterwards. Proponents of lossless encoders could still claim that audiophiles should steer clear of lossy encoding, no matter how good it is, because in the future it could be even more wonderful. Therefore, FLAC users who follow this line of reasoning could NEVER enjoy the benefits of lossy encoders, and the argument falls apart.

Arguments have also been made based on the future uncertainty of FLAC itself: maybe lossless encoders will dramatically improve, and music compressed by earlier versions of the codec could be re-encoded with an updated and vastly superior release of FLAC. This argument is also a fallacy, because the future of lossless compression is not uncertain at all. Shannon's source coding theorem shows that lossless compression cannot, on average, shrink data below the entropy of its source, so there is a hard mathematical floor on file size. The hard work of software engineers will undoubtedly lead to minor improvements in the efficiency of FLAC, but it is mathematically impossible for a breakthrough in technology to significantly reduce the file sizes produced by lossless audio compression. In a very real sense, with lossless compression technology at least, the future is now, and it's far from the best choice in most cases.
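The limit on lossless compression can be illustrated with a general-purpose compressor. The sketch below uses Python's zlib as a stand-in (it is not FLAC) on two synthetic inputs of our own choosing. Note that `byte_entropy_bits` is only a zeroth-order estimate that ignores correlations between bytes; the theorem's bound concerns the true entropy of the source, which can be much lower for structured data.

```python
import math
import random
import zlib
from collections import Counter


def byte_entropy_bits(data: bytes) -> float:
    """Zeroth-order entropy estimate in bits per byte (ignores inter-byte structure)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def compressed_ratio(data: bytes) -> float:
    """Size of the zlib-compressed data relative to the input size."""
    return len(zlib.compress(data, 9)) / len(data)


rng = random.Random(0)  # fixed seed so the example is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(100_000))  # no structure to exploit
repetitive = b"abcd" * 25_000                              # highly structured

if __name__ == "__main__":
    for name, data in (("noise", noise), ("repetitive", repetitive)):
        print(f"{name}: {byte_entropy_bits(data):.3f} bits/byte, "
              f"compressed to {compressed_ratio(data):.1%} of original")
```

The noise has close to 8 bits of entropy per byte and does not shrink at all, however clever the compressor. The repetitive input shrinks dramatically because its true information content is tiny. Music falls in between, and a lossless codec can only remove the redundancy that is actually there.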

aoTuV (ogg) Vorbis

Vorbis is a lossy audio codec similar to MP3. A meta-analysis of many audiophile listening tests seems to indicate that Vorbis achieves perceptual transparency at a slightly lower bitrate than MP3. However, it appears that LAME and Vorbis, being well-developed encoders, are both asymptotically approaching the maximum effectiveness of compression at the level of perceptual transparency, as described by Shannon's theorem. (There is a certain degree of complexity which is mathematically impossible to remove without degrading quality below acceptable levels.) Furthermore, there is some empirical data indicating that slightly superior fine-tuning within Vorbis has improved encoding for some kinds of music (classical most notably) and has led to marginally better overall test averages. The level of perceptual transparency using either MP3 or Vorbis seems to hover around 180 kbps, a little more for some types of music and a little less for others (especially classical music encoded with Vorbis). Clearly, though, a standard should not be set based on performance within one genre of music alone, and with either Vorbis or MP3, the ÜberStandard would need to be set closer to a 190 kbps+ target bitrate (LAME's -V2) for a safe margin of error and for consistent performance across all music genres.

It looks like LAME MP3 and aoTuV Vorbis are roughly tied in terms of overall perceptual transparency level: they both require a 190 kbps+ target for consistently transparent results. But Vorbis has some other considerable drawbacks. It's slightly more demanding on system resources to decode, and it's compatible with far fewer devices and software titles. It's a close runner-up, but in the final tally it's slightly inferior unless you're encoding classical music exclusively with no regard for compatibility issues.

LAME MP3

We believe this represents the very best in audio compression. It's by far the most popular audio format in existence, it's supported by nearly every portable audio player ever made, there's a huge number of software titles for organizing and editing meta tags, and it achieves a compression-to-quality ratio that's at least comparable to the less popular lossy alternatives. It has a terrific implementation of volume consistency called ReplayGain, which alone strongly encourages, if not warrants, choosing MP3 as the de facto standard. Furthermore, due to widespread acceptance and online distribution, the up-and-coming software titles and handheld devices of the next decade will surely support MP3; it's a standard that's here to stay. Little of that can be said for sure about the alternatives.

We can confidently say that it's safe to compress music using LAME's -V2 option, throw the source CDs away, and never look back. The future improvements to audio compression technology that can reasonably be expected will be so minor that re-ripping will not be worth the time involved. We know this because it's been true since late 2001 with the release of LAME version 3.90 (provided proper extraction was done in the first place), and because there are mathematical laws limiting how effective audio encoding can possibly be. Compression is about as good now as it is likely to get, so rip using the ÜberStandard and enjoy.
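For readers who script their encodes, here is a minimal sketch of calling the LAME command-line encoder with the -V2 option. It assumes a `lame` executable is installed and on the PATH and that the audio has already been extracted to WAV; the helper names and file paths are ours, not part of the ÜberStandard tooling.

```python
import shutil
import subprocess
from pathlib import Path


def lame_v2_command(wav_path: str, mp3_path: str) -> list[str]:
    """Build the argument list for a VBR -V2 encode with the lame CLI."""
    return ["lame", "-V2", wav_path, mp3_path]


def encode_v2(wav_path: str) -> Path:
    """Encode one WAV file to MP3 next to the source; raises if lame is missing or fails."""
    if shutil.which("lame") is None:
        raise RuntimeError("lame executable not found on PATH")
    mp3_path = Path(wav_path).with_suffix(".mp3")
    subprocess.run(lame_v2_command(wav_path, str(mp3_path)), check=True)
    return mp3_path


if __name__ == "__main__":
    # Hypothetical file name, for illustration only.
    print(" ".join(lame_v2_command("track01.wav", "track01.mp3")))
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file names that contain spaces or punctuation.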


References:
Audio Codec Quality Shootout
Multiformat Listening Test
Josef Pohm's Lossless Comparison
Sebastian's Public Listening Tests
Group Listening Tests of Various Formats
Guruboolez MPC vs Vorbis vs MP3 vs AAC at 180 kbps
MP3 WMA AAC OGG
Radified Guide to Ripping AND Encoding CD Audio
Lossy audio formats comparison
Codec Comparisons