Thursday, June 24, 2010

My complete guide to digital audio


A lot of people who see/hear my home audio setup like how all of my music is available to stream from my computer.  I can play any album I own and stream it to any device in my house (or remotely).  I have multiple convenient ways to control all my audio devices depending upon what I'm doing.  Finally, I am a bit of an audiophile (and love my music), so I've gone to a bit of trouble to preserve all of the quality from my audio CDs.  I've invested quite a bit of money in my collection and plan to enjoy it for many years to come.

It has taken me a while to get a system/method that I'm happy with (being a perfectionist) and I thought I'd share this setup with others.   This is my complete guide to digital audio.

By complete I mean this guide covers everything that I want to do with digital audio (CDs) step-by-step:
  1. The Perfect Rip - copy CDs to your computer
  2. Format Conversion - turn your CDs into MP3s, AACs, etc.
  3. The Ultimate Burn - create perfect audio CDs
  4. Modding SqueezeCenter - for on-the-fly HDCD processing
  5. Backup all my CDs (and other Media) and keep them up to date
  6. Finally, I have some tools for checking for missing album artwork, duplicates, corrupted files, etc.
This is far too much for a single post, so I've broken it up into separate parts.

This main post talks about some of the challenges/choices with digital audio and how I approached the problem.  For the actual instructions for each step, click on the links above.

NOTE:  (more) LINKS above coming soon....

BTW: The cornerstone for playing my music is currently the Squeezebox product line which I've blogged about.  However, I'm confident that if I switch to some new music system in the future, the formats and steps that I've chosen will be easily adaptable.

If you run Windows (XP, Vista, 7),  the steps above should work directly for you.  Hopefully they aren't hard to follow, and feel free to ask questions/comment.  If you run a Mac/Linux, you probably can use some of this with a Virtual Machine (or with Wine).  All the software I use is free to download (including the scripts that I wrote).  At some point, I'll probably post similar instructions using native tools on Linux/Mac.

Finally, there are other digital audio guides and these steps are clearly not for everyone.  Feel free to pick and choose anything of value that you may find.  I admit to being obsessive compulsive, but that is half the fun of technology when applied to some truly useful purpose.

First, a little background

Two lessons I learned early in my career:
  1. Always preserve any original data you collect, unmodified, without loss, whenever possible.
  2. Nobody has time to do it right.  When (not if) things go wrong, we find time to do it again.
What do these have to do with digital audio?  When I first ripped my audio collection, I thought I was being quite clever by compressing my music with the LAME encoder at a high bit rate.  I was confident that I wouldn't hear any difference with a better encoding scheme (I wasn't aware of lossless encoders, and drive space wasn't cheap yet).

However, I was really disappointed one day when listening to my jazz albums when I heard clicks (and in more than one album).  Whenever I played my original CD, it had no such clicks.  How could this be?  CDs are digital and transcoding them to MP3s shouldn't ruin the sound like that.  This was unacceptable.

Of course, now I began to wonder if the effort I had put into ripping my entire CD collection had been wasted (Obvious answer: yes).   However, it drove me to dive a bit deeper into what is really going on when you rip a CD and how you can do it better.

Many challenges to ripping CDs

First of all, when people rip a CD there are two distinct processes usually going on.  They are:
  1. Digital Audio Extraction (or DAE)
  2. (optionally) Digital Audio Compression
DAE is the process of digitally reading the audio data off of the CD.

Also, since audio data on CDs are uncompressed, usually people want to make the file sizes smaller.  This can be done in one of two ways:  Lossy (bad) or Lossless (good).  More on that in a minute.

How can DAE go wrong?

You would think digital audio data should be easy to read from a CD, and for the most part it is.  But when things go wrong, bad-things-happen(tm) and (worse) they usually do so without warning.  Before certain tools were available, you had to listen to each rip to be sure it was good (argh!).

Audio data on CDs are protected by standard error correction algorithms which help, but they are far from perfect.  There are actually two levels of error protection (referred to as C1 and C2).  Unfortunately, not all CDROM drives implement all levels of error reporting.  Even worse, some drives simply never report any error (ever).  Finally, it is up to whatever ripping software that you use to use the correct mode when performing DAE (to enable error reporting).

What this all means is that you can get bursts of errors when ripping your CDs that sound remarkably like clicks or skips that you used to get with records!

It doesn't even matter if your CDs appear in perfect condition (as almost all of mine were).  While fingerprints, scratches, peanut butter, etc. don't help, you can get errors on cosmetically perfect CDs.  Errors could be the result of the manufacturing of the CD or even age (also known as Disc Rot).  This may be a bit alarmist, but my advice is to rip your CDs now.... while you still can!

So, how do you get DAE right?

There are four steps:
  1. Use software that supports "secure mode" ripping - like EAC or dbpoweramp
  2. Properly configure your software for your drive (I'll show you how for EAC)
  3. Use a good quality CDROM that reports errors correctly (check reviews and/or this database)
  4. Use AccurateRip - (included with EAC and dbpoweramp).
Even if you do everything right (and your CDs are perfectly clean), it is still possible to get errors on your rip and not know it (frustrating, isn't it?).

This is why the AccurateRip database is so useful.  AccurateRip allows you to compare the checksums of your rips to others.  If your checksums match, chances are really high that you have a perfect rip.  Pretty brilliant solution if you ask me.
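
To make the idea concrete, here is a rough sketch in Python.  It is NOT AccurateRip's real checksum algorithm or database: I'm using a plain CRC32 as a stand-in, and the names `rip_checksum` and `matches_other_rips` are made up for illustration.  The point is only that a rip with even one bad byte stops agreeing with everyone else's.

```python
import zlib
from collections import Counter
from typing import Sequence


def rip_checksum(pcm_bytes: bytes) -> int:
    """Checksum of one track's raw audio bytes (CRC32 as a stand-in)."""
    return zlib.crc32(pcm_bytes) & 0xFFFFFFFF


def matches_other_rips(my_checksum: int,
                       other_checksums: Sequence[int],
                       min_matches: int = 2) -> bool:
    """True if at least `min_matches` other rips produced the same checksum."""
    return Counter(other_checksums)[my_checksum] >= min_matches


# Illustrative data only: a fake "track" and a copy with one corrupted byte.
good_track = bytes(range(256)) * 100
bad_track = bytearray(good_track)
bad_track[1234] ^= 0xFF  # simulate a single read error

# Pretend three other people ripped the same track cleanly.
others = [rip_checksum(good_track)] * 3

print(matches_other_rips(rip_checksum(good_track), others))        # True
print(matches_other_rips(rip_checksum(bytes(bad_track)), others))  # False
```

Requiring more than one matching rip is my own assumption here; it just guards against agreeing with a single other person who happened to make the same mistake.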

Ok, what about compression?

Although my original problem was with the DAE process, it is also worth talking about which compression to use.  I had spent quite a bit of time going through forums (e.g. Hydrogenaudio and others) trying to figure out the most transparent lossy encoder to use.  There are many choices:  MP3, AAC, Ogg Vorbis, WMA, etc.  Furthermore, you need to decide on the bit rate and other options that affect quality.  For example, if you are listening at home on a good stereo you might want a high bit rate (e.g. 200 kbit/s or more).  For portable devices you might want a lower rate so you can fit more albums (e.g. 128 kbit/s).

Of course, what this means is there is no one right answer for everyone.  Once disks got big enough it was clear to me that lossless was the only way to go.  A CD compressed using (lossless) FLAC usually takes up about 300MB.  Of course, the same CD encoded as MP3s might only take up 70MB or less.   Sounds like a big difference, but not in the age of Terabyte disks (not unless you have a collection the size of Sony Music).
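
If you want to run the numbers for your own collection, the arithmetic is trivial.  This little Python sketch uses the per-CD figures above; the 1000-CD collection is just a made-up example.

```python
FLAC_MB_PER_CD = 300  # typical lossless size quoted above
MP3_MB_PER_CD = 70    # typical MP3 size quoted above


def collection_size_gb(num_cds: int, mb_per_cd: int) -> float:
    """Total size in GB, using 1 GB = 1000 MB."""
    return num_cds * mb_per_cd / 1000


num_cds = 1000  # hypothetical collection size
flac_gb = collection_size_gb(num_cds, FLAC_MB_PER_CD)
mp3_gb = collection_size_gb(num_cds, MP3_MB_PER_CD)

print(f"FLAC: {flac_gb:.0f} GB, MP3: {mp3_gb:.0f} GB")  # FLAC: 300 GB, MP3: 70 GB
```

Even a thousand CDs stored losslessly fit comfortably on a single terabyte disk.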

So, skip the whole "which lossy format to use" debate, and use your favorite lossless format (there are a few).  I personally like FLAC because of its open license and relatively widespread support (for a non-vendor-specific format).  FLAC also has internal checksums that verify the integrity of the file (unlike MP3s).  If you ever have a damaged MP3 file you might not ever know about it.  FLAC will tell you if the file is damaged (and this actually happened to me with a bad disk of mine!).
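
Here is a minimal sketch of how you might check a whole collection, assuming the `flac` command-line tool is installed and on your PATH.  Its `-t` option decodes a file in test mode without writing any output; I'm assuming it exits with a non-zero status when a file fails.  The function names are mine, not part of any tool.

```python
import shutil
import subprocess
from pathlib import Path


def flac_test_command(path: Path) -> list:
    """Command line for `flac -t` (test mode: decode and verify, write nothing)."""
    return ["flac", "-t", str(path)]


def find_damaged(root: Path) -> list:
    """Return the .flac files under `root` that fail `flac -t`."""
    if shutil.which("flac") is None:
        raise RuntimeError("the flac command-line tool was not found on PATH")
    damaged = []
    for path in sorted(root.rglob("*.flac")):
        result = subprocess.run(flac_test_command(path), capture_output=True)
        if result.returncode != 0:  # assumption: non-zero exit means the test failed
            damaged.append(path)
    return damaged
```

Calling `find_damaged(Path("Music"))` (with your own music folder in place of the hypothetical `Music`) would walk the whole tree and hand back the files worth re-ripping or restoring from backup.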

Lossless music can be easily transcoded into another lossless format (without loss, obviously).   This makes your music collection "futureproof".  The same is not true of lossy formats.  If you transcode an MP3 into an AAC, you will always lose quality (sometimes badly).

Personally, I think of lossy music formats as disposable.  I always keep my originals stored losslessly and transcode on the fly for whatever application (and quality level) I might need.  I then delete them when I'm done.  This works for me as I don't need my entire collection on my iPhone at any one time.

Lossless compression preserves HDCD

HDCD (High Definition Compatible Digital) is a process used on some CDs that encodes extra information in the least significant bit of audio data.  This information improves the quality of audio, but requires a CD player capable of decoding these bits.  Also, certain DACs have this ability.  If you encode your CDs with any lossy format (MP3, etc), the HDCD information will be destroyed.
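
A toy Python illustration of why this matters.  This is not a real HDCD decoder or a real codec: zlib stands in for a lossless format like FLAC, and `crude_lossy` (which simply throws away the lowest bit) stands in for a lossy encoder.  The samples are made up.

```python
import struct
import zlib


def lsb_stream(samples):
    """Least significant bit of each 16-bit sample."""
    return [s & 1 for s in samples]


def lossless_round_trip(samples):
    """Pack to 16-bit PCM, compress with zlib (stand-in for FLAC), unpack again."""
    fmt = f"<{len(samples)}h"
    raw = struct.pack(fmt, *samples)
    restored = zlib.decompress(zlib.compress(raw))
    return list(struct.unpack(fmt, restored))


def crude_lossy(samples):
    """Stand-in for a lossy codec: discard the lowest bit of every sample."""
    return [(s >> 1) << 1 for s in samples]


# Made-up samples whose lowest bits carry a hidden pattern: 1, 0, 1, 1, 0, 1.
samples = [1001, -2000, 303, 405, -506, 7]
hidden = lsb_stream(samples)

print(lsb_stream(lossless_round_trip(samples)) == hidden)  # True
print(lsb_stream(crude_lossy(samples)) == hidden)          # False
```

A real lossy encoder changes the samples in far more complicated ways, but the outcome for the hidden bits is the same: they don't survive.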

HDCD is not uncommon and while there is a HDCD logo on some CD cases, some CDs are not marked even though they use it.  I have almost a hundred CDs that are encoded with HDCD.

Even if you don't have a DAC that can decode HDCD, you can use software decoders that perform most of the HDCD decoding.  I'll show you how you can do this on-the-fly with SqueezeCenter or when you transcode.  I actually use the same method for dealing with de-emphasis (see below) as well.  This will allow you to get as much quality out of your CDs as possible.

Okay, we are done, just use FLAC right?

Almost, and for most people the answer is yes.  For the remaining purists, there is another format variant that we should consider.   The format is called a Cue Sheet.

Just because there is no loss in the audio data doesn't mean there is no loss (remember my lesson above).  CDs contain more than just audio data.  Specifically, there is track information in the form of indexes within a track as well as flags associated with each track.

Index marks represent the equivalent of tracks within tracks.  They are used on some CDs to mark places you can seek to within a track.  I've seen these used in classical music or other CDs that have particularly long tracks.  Yes, they are rare.  I don't know of any computer player that supports them.  There is also an index mark before each track that represents the pre-gap.  This is the bit of silence before each track starts.  When you break a CD into tracks, this gap usually ends up attached to a neighboring track (EAC's default, for example, appends it to the end of the previous track), so its original position is lost (a lossy process).  If you ever want to recreate your original CD, you will want to preserve the index information.  A cue sheet is the only way that I know to do this.

Also, there are flags associated with each track.  The only one of interest is called PRE for pre-emphasis.  In the early days of CDs, emphasis was applied to some recordings that boosted their treble.  CDs with pre-emphasis are rare, but do exist.  CD players are required to recognize this emphasis and apply de-emphasis (the opposite process) so that the recording sounds normal.  If your ripping/playback process doesn't take this into account, the CD will sound shrill.  Again, cue sheets provide a way to record this flag.  However, to preserve the most quality, I defer the de-emphasis step until later.

Note:  In the picture above, Max Roach's CD "We Insist!" has emphasis applied!

So, what is a cue sheet?

Cue sheets are simply text files that represent all non-audio information about the CD.  They take the place of tags.  Associated with a Cue sheet is a single audio file (without tags).   The audio file represents the CD and is not split into tracks.   This wiki article goes into detail and provides an example.
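
To give a feel for the format, here is a small made-up cue sheet and a minimal Python sketch that reads it.  The artist, album, and timings are invented, and `parse_cue` is only a toy that handles the handful of commands shown (real cue sheets have more).  Timestamps are minutes:seconds:frames, with 75 frames per second.

```python
import shlex

FRAMES_PER_SECOND = 75  # cue sheet timestamps are mm:ss:ff, 75 frames per second

EXAMPLE_CUE = '''
PERFORMER "Example Artist"
TITLE "Example Album"
FILE "Example Album.flac" WAVE
  TRACK 01 AUDIO
    TITLE "First Track"
    FLAGS PRE
    INDEX 01 00:00:00
  TRACK 02 AUDIO
    TITLE "Second Track"
    INDEX 00 04:58:50
    INDEX 01 05:00:00
'''


def to_seconds(timestamp: str) -> float:
    """Convert mm:ss:ff to seconds."""
    minutes, seconds, frames = (int(part) for part in timestamp.split(":"))
    return minutes * 60 + seconds + frames / FRAMES_PER_SECOND


def parse_cue(text: str) -> dict:
    """Parse a few common cue sheet commands into a nested dict."""
    album = {"tracks": []}
    current = album  # fields belong to the album until the first TRACK line
    for line in text.splitlines():
        words = shlex.split(line)  # keeps quoted strings together
        if not words:
            continue
        command, args = words[0].upper(), words[1:]
        if command == "TRACK":
            current = {"number": int(args[0]), "flags": [], "indexes": {}}
            album["tracks"].append(current)
        elif command in ("TITLE", "PERFORMER"):
            current[command.lower()] = args[0]
        elif command == "FILE":
            album["file"] = args[0]
        elif command == "FLAGS":
            current["flags"] = args
        elif command == "INDEX":
            current["indexes"][int(args[0])] = to_seconds(args[1])
    return album


album = parse_cue(EXAMPLE_CUE)
for track in album["tracks"]:
    print(track["number"], track["title"], track["flags"], track["indexes"])
```

In this example, track 1 carries the PRE (pre-emphasis) flag, and track 2 has an INDEX 00 entry marking where its pre-gap begins, a little before the track proper starts at INDEX 01.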

So, cue sheets (with the audio data compressed with FLAC), represent IMO the best way to archive your CDs.   This allows you to recreate your original CD (exactly, if you so choose).

Furthermore, Cue/flac pairs are playable directly in several common players.  However, many players will not recognize them, so they aren't nearly as convenient as other formats.

There is one other really nice feature of Cue Sheets.  Since they are simple text files, you don't need a tag editor to correct/update any of the album information.  Use Notepad or any other text editor.  Also, it is really nice to have an entire album's tag information in one place rather than spread across 10 or more tracks (less error-prone if you ask me).

So, you have a choice:
  1. Use FLAC and split your CDs into tracks (most convenient & popular)
  2. Use Cue sheets and FLAC (best preservation and quality, while being still usable)
I choose cue/flac.  Besides, I can always convert from a Cue Sheet to individual tracks if I ever need/want to.

Organizing your collection!

Finally, once you're ready to rip your entire music collection, you should probably spend some time deciding how you want it organized.

Obviously, you should make sure your music is appropriately tagged (artist name, album name, track name, genre).  You should probably consider what to do if you have a compilation album (make sure this is supported by whatever software/method you use).

There is also the issue of which music genre to use.  This is highly personal and sometimes totally arbitrary.  Is the album considered "Pop", "Rock" or "Modern Rock"?  Since these debates are a bit silly, I chose to use very general labels and don't classify my music too finely.   Anything close to Rock is under the Rock Genre.  Sure it would be great to have more categories, but the software I use right now doesn't support multiple Genres.

Also, consider how the music is stored on your hard disk.  I like to use a simple hierarchy that makes it easy to find my music in a file browser.  At the top level, I use Genre.  Under each genre are artists and under each artist are their albums.  It is simple, but not perfect.  Regardless of what method you use, your playback software should have a database that allows you to find your music in just about any way you wish (which is why tags are so important).
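
As a small sketch of that layout in Python (the `Music` root folder and the function names are just for illustration; `safe_name` replaces characters that Windows doesn't allow in file names):

```python
from pathlib import Path

ILLEGAL_CHARS = '\\/:*?"<>|'  # characters Windows does not allow in file names


def safe_name(name: str) -> str:
    """Replace characters that are illegal in Windows file names."""
    return "".join("_" if ch in ILLEGAL_CHARS else ch for ch in name).strip()


def album_dir(root: Path, genre: str, artist: str, album: str) -> Path:
    """Build the Genre/Artist/Album folder path under the music root."""
    return root / safe_name(genre) / safe_name(artist) / safe_name(album)


path = album_dir(Path("Music"), "Jazz", "Max Roach", "We Insist!")
print(path.as_posix())  # Music/Jazz/Max Roach/We Insist!
```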

Lastly, don't forget to add album art.  Nothing adds that last bit of polish to your music browsing experience like seeing the album art.  I personally use Google image search and do this manually.  Automated tools exist to do this, but I don't see how they could possibly get it right, so I do it myself.

Hey, aren't CDs obsolete?

Why not buy music directly from the Internet (iTunes, etc.)?  While it is true that downloaded music is more convenient (no ripping needed), you are often getting an inferior product (and for more money!).

Unfortunately, the downloadable audio formats that are (mostly) being offered today are of lower quality than CDs.  There are high resolution downloadable files available, but they are exceedingly rare (and likely to stay that way).  Also, there is an (annoying, IMO) trend to offer music by the track instead of by the album.  Call me old school, but I like it when artists put out complete works that go together.  I'm mostly not interested in hits from one-hit wonders anyway.

Since CDs contain uncompressed audio they are really flexible.  You can transcode them into any format you like with ease.  For example, I have an iPhone that I like to listen to music on.  For the iPhone, iPod touch, and iPod products the AAC format works really well at reasonably low bit rates.  This means I can fit more music on my iPhone at higher quality with AAC files than with MP3s.  With CDs, this is no problem: I just transcode my files to AAC whenever I want.  If I had bought my music as MP3s, converting them to anything else just wouldn't be a good idea, since they are already in a lossy format.  From a purist's point of view, you are stuck if you choose a lossy format.

Also, who knows what new format will become popular 10 or 15 years from now?  Perhaps Google or some new startup will come up with a killer format that is half the size of today's AACs.  With my CDs, I never have to worry.  I also appreciate the fact that there is no vendor lock-in.

Finally, much of the music that I like (Jazz, Blues, etc) has taken years to finally get released on CD.  A lot of it still hasn't been released (you might even be lucky to find it on records).  It really isn't surprising that music companies are only interested in releasing things that sell millions of copies.   They are businesses after all.

Given the choice, I'll be buying CDs as long as I can find them.  If I can't find them new, I'll be searching for used copies.   It won't be long until CDs are gone and I'll take advantage of the opportunity while I can.

Why not records then?  I used to have lots of them, but they are just too fussy.  I know I can digitize them, but that is a whole new level of challenge.  Even obsessive compulsive people have their limits!  Besides, I really didn't have many records that were worth the effort.  However, I know many other people do.

Congratulations, you made it!

Hopefully the above gives you an idea of why I made certain choices when setting up my audio system.  The remaining posts in this guide will just cover the details of getting things done.

Welcome to my world :)

5 comments:

  1. I don't agree about genre at the top level. I actually blogged specifically on this topic: Don't organise music files by genre

    Other than that, nice post and nice blog btw, when I get the chance I'll link back to you.

  2. Lots of very useful info here. Thanks! More on music management software, file structure and tagging would be most helpful.

  3. Well written and well organised.
    Thank you.

  4. EAC will now help you locate album art but leaves it in the .wav directory. I often forget to move it and then have to go out and find a new copy as I discover folder.jpg has been overwritten by the album art for the next rip. If I knew how to write PERL script I would add a routine to move folder.jpg to the album directory... (please?)

  5. @Anonymous: Thanks for the tip about the new EAC version. What a great feature! I will definitely enjoy using EAC even more now.

    I've updated my wav-2-cue-flac.pl script to check for an image named "folder" and move it into the appropriate directory. The link in my blog post has already been updated. Just download a new copy.


Copyright (c) 2010 CuttingTheBills.com. All Rights Reserved.