Encoding a CD Collection to MP3

by Michael Schmidt - Oct 15, 2003

A friend asked me how I converted my music collection to MP3. What at first seems like a pretty obvious question turns into a nightmare once you try to squeeze some wisdom out of Google. Encoding MP3s is a science. There are probably a gazillion different opinions out there on how to do it right. Half of them are just plain wrong, totally outdated or some urban legend that originated in the marketing propaganda of the early encoder manufacturers. The other half is usable information but you have to piece it together from here and there. Also there is not really just one way to do it right. It depends a lot on your personal preference, your favorite music genre and the way you want to handle your digital music archive. In this article I will explain how I digitized my CD collection.

The Scenario

What we’re looking for is a way to store somebody’s 250-CD collection on a regular PC. We’re not audiophiles but the playback should be transparent. In other words we don’t want to hear a difference between CD and MP3. We might also want to carry our music around, be it on a portable MP3 player or on convenient MP3 CDs for easy playback in the car. Another requirement is that we want to be able to play our collection or parts of it in shuffle mode. Shouldn’t be a problem but it is… This seems to me like a pretty likely scenario.

The Art of Media Warfare

Before I start I’d like to emphasize that I don’t really know if it is legal for you to digitize CDs even though you own them. If you ask me it should be but who am I to decide. This very much depends on where you live. If you’re concerned that some music industry inquisitors might bust your ass you better check up on your local laws and be a subservient consumer.

If you’re living in the US you might have heard of the 1984 Betamax decision in which the Supreme Court established your right to make copies of legally purchased copyrighted material for the purpose of “fair use”. This is commonly understood to include personal backup copies or multiple copies for different media devices. So there shouldn’t be a problem, right? Wrong! In 1998 the DMCA was enacted with the unselfish support of the recording industry, motion picture studios and book publishers. It makes it illegal to circumvent copy protection on digital material, whatever the purpose. So for a copy-protected CD your only perfectly legal way to get MP3s is to buy the music again in MP3. There might be some hope in Boucher’s Fair Use Rights Bill which might get passed in the next decade or so if you’re lucky.

Also in Germany a new law was recently introduced (a European follow-up to the DMCA. Thanks again.) and now it is illegal to circumvent effective copy protections. This is of course a paradox since a copy protection mechanism that can be circumvented isn’t effective by definition. But that’s a story for another day… As a result a lot of new music CDs now come with copy protection. You might want to stay away from those wannabe “CDs”.

If you ask me the record industry can’t and won’t enforce those laws for personal “fair use”. Actually, it would be quite easy to spot the copyright criminals. Just look for those white iPod headphones. But imagine the public uproar if they start to confiscate MP3 players…

Codec Choices

After this short excursion into modern methods of alienating your customers, let’s decide which codec we want to use for our digital music library. There is a whole bunch of audio codecs out there. They can be categorized as lossy and lossless codecs. All of the popular codecs are lossy codecs. Their advantage is that they offer much better compression than lossless codecs. The drawback is that they discard information which might be heard or felt. So audiophile users tend to build their music library with lossless codecs or even uncompressed in WAV format. The downside to this approach is the poor file-size reduction (about 2:1) compared to lossy codecs (about 10:1). For 250 CDs you might need 12GB with a lossy codec compared to 60GB with a lossless codec. With today’s hard disk sizes this might still be an option for you if you don’t need to conveniently carry and copy your music to a portable medium.
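The 12GB and 60GB figures can be checked with a little arithmetic. Here is a rough Python sketch; the average album length of 47 minutes is my own assumption, picked only to make the numbers concrete, and the compression ratios are the approximate ones quoted above.

```python
# Back-of-the-envelope storage estimate for a CD collection.
# ASSUMPTION: 47 minutes is a hypothetical average album length.

# CD audio: 44,100 samples per second, 2 bytes per sample, 2 channels.
CD_BYTES_PER_SECOND = 44_100 * 2 * 2


def collection_size_gb(num_cds, avg_minutes, compression_ratio):
    """Return the estimated size in GB (10**9 bytes) after compression."""
    raw_bytes = num_cds * avg_minutes * 60 * CD_BYTES_PER_SECOND
    return raw_bytes / compression_ratio / 1e9


if __name__ == "__main__":
    for label, ratio in (("uncompressed", 1), ("lossless", 2), ("lossy", 10)):
        size = collection_size_gb(250, 47, ratio)
        print(f"{label:>12}: {size:6.1f} GB")
```

Real albums and real codecs vary quite a bit, so treat the result as an order of magnitude and not as a promise.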

Obviously I already decided that it’s going to be MP3. My reasons for that are mainly the wide support for MP3, sufficient compression ratios and playback quality.

Another interesting codec would be Ogg Vorbis. Compared to MP3 it offers better compression ratios at the same quality. But it isn’t nearly as widely supported as MP3 and as of today there are only a few portable Ogg players. That might change in the future but for now it just doesn’t really meet my requirements.

Bitrate? Quality?

There’s a huge debate, no actually it’s a war, going on over whether CD-quality can be achieved with MP3 regardless of bitrate settings. What it boils down to is that you can’t get CD-quality out of MP3. But - 99.9% of all people cannot hear a difference between a CD and a properly encoded MP3.

What you will most commonly see and what is even advertised in some programs as CD-quality is the 128kbit/s setting. This is by no means CD quality. A careful listener with halfway decent equipment won’t have any problems telling CD and MP3 apart.

Now, what is a properly encoded MP3? First of all you have to use the right encoder. To make a long story short just use LAME as encoder with Alt Preset Standard settings. It is an open source encoder which can be tweaked with literally a million command-line switches. Fortunately you don’t have to know anything about that because there’s also a set of recommended presets. These presets were tediously tuned by audio freaks:

  • --alt-preset standard (~190 kbit/s, typical 180 … 220)
  • --alt-preset fast standard (~190 kbit/s, faster but potentially lower quality)
  • --alt-preset extreme (~250 kbit/s, typical 220 … 270)
  • --alt-preset fast extreme (~250 kbit/s, faster but potentially lower quality)
  • --alt-preset insane (320 kbit/s CBR, highest possible quality)
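If you ever call LAME by hand or from a script, the presets above are simply passed on the command line in front of the input and output files. The following Python sketch only builds that command line and runs nothing. The executable name `lame` and the file names are placeholders of mine, and newer LAME releases may spell the preset options differently, so check the help output of your version.

```python
# Preset names taken from the list above.
PRESETS = {"standard", "fast standard", "extreme", "fast extreme", "insane"}


def lame_command(wav_path, mp3_path, preset="standard", lame_exe="lame"):
    """Build the argument list for one encode; nothing is executed here."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    # A two-word preset such as "fast standard" becomes two arguments.
    return [lame_exe, "--alt-preset", *preset.split(), wav_path, mp3_path]


if __name__ == "__main__":
    cmd = lame_command("track01.wav", "track01.mp3")
    print(" ".join(cmd))
    # To actually encode (requires LAME on your PATH):
    #   import subprocess
    #   subprocess.run(cmd, check=True)
```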

With the standard setting the encoded MP3s will sound transparent to 99% of all people. That’s also what I chose for my collection. Of course I played with the idea to use a higher setting but I decided that I won’t ever notice the difference anyway. To hear a difference in that league you have to open yet another can of worms and think about high quality and expensive audio equipment. And people with audio equipment more expensive than a car won’t use MP3s anyway.

The standard setting also uses a variable bit rate technique (VBR) which will adjust the compression ratio to the complexity of the music. So complex sections of music are encoded at higher bit rates while simpler sections are given a lower bit rate. This will usually average to a bit rate of 192kbit/s. The quality will be much better than an MP3 encoded with a constant bit rate (CBR) of 192kbit/s though.
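To make the VBR idea a bit more tangible, here is a toy Python calculation. The three sections and their bit rates are invented for illustration; a real encoder picks a bit rate for every single frame, not for minute-long stretches.

```python
# Each tuple is (duration in seconds, bit rate in kbit/s) for one stretch
# of a song. ASSUMPTION: the numbers are made up for this example.
VBR_SECTIONS = [(60, 128), (120, 192), (60, 256)]


def average_kbps(sections):
    """Average bit rate over the whole song, weighted by duration."""
    total_kbit = sum(seconds * kbps for seconds, kbps in sections)
    total_seconds = sum(seconds for seconds, _ in sections)
    return total_kbit / total_seconds


def size_mb(sections):
    """Approximate audio size in MB (10**6 bytes), ignoring tags."""
    total_kbit = sum(seconds * kbps for seconds, kbps in sections)
    return total_kbit * 1000 / 8 / 1e6


if __name__ == "__main__":
    print(f"average: {average_kbps(VBR_SECTIONS):.0f} kbit/s")
    print(f"size:    {size_mb(VBR_SECTIONS):.2f} MB")
```

The file ends up about as big as a CBR file with the same average, but the bits are spent where the music needs them.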

It also uses a technique called Joint Stereo to further reduce the file size. Common belief is that Joint Stereo sounds fishy and should be avoided. What’s up with that? This might have been true for early Fraunhofer and Xing encoders which were known for their bad joint stereo implementation. Yet it’s definitely false for the LAME encoder. Instead of storing the left and right channel separately, joint stereo stores the sum of both channels (the “mid” signal) and the difference between them (the “side” signal). Since the left and right channel usually differ only slightly, the difference signal is small and takes few bits, reducing the overall size of the MP3 file. During playback both stereo channels will be fully reconstructed. This transformation is mathematically lossless and any perceived difference is a construct of imagination.
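The sum-and-difference trick behind joint stereo is easy to try out. This Python sketch converts a few made-up left/right sample values into a mid (average) and side (half difference) signal and back again. It shows only the transformation itself; what the encoder does with the two signals afterwards is a different story and not covered here.

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def to_left_right(mid, side):
    """Invert to_mid_side: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # ASSUMPTION: sample values are invented; the channels differ only slightly.
    left = [1000, -200, 30]
    right = [998, -205, 30]
    mid, side = to_mid_side(left, right)
    print("mid: ", mid)
    print("side:", side)
    print("round trip ok:", to_left_right(mid, side) == (left, right))
```

Note how small the side values are compared to the mid values. That is where the savings come from.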

Naming Schemes, Organization and ID3-tags

A quick scan of my collection revealed that there are on average 12 to 13 songs on each album. That will result in over 3000 songs for 250 CDs. Since each song will be encoded to a single file we’ll end up with quite a bunch of files. After the encoding is done we better be able to tell all those files apart. So let’s spend some thought on how to organize all those files.

Usually MP3 files get accessed in two ways. You have to handle them manually to organize the files on your hard disk, get them into your (mobile) audio player or copy them to CD. Besides that, MP3s get accessed by audio applications which need to display information about your collection or the currently playing song. Each song can be classified by e.g. title, artist, album, year, genre, track number and much more. This information needs to be stored in both machine and human readable format. Obviously the filename and the directory structure can hold this info. But the developer of an audio application cannot rely on the users to adhere to specific rules when naming their MP3s. So there needs to be another way to store this info for audio applications to read. For this purpose the MP3 format can hold metadata in ID3 format.

I tried a few different approaches and came up with the following naming scheme. This is also supported by some applications and can often be found as the default setting.

\Artist\Album\Artist - Album - Track - Title.mp3

My MP3 folder contains one folder for each artist. Within the artist folder are the album folders which contain the actual MP3s. With that naming scheme I gain maximum flexibility. I can conveniently browse through the artists or get all albums of one artist. The redundant filename makes sure that I keep all information about a song even if it gets detached from the directory structure.
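For the scripting-minded, the naming scheme boils down to a few lines of Python. The root folder and the two-digit track number are my own choices for this sketch (zero padding keeps the files in album order when sorted by name), and the function does not deal with characters that Windows forbids in file names.

```python
from pathlib import PureWindowsPath


def archive_path(root, artist, album, track, title):
    """Return root/Artist/Album/'Artist - Album - Track - Title.mp3'."""
    filename = f"{artist} - {album} - {track:02d} - {title}.mp3"
    return PureWindowsPath(root) / artist / album / filename


if __name__ == "__main__":
    # ASSUMPTION: "D:\MP3" and the song details are placeholders.
    print(archive_path("D:\\MP3", "Artist", "Album", 3, "Title"))
```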

The ID3-tags usually get embedded automatically. So there’s not much to worry about. You don’t even have to enter all that info yourself. There’s an online database called freedb which holds info about virtually every audio CD out there. There is a small problem though. There are two different versions of ID3-tags.

  • ID3v1 is the older version and is supported by every player. The problem is that it only holds the data I mentioned above. Usually that won’t be much of a problem but each field can only store 30 characters. While this is usually enough for the artist and album name, song titles will quite often get truncated.
  • ID3v2 tries to solve this problem. It can now hold a whole plethora of information including a lot of fields, synched lyrics, pictures and what have you. Unfortunately it got a little over-engineered and so complete and up-to-date implementations are a bit rare. Most players will read the common fields though.

So I usually just save the same data in both formats. This way audio applications which understand ID3v2 will get the whole info while older players will display something useful which might occasionally be truncated.
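To see why ID3v1 truncates, it helps to look at its layout: a fixed block of 128 bytes at the very end of the file, with 30 bytes each for title, artist, album and comment. The Python sketch below packs such a block. The Latin-1 encoding and the genre value 255 (commonly used for “no genre”) are assumptions of mine, and the code only builds the bytes, it does not touch any file.

```python
import struct


def build_id3v1(title, artist, album, year, comment="", genre=255):
    """Pack a 128-byte ID3v1 tag; text fields are cut to their fixed widths."""

    def field(text, width):
        # Anything beyond the field width is simply dropped.
        return text.encode("latin-1", errors="replace")[:width]

    # The "Ns" formats pad short values with NUL bytes up to N bytes.
    return struct.pack(
        "<3s30s30s30s4s30sB",
        b"TAG",
        field(title, 30),
        field(artist, 30),
        field(album, 30),
        field(str(year), 4),
        field(comment, 30),
        genre,
    )


if __name__ == "__main__":
    tag = build_id3v1("A Song Title That Is Far Too Long For The Field",
                      "Artist", "Album", 2003)
    print(len(tag), "bytes")
    print("stored title:", tag[3:33].decode("latin-1"))
```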

Normalization and mp3gain


In the past few years record labels started to embrace a new trend - Louder is Better. For a CD to sound professional, to stand out (against the mainstream crap) it has to be loud. I don’t want to explain why this is happening but if you’re interested read what Rip Rowan has to say in his article Over The Limit. Now the problem with that is that every other album has a different volume and you have to constantly fiddle with the volume controls. Back in the old days you could adjust the volume while switching CDs but now you probably want to enqueue days of music in your playlist and the constant volume changes soon get annoying. This gets even more obvious if you play your music on shuffle. Every other song will have a different volume.

The solution to this problem is a tool called mp3gain. It utilizes an algorithm called ReplayGain which calculates how loud the file will actually sound to a human ear. The difference from many other normalization tools is that the MP3 doesn’t have to be re-encoded. In other words: there will be no quality loss. The required replay gain is then stored directly within the MP3 and can be reverted or changed again later.

There are two modes of operation: Track and Album. The first one will analyse each track individually and make all songs sound equally loud. This is what is used for an unrelated mix of songs or if you only want to play your collection in shuffle mode. Album mode allows you to correct the volume for each track but keep the relative differences in volume for the whole album.

I recommend you use album mode for ripped audio CDs all the time. I noticed that even in shuffle mode the volume differences between songs are subtle. You might hear it occasionally but you also keep the possibility of hearing the intended differences when listening to whole albums.
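The difference between the two modes is easiest to see with numbers. In this Python sketch the loudness values are invented, and the album loudness is simply passed in as a given number; how ReplayGain actually measures loudness is beyond the scope of this example. The target of 89 dB is the value I use with mp3gain.

```python
TARGET_DB = 89.0


def track_gains(track_loudness_db, target_db=TARGET_DB):
    """Track mode: every track gets its own gain and ends up at the target."""
    return [target_db - loudness for loudness in track_loudness_db]


def album_gains(track_loudness_db, album_loudness_db, target_db=TARGET_DB):
    """Album mode: one gain for all tracks, so relative differences survive."""
    gain = target_db - album_loudness_db
    return [gain] * len(track_loudness_db)


if __name__ == "__main__":
    # ASSUMPTION: hypothetical measurements for a three-track album.
    loudness = [83.0, 92.0, 84.0]
    album_loudness = 90.0
    for name, gains in (("track", track_gains(loudness)),
                        ("album", album_gains(loudness, album_loudness))):
        result = [l + g for l, g in zip(loudness, gains)]
        print(f"{name} mode: gains {gains} -> loudness {result}")
```

In track mode all three songs land on 89 dB. In album mode the whole album is shifted by the same amount and the loud song stays louder than its neighbours.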

Workflow Explained


In this last section I will guide you through the actual process of ripping a CD. I will introduce you to the applications and tools I found to be most useful. All of the recommended tools are open source, freeware or in the worst case postcard-ware. Note that some of them are only available for Windows.

  1. Ripping

    I had to try a lot of different audio ripping software before I found Exact Audio Copy (EAC). This ripper is close to perfect in my book. Though it worked flawlessly for me I heard that it’s sometimes tricky to install. There is a very useful FAQ and a very responsive online community which will surely get you through any problems that might occur.

    Once you’ve gone through the setup wizard and configured the filename naming scheme (if you want to use my proposed format insert: %A - %C - %N - %T) extracting a CD is as easy as putting it in your drive, getting information from freedb and hitting the “CD to Wav”-icon. If you want to convert a whole bunch of CDs I recommend that you extract them to wav first before you start the encoding. That way you only have to attend the ripping process while the encoding can take place later when the operator is in sleep mode…

    I also recommend that you test a CD with each of your CD-drives and listen to the result before you start the endeavor of ripping your whole collection. This way you will be able to choose your fastest and most reliable drive.

  2. Encoding

    As mentioned above I recommend using LAME as encoder. It integrates nicely with EAC and can be activated with a single click. There’s a small problem though. The official LAME website only offers source code. Precompiled binaries are very easy to find on other sites though. The EAC configuration wizard will tell you where to find LAME and how to install it. Set the quality settings to high and you’re ready to go.

  3. Normalization

    Download and install mp3gain. Point it to the folders where the newly encoded MP3s reside and start the album analysis. Make sure that every album has its own subfolder. After analyzing the data hit Album Gain using a target volume of 89 dB.

  4. Tags

    EAC can be set up to write the ID3-Tags for you. If you are satisfied with the results you can skip this step. Otherwise check out ID3-TagIt. It’s an application for editing, adding, or deleting ID3-Tags in MP3 files. There are two versions available and if you don’t have Microsoft’s .NET Framework installed you might just as well use the older version.

    Just to make sure I usually delete all tags first. Then I rebuild both ID3v1 and ID3v2 tags from the filenames. I do this because after the whole ripping and encoding process there’s a lot of unnecessary clutter in the tags. ID3-TagIt is also very helpful in correcting spelling errors or inconsistencies.

  5. Playback

    I swear by Winamp 2.9x for playback but this is a matter of preference. There’s also a whole bunch of other players like Foobar2000 and your new MP3s should work with all of them. If you combine Winamp with a plugin called Albumlist you have to look really hard to find another player which even comes close to this combo in terms of elegance and ease of use (on Windows systems at least).
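Rebuilding tags from filenames, as described in step 4, is something a tag editor does for you. Just to show what happens under the hood, here is a Python sketch that splits a filename in my naming scheme back into its fields. It is not how ID3-TagIt works internally (I have no idea about that), and the example path is a placeholder.

```python
from pathlib import PureWindowsPath


def tags_from_filename(path):
    """Split 'Artist - Album - Track - Title.mp3' into a dict of fields."""
    stem = PureWindowsPath(path).stem  # file name without folder and ".mp3"
    # At most 3 splits, so a title that itself contains " - " stays intact.
    parts = stem.split(" - ", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected file name: {stem!r}")
    artist, album, track, title = parts
    # int() raises ValueError if the third part is not a track number,
    # e.g. because the artist or album name contains " - ".
    return {"artist": artist, "album": album,
            "track": int(track), "title": title}


if __name__ == "__main__":
    print(tags_from_filename(
        "D:\\MP3\\Artist\\Album\\Artist - Album - 03 - Some Song - Live.mp3"))
```

The weak spot is obvious: an artist or album name that contains " - " confuses the split, so check the results before you overwrite any tags.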

Lead Out

I hope this article was informative and helpful. I am aware that the workflow section only scratches the surface and isn’t really a step by step instruction. But the background information and the tools I recommended should point you in the right direction. If you still have specific problems I urge you to consult the Hydrogen Audio Forums which are an endless resource for all your digital audio questions.

If you have any remarks, additional insights or corrections please let me know or leave a comment.

Michael is a geek. He's currently freezing in Berlin, Germany.

Comments

#1 | 09:27 on 19 October 2003 | by NoOneElse

I like this article, in particular because it mirrors my preferences :-) Personally, I use the Ogg Vorbis codec, as I only use PCs for playing my music. Furthermore, it gives me smaller files at the same quality (not that I should be concerned about quality with my cheap earphones). Thus, I was able to just squeeze all of my CDs onto one DVD and stay under the baggage weight limit on my way to Oz. When space is really tight, the lowest quality setting is still perfectly acceptable for my live-music tortured ears (in contrast to MP3 where 64Kb/s makes you weep).

All the steps in my conversion process are done by one program (grip, which supports a variety of output formats) on Linux, so there is really no loss in convenience compared to Windows. The mother of all Linux players, XMMS, is very similar to WinAmp and together with Linux scripting allows you to do cool coding stunts for websites etc.

#2 | 03:34 on 20 October 2003 | by NoOneElse

Some more musings on the naming scheme:

CD-ROM and DVD file systems often have problems with long path names (>70 characters or so). So if you want to archive your songs on this kind of media, here are some alternatives:
* Make sure that you use tags, only they can persistently capture the complete information about a file.
* Use the scheme // - .mp3, it has all the info but does not duplicate the artist and album names.
* If that is still not short enough, go with - .mp3 or anything else that does not produce name clashes. Both WinAmp and XMMS can be configured to sort the playlist by the tag contents rather than the file name. And if you have full tag info in each file, a good tag editor allows you to set up a pattern and rename all your files according to that pattern based on the tag contents.

Another problematic aspect of CD/DVD file systems is non-ASCII (i.e. non-English-alphabet) characters (good bye, Sigur Rós tracks). Thus, some CD rippers and tag editors support stripping of such characters - again no big deal, as long as they are preserved in the tags.

Sometimes it is useful to transfer audio files via HTTP (most audio players support playback from HTTP sources). Here, naming strikes again. Non-ASCII characters: bad. Blanks: a pain. I hate it to replace blanks in an audio file name with underscores, as supported by many tools. In this particular scenario, however, it comes in handy.

An important conclusion of all this is to really keep as much and accurate data in your audio files as possible. Then you can quite painlessly convert your files to the naming scheme of the day without losing information.

#3 | 19:15 on 25 January 2004 | by David Andreasen

This article is exactly what I need and it's very well written. I would have paid for it. Maybe you should put a PayPal link for donations.

When I'm doing the normalization and I point mp3gain to the folders, can I just point it to the main music folder or do I have to analyze each album folder individually?

And I don't really understand the difference between analyzing by album and analyzing by track. Don't they both use a peak level of 89db?

#4 | 19:51 on 25 January 2004 | by Mikey

David,

you can do the normalization on your main music folder, as long as your albums are in separate subfolders.

The analyze track feature will normalize each track to 89db. So all tracks will have the same volume. Album normalization however will keep the differences in volume between the tracks of the album.

Imagine a classical CD with very quiet songs. But for effect one song is louder than all the other songs. If you analyze by track all songs will become the same volume. The wakeup effect will be gone now if you listen to the whole album.

If you do album analysis the loud song will become 89db but the other songs will still be quiet at maybe 80db. Listening to the songs individually is a little annoying though because you have to fiddle with the volume to make the quiet songs louder.

All in all I would still recommend to do the album analysis. Usually the volume differences can only be heard in track shuffle mode and then only in very extreme cases...

Mikey

#5 | 09:28 on 13 February 2004 | by Daniel Bond

Thank you, this is beautifully written. I've struggled to find concise information on this topic. You are right about the propaganda. A low point in my life was installing the bloated software suite that came with my Creative SB Live! card, complete with adverts in the ripping program. It was awful in every other respect as well.

I own about 200 CDs and I'm too lazy to encode them, but eventually I will. (Celeron 850MHz, might take a while) Recognizing the time required to encode them tends to make one hesitant to commit to a specific path. I don't want a jumble of formats, plus I have a portable mp3-only player (don't use it too much). For the dozen or so CDs I've ripped so far, I had settled on CDex. I'm thinking of switching to EAC. Both are free. Thank you for the information on volume normalization.

Lately, I've become hooked on Foobar2000 v0.8b7. It really is an example of software done right. The program requires some effort (because it's so simple) to configure, but once over that hurdle, it's really fine. Instructions are minimal to nonexistent, but there is a lively forum where one can glean useful tidbits. I use it to play mp3, CDs, and internet radio. I think it can be set up to rip/encode, but I haven't tried doing that.

Enjoyed the article a lot.

PS: Michael, I would love to see a "braindump" about Internet Radio. I am using shoutcast information obtained in Winamp to set up playlist files for Foobar. It's a little kludgy but maybe it's just my lack of knowledge.

I don't understand the different formats (are they proprietary?) etc.. One thing that's apparent to me is the agenda-driven nature of the various streaming media pimps. I mean, just obtaining a simple spreadsheet-type list of IP addresses for radio stations is a major pain in the neck. I shouldn't gripe; I accept that everybody's trying to make a living.

#6 | 09:42 on 13 February 2004 | by Daniel Bond

Oh, yeah, I forgot to mention. A great program for applying tags to ripped files is called Moosic. It places your list of files in a spreadsheet-like grid and you can manipulate tags based on file names, apply an album name all at once, et cetera. Very visual, very fast. I must have spreadsheets on the brain!

Goofy name, silly looking shell, but a great program.

#7 | 06:47 on 21 March 2004 | by Ed Fisher

Good Article.

On Normalization, I think I saw in EAC that you could do this as you were encoding? I assume you recommend not to do that, and instead MP3Gain after the rip\encode process is done?

Also, do you listen to these MP3's only on PC and Handheld? Or have you linked to stereo, and if so, with what and how?

#8 | 04:39 on 27 May 2004 | by MrHappyGoLucky

Personally, I rarely rip CDs to Mp3, but when I do I like to use EAC and Lame Mp3 encoder. For ID3-Tag editing, I have not been able to find anything better than Mp3Tag, although I am interested in Moosic mentioned above.

I have (had?) an article from Maximum PC on optimal settings for EAC and if I can locate it, I'll give everyone the gist of it on here. Thinking back however, I may have thrown the mag away.

#9 | 22:59 on 23 September 2004 | by cajhin

Thanks for the very well written article - similar to what I've gone through.

@ Ed F.: Normalization and MP3Gain are similar but not the same (N. makes it technically the same volume, MP3Gain makes it _sound_ the same volume; there's a difference).
When you want to keep loudness variations within one album, you need to know what the loudest song is on the CD. For that, you need to rip the whole CD first.

Some things I do differently...

- my album folders look like --. Gives me a nice chronological discography for those favorite artists where I have lots of CDs.

- as a separator I use '--' instead of ' - '. Many songs and albums have titles like "somealbum - the early years" or "somesong - trance mix". Software doesn't know what to make of that.

- I'm willing to use other formats if I get them that way. Too messy to convert everything to mp3, and you lose sound quality with each conversion. ogg, mpc and ape is ok for me. Easy to handle unless I want to use it in the car.

- Mp3tag has grown on me. I use it more than ID3Tag-It, partially because it handles other file and tag formats, too.

- MuzicMan has been a great music library for years; now I use something I programmed myself, mostly because I want to use a web pad as a remote control.

- Connect it to your hifi equipment (preferably with a digital connection). It's awesome, time to sell the cd player...

#10 | 23:07 on 25 January 2005 | by mixman2112

I just got a 40gb mp3 player and have been testing out the best way to get my CD collection ripped and encoded. I only want to do it once. Your process is pretty much what I've come up with but I have a couple of questions. I am interested in doing the encoding as a separate step overnight but when I try it, the tag info doesn't get into the mp3. I think I understand why -- the info from freedb isn't loaded in EAC at the time of the encode. Is there a way around this? Also, if all WAV files are in their own separate artist/album folders, how do you get EAC to "look in subfolders" like you can do with CDex so you can encode more than one album at a time? Thanks in advance.

#11 | 04:24 on 14 January 2006 | by Carol

You can speed up tagging with TagTuner. It is not free software but it has awesome Search for Album Information and Rename MP3 files features, and it searches for album covers simultaneously with album tags!

TagTuner Main Page:
http://www.tagtuner.com/

Articles:
http://www.tagtuner.com/tagging-mp3-files-faster.php
http://www.tagtuner.com/search-for-album-cover.php
