Sound-off, Part 2

May 9, 2003


Last time, we took a look at some of the many audio toolkits available on Mac OS X. This time around it’s time to actually play with a couple of them, and do some stuff like playing prerecorded sounds, playing MP3s, messing with the system volume and doing some simple recording and playback.

The source code and projects (along with screen shots, for what they’re worth) can be found at http://borkware.com/rants/sound/.

Cocoa me, baby

The first stop along the line is playing with Cocoa’s NSSound class. CocoaSound will look for named sounds and let you choose one and play it; it will also load sounds from files or from URLs.

First off, CocoaSound walks through /System/Library/Sounds looking for system-supplied sound files whose types match the extensions found in -[NSSound soundUnfilteredFileTypes]. It then sticks the sound names into a popup menu. Later on, when the user asks to load a sound, it simply gets the name from the popup and makes a new NSSound object:

NSString *name;
name = [soundNamesPopup titleOfSelectedItem];

NSSound *sound;
sound = [NSSound soundNamed: name];
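The directory scan itself isn’t shown in the snippet above, but a minimal sketch of it might look something like this (it uses the soundNamesPopup outlet from the sample; the exact code in CocoaSound may differ):

     NSArray *types = [NSSound soundUnfilteredFileTypes];
     NSArray *files = [[NSFileManager defaultManager]
                          directoryContentsAtPath: @"/System/Library/Sounds"];
     NSEnumerator *enumerator = [files objectEnumerator];
     NSString *file;

     [soundNamesPopup removeAllItems];

     while ((file = [enumerator nextObject]) != nil) {
         // only offer files whose extensions NSSound knows how to play
         if ([types containsObject: [file pathExtension]]) {
             [soundNamesPopup addItemWithTitle:
                                  [file stringByDeletingPathExtension]];
         }
     }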

If you happen to have a sound file that NSSound supports, like a WAV or AIFF file, you can load that sound (in this case, in the NSOpenPanel sheet didEndSelector):

NSString *filename;
filename = [[chooser filenames] objectAtIndex: 0];

NSSound *sound;
sound = [[NSSound alloc] initWithContentsOfFile: filename byReference: NO];

Or if you have a sound out on the Net somewhere, you can use NSSound’s URL loading features. Create an NSURL first, then give that to NSSound:

NSURL *youllUrl;
youllUrl = [NSURL URLWithString: @"http://borkware.com/rants/sound/torgo-short.aiff"];

NSSound *sound;
sound = [[NSSound alloc] initWithContentsOfURL: youllUrl byReference: NO];

The byReference parameter is there in case you decide to archive the sound (see the docs on archiving and serialization for more on archiving). If you pass YES, only the sound name is stored when you archive an NSSound. If you pass NO, the actual sound data is stored during archiving. If you’re just loading and playing sounds, it doesn’t matter which value you supply.
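Once you have an NSSound, by whichever route, actually playing it is a single message (and -stop halts playback):

     [sound play];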

QuickTime and MP3

The formats supported by NSSound tend to produce pretty large files once your sounds get longer. If you’re using larger sampled sounds, you’ll probably want to keep them in a more compact representation, like MP3. Luckily, QuickTime supports MP3, so we can use it for playing sounds. The QTmp3 sample lets you load an MP3, play it and display a progress bar showing how far into the track playback has gotten.

Cocoa supplies two classes for dealing with QuickTime movies, NSMovie and NSMovieView, but for playing MP3s you just need NSMovie. To load an MP3 file given a path, you do something like:

NSMovie *qtmovie;
qtmovie = [[NSMovie alloc] initWithURL: [NSURL fileURLWithPath: filename] byReference: NO];

Playing the sound is easy; you get the QuickTime movie handle from NSMovie and then use the QuickTime API. In this case, to rewind the track and start playing from the beginning, you do:

GoToBeginningOfMovie ([qtmovie QTMovie]);
StartMovie ([qtmovie QTMovie]);

... and if you want to stop or pause the playback, use:

StopMovie ([qtmovie QTMovie]);

One downside to using QuickTime is that it has a heavy old-school cooperative multitasking flavor. To get your QuickTime content to play, you need to call MoviesTask() repeatedly. When you call it in a tight loop (as a lot of Apple sample code does), it’s really easy to figure out when the track ends, as well as to know where you are in the track. Luckily, the Cocoa run loop calls MoviesTask() for us, so QuickTime playback scheduling happens automatically. But that does mean it’s harder to figure out where you are during playback. You can figure this out by polling. Set an NSTimer to go off a couple of times a second, and use:

TimeValue timevalue;
timevalue = GetMovieTime ([qtmovie QTMovie], NULL);

... to see how far you are into the movie. The maximum value that the timevalue will have is returned by GetMovieDuration([qtmovie QTMovie]). If the movie has finished, IsMovieDone([qtmovie QTMovie]) will return a true value. If you want to have a progress indicator show how far you’ve come, do something like this when you start playing:

[progressIndicator setMinValue: 0.0];
[progressIndicator setMaxValue: GetMovieDuration([qtmovie QTMovie])];

GoToBeginningOfMovie ([qtmovie QTMovie]);
StartMovie ([qtmovie QTMovie]);
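You also need to kick off the polling timer at this point. Here’s a minimal sketch, assuming an NSTimer instance variable named timer and a handler method named timerFired: (both names are mine, not necessarily what QTmp3 uses). The run loop retains a scheduled timer until it’s invalidated, so there’s no separate retain here; the handler below just invalidates it and nils out the variable:

     timer = [NSTimer scheduledTimerWithTimeInterval: 0.5
                                              target: self
                                            selector: @selector(timerFired:)
                                            userInfo: nil
                                             repeats: YES];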

... and in your timer handler:

     if (IsMovieDone([qtmovie QTMovie])) {
         // update whatever UI you have
         // stop the timer
         [timer invalidate];
         timer = nil;
     }
     // in all cases (done and not done), set the progress indicator
     [progressIndicator
          setDoubleValue: GetMovieTime ([qtmovie QTMovie], NULL)];

Total volumation, dude

Resources

developer.apple.com/audio/ : Apple's audio developer page - news and technical resources

developer.apple.com/audio/coreaudio.html : Core Audio architecture page

aldebaran.armory.com/~zenomt/macosx/MTCoreAudio/ : MTCoreAudio home

www.mat.ucsb.edu:8000/CoreAudio : CoreAudio and CoreMIDI Swiki - a Wiki dedicated to CoreAudio

developer.apple.com/techpubs/quicktime/ : QuickTime developer’s documentation

borkware.com/rants/sound/ : sample code for this article

cocoadevcentral.com/articles/000042.php : Cocoa DevCentral article on embedding frameworks into the App Bundle, which is useful if you’re shipping a program that uses an auxiliary framework like MTCoreAudio.

If you’re wanting to bang the audio hardware at a lower level, the MTCoreAudio framework is a very nice wrapper around Apple’s CoreAudio technologies. The last two samples here use this framework to do their work.

The first one, Volumator, lets you change the volume of the left and right channels of the default output device. If you change the volume elsewhere, like with the sound keys on the keyboard, the display will update itself accordingly. Also, if you play with the Balance control in the Sound Preferences, you can see the volume sliders move around to reflect the settings.

MTCoreAudioDevice has a method to give you the default output device being used (there’s a corresponding input device that we’ll look at in a bit):

outputDevice = [MTCoreAudioDevice defaultOutputDevice];
[outputDevice retain];

We can tell the device to change its volume:

[outputDevice setVolume: [volume1Slider floatValue] forChannel: 1 forDirection: kMTCoreAudioDevicePlaybackDirection];

A device can have both recording and playback features. Here we’re telling this output device (speakers or headphones) to set the volume (a float value from 0.0 to 1.0) for the first channel (left) for playback (uh, playback).

If you set the delegate for your device, you can (amongst other things) get notified when the system volume changes:

     [outputDevice setDelegate: self];

     - (void) audioDeviceVolumeDidChange: (MTCoreAudioDevice *) theDevice
                              forChannel: (UInt32) theChannel
                            forDirection: (MTCoreAudioDirection) theDirection
     {
         [self setStuffBasedOnVolume];

     } // audioDeviceVolumeDidChange

And how do you actually get the current volume information? The framework supplies MTCoreAudioVolumeInfo, a structure with information on the current volume, the muted state and whether it even has a volume or mute control. This updates the UI based on the new volume values:

- (void) setStuffBasedOnVolume
{
     MTCoreAudioVolumeInfo volumeInfo;

     volumeInfo = [outputDevice
                       volumeInfoForChannel: 1
                       forDirection: kMTCoreAudioDevicePlaybackDirection];
     [volume1Slider setFloatValue: volumeInfo.theVolume];

     // ... do similar stuff for channel 2 (right)

} // setStuffBasedOnVolume

Recording

Many modern Macs, like the PowerBooks and some iLamps, come with a built-in microphone. Sampling sounds from the microphone can be a cool feature for your app, like attaching voice snippets to email or swearing at your friend on the other end of a networked game. The sound quality you get from the built-in microphone is pretty poor, so it’s best used for things like voice annotations or chat. Don’t plan on using that microphone to record your acoustic new-age music CD. If you require higher quality sound input, the Griffin iMic is a pretty nice alternative for machines without analogue audio-in.

Expanding on the stuff that Volumator does, you can add a callback. When you want to start recording or playback, you send the audio device the deviceStart method, and your callback will be called repeatedly to consume or provide data:

     MTCoreAudioDevice *inputDevice;
     inputDevice = [MTCoreAudioDevice defaultInputDevice];
     [inputDevice retain];

     // .. set callback

     // start recording
     [inputDevice deviceStart];

To set the callback, use setIOTarget (don’t be afraid of the selector used – it’s not as bad as it looks):

     [inputDevice setIOTarget: self
                  withSelector: @selector(readCycleForDevice:timeStamp:
                                          inputData:inputTime:outputData:
                                          outputTime:clientData:)
                  withClientData: NULL];

The client data is a rock you can hide data under. The selector is for a method with a signature like this:

- (OSStatus) readCycleForDevice: (MTCoreAudioDevice *) theDevice
                       timeStamp: (const AudioTimeStamp *) now
                       inputData: (const AudioBufferList *) inputData
                       inputTime: (const AudioTimeStamp *) inputTime
                      outputData: (AudioBufferList *) outputData
                      outputTime: (const AudioTimeStamp *) outputTime
                      clientData: (void *) clientData

There are a bunch of parameters here, like time stamps (which are passed by address because they can change while your code is executing in this method), and buffers for incoming and outgoing data. The same signature is used for both data input (recording) and data output (playback), which is why all the parameters are declared even though you’ll typically use fewer than half of them. For plain old recording without any additional processing, the inputData parameter is the interesting part.

The AudioBufferList data structure looks like this:


typedef struct AudioBufferList {
     UInt32          mNumberBuffers;
     AudioBuffer     mBuffers[1];
} AudioBufferList;

Multiple buffers can be passed in. The single-element mBuffers array is just the usual C trick: a little syntactic sugar that lets you index past the declared end of the structure. So if inputData->mNumberBuffers says you have four buffers, you can look at the last buffer’s data by using inputData->mBuffers[3].

The AudioBuffer looks like this:

typedef struct AudioBuffer {
     UInt32  mNumberChannels; // the number of interleaved channels
     UInt32  mDataByteSize;   // size of the buffer data
     void*   mData;           // pointer to the data buffer
} AudioBuffer;
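Putting the two structures together, walking every buffer in the list might look something like this (a sketch, using the inputData parameter from the callback signature above):

     UInt32 i, totalBytes = 0;

     for (i = 0; i < inputData->mNumberBuffers; i++) {
         const AudioBuffer *buffer = &inputData->mBuffers[i];
         // buffer->mData points to buffer->mDataByteSize bytes of samples
         totalBytes += buffer->mDataByteSize;
     }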

So to get at the data in the first buffer and accumulate it into your own recording buffer, you’d do something like this:

     const AudioBuffer *buffer;
     buffer = &inputData->mBuffers[0];

     memcpy (// the address in our recording buffer to write to
             buffer->mData,
             buffer->mDataByteSize);
     // also update our current location in the recording buffer
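Fleshed out a bit, that accumulation might look like this. It’s a sketch only: recordBuffer, recordBufferSize and recordPosition are hypothetical instance variables that you allocate and manage yourself, not names from the sample code:

     const AudioBuffer *buffer;
     UInt32 bytesToCopy;

     buffer = &inputData->mBuffers[0];
     bytesToCopy = buffer->mDataByteSize;

     // don't run off the end of our recording buffer
     if (recordPosition + bytesToCopy > recordBufferSize) {
         bytesToCopy = recordBufferSize - recordPosition;
     }

     memcpy ((char *) recordBuffer + recordPosition,
             buffer->mData, bytesToCopy);
     recordPosition += bytesToCopy;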

Audio devices have two sides – the physical side which deals with the details of the audio hardware, and the logical side which presents a uniform view of audio data. The stuff here just deals with the logical side, which is a run of 32-bit floating point values. The drivers do the hard work of converting to and from this format. If you want to change things like the sample rate, there are methods in the framework to let you change that. Here we’re just blindly slurping bytes and then sending them out to the playback device.

On the playback side, we copy bytes from our buffer into the buffer given to our IO target method:

     AudioBuffer *buffer;
     buffer = &outputData->mBuffers[0];

     memcpy (buffer->mData,
             // current location in our playback buffer,
             buffer->mDataByteSize);
     // also update our current location in the playback buffer.
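The same idea works for playback, again with hypothetical playbackBuffer, playbackBufferSize and playbackPosition variables, padding the rest of the output buffer with silence (zero, for float samples) when we run out of data:

     AudioBuffer *buffer;
     UInt32 bytesToCopy;

     buffer = &outputData->mBuffers[0];
     bytesToCopy = buffer->mDataByteSize;

     if (playbackPosition + bytesToCopy > playbackBufferSize) {
         bytesToCopy = playbackBufferSize - playbackPosition;
     }

     memcpy (buffer->mData,
             (char *) playbackBuffer + playbackPosition, bytesToCopy);
     playbackPosition += bytesToCopy;

     // anything we couldn't fill gets silence
     if (bytesToCopy < buffer->mDataByteSize) {
         memset ((char *) buffer->mData + bytesToCopy, 0,
                 buffer->mDataByteSize - bytesToCopy);
     }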

There’s one gotcha about working inside the IO target methods. They’re called at IO time, and the IO target dispatcher doesn’t provide an autorelease pool, so beware of doing too much work there; things like object creation and memory allocation can cause gaps in the recorded data. I’ve also had problems sending the deviceStop message from inside the callback when playback finishes. To talk back to your Cocoa user interface, use NSObject’s performSelectorOnMainThread: to schedule a method to be called the next time the application’s main thread runs. So when you detect that you’re done playing, do something like this:


     if (areWeDone) {
         // yep.  tell the UI part to shut down the playback.
         [self performSelectorOnMainThread: @selector(stopPlaying:)
               withObject: self
               waitUntilDone: NO];
     } else {
         // otherwise stick some data into the buffer
     }
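The stopPlaying: method then runs on the main thread, where it’s safe to stop the device and poke at the user interface. A sketch, assuming the playback device lives in an instance variable named outputDevice (what you actually update is up to your app):

     - (void) stopPlaying: (id) sender
     {
         [outputDevice deviceStop];
         // update buttons, progress bars and so on, here on the main thread
     } // stopPlaying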

That’s it for this month. Making your computer make noise is a lot of fun, and can be pretty easy to do, so go out there and get noisy!

Mark Dalrymple (markd@borkware.com) has been wrangling Mac and Unix systems for entirely too many years. In addition to random consulting and custom app development at Borkware, he also teaches the Core Mac OS X and Unix Programming class for the Big Nerd Ranch.
