How to correctly read decoded PCM samples on iOS using AVAssetReader – currently incorrect decoding

I am currently working on an application for my Bachelor's degree in Computer Science. The application will correlate data from the iPhone hardware (accelerometer, GPS) with music that is being played.

The project is still in its infancy, having been worked on for only 2 months.

The part I need help with right now is reading PCM samples from songs in the iTunes library and playing them back with an audio unit. The implementation I am trying to get working does the following: select a random song from iTunes, read samples from it as required, and store them in a buffer, let's call it sampleBuffer. Later, in the consumer model, the audio unit (which has a mixer and a remoteIO output) has a callback in which I simply copy the required number of samples from sampleBuffer into the buffer specified by the callback.

What I then hear through the speakers is not what I expect; I can recognize that it is playing the song, but it seems to be incorrectly decoded and is full of noise! I attached an image showing the first ~half second (24576 samples @ 44.1kHz), which does not resemble normal output.

Before I get to the listings: I have checked that the file is not corrupted, and I have written test cases for the buffer (so I know the buffer does not alter the samples). Although this may not be the best way to do it (some would argue for the audio queue route), I want to perform various manipulations on the samples, change the song before it is finished, rearrange which song is played next, and so on. Furthermore, there may be some incorrect settings in the audio unit; however, the graph that displays the samples (and shows the samples are decoded incorrectly) is taken straight from the buffer, so right now I am only trying to solve why reading from disk and decoding do not work correctly. For now I just want to get simple play-through working.

I can't post images because I'm new to Stack Overflow, so here's the link to the image: http://i.stack.imgur.com/RHjlv.jpg

Listings:

This is where I set up the audioReadSettings that will be used for the AVAssetReaderAudioMixOutput:

    // Set the read settings
    audioReadSettings = [[NSMutableDictionary alloc] init];
    [audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
                         forKey:AVFormatIDKey];
    [audioReadSettings setValue:[NSNumber numberWithInt:16] forKey:AVLinearPCMBitDepthKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsBigEndianKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsFloatKey];
    [audioReadSettings setValue:[NSNumber numberWithBool:NO] forKey:AVLinearPCMIsNonInterleaved];
    [audioReadSettings setValue:[NSNumber numberWithFloat:44100.0] forKey:AVSampleRateKey];

The following code listing is a method that receives an NSString holding the persistent ID of the song:

    -(BOOL)setNextSongID:(NSString*)persistand_id {
        assert(persistand_id != nil);

        MPMediaItem *song = [self getMediaItemForPersistantID:persistand_id];
        NSURL *assetUrl = [song valueForProperty:MPMediaItemPropertyAssetURL];
        AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:assetUrl
                                                    options:[NSDictionary dictionaryWithObject:[NSNumber numberWithBool:YES]
                                                                                        forKey:AVURLAssetPreferPreciseDurationAndTimingKey]];

        NSError *assetError = nil;
        assetReader = [[AVAssetReader assetReaderWithAsset:songAsset error:&assetError] retain];

        if (assetError) {
            NSLog(@"error: %@", assetError);
            return NO;
        }

        CMTimeRange timeRange = CMTimeRangeMake(kCMTimeZero, songAsset.duration);
        [assetReader setTimeRange:timeRange];

        track = [[songAsset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];

        assetReaderOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:[NSArray arrayWithObject:track]
                                                                                    audioSettings:audioReadSettings];

        if (![assetReader canAddOutput:assetReaderOutput]) {
            NSLog(@"cant add reader output... die!");
            return NO;
        }

        [assetReader addOutput:assetReaderOutput];
        [assetReader startReading];

        // just getting some basic information about the track to print
        NSArray *formatDesc = ((AVAssetTrack*)[[assetReaderOutput audioTracks] objectAtIndex:0]).formatDescriptions;
        for (unsigned int i = 0; i < [formatDesc count]; ++i) {
            CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
            const CAStreamBasicDescription *asDesc = (CAStreamBasicDescription*)CMAudioFormatDescriptionGetStreamBasicDescription(item);
            if (asDesc) {
                // get data
                numChannels = asDesc->mChannelsPerFrame;
                sampleRate = asDesc->mSampleRate;
                asDesc->Print();
            }
        }

        [self copyEnoughSamplesToBufferForLength:24000];
        return YES;
    }

Below is the function -(void)copyEnoughSamplesToBufferForLength:

    -(void)copyEnoughSamplesToBufferForLength:(UInt32)samples_count {
        [w_lock lock];
        int stillToCopy = 0;
        if (sampleBuffer->numSamples() < samples_count) {
            stillToCopy = samples_count;
        }

        NSAutoreleasePool *apool = [[NSAutoreleasePool alloc] init];

        CMSampleBufferRef sampleBufferRef;
        SInt16 *dataBuffer = (SInt16*)malloc(8192 * sizeof(SInt16));

        int a = 0;

        while (stillToCopy > 0) {
            sampleBufferRef = [assetReaderOutput copyNextSampleBuffer];
            if (!sampleBufferRef) {
                // end of song or no more samples
                return;
            }

            CMBlockBufferRef blockBuffer = CMSampleBufferGetDataBuffer(sampleBufferRef);
            CMItemCount numSamplesInBuffer = CMSampleBufferGetNumSamples(sampleBufferRef);
            AudioBufferList audioBufferList;

            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBufferRef,
                                                                    NULL,
                                                                    &audioBufferList,
                                                                    sizeof(audioBufferList),
                                                                    NULL,
                                                                    NULL,
                                                                    0,
                                                                    &blockBuffer);

            int data_length = floorf(numSamplesInBuffer * 1.0f);
            int j = 0;

            for (int bufferCount=0; bufferCount < audioBufferList.mNumberBuffers; bufferCount++) {
                SInt16* samples = (SInt16 *)audioBufferList.mBuffers[bufferCount].mData;
                for (int i=0; i < numSamplesInBuffer; i++) {
                    dataBuffer[j] = samples[i];
                    j++;
                }
            }

            CFRelease(sampleBufferRef);
            sampleBuffer->putSamples(dataBuffer, j);
            stillToCopy = stillToCopy - data_length;
        }

        free(dataBuffer);
        [w_lock unlock];
        [apool release];
    }

Now sampleBuffer ends up holding incorrectly decoded samples. Can anyone help me understand why this is so? This happens with different files from my iTunes library (MP3, AAC, WAV, etc.). Any help would be greatly appreciated; furthermore, if you need any other listing of my code, or perhaps a recording of what the output sounds like, I will attach it on request. I have been trying to debug this for the past week and have found no help online – everyone seems to be doing it the way I am, yet it seems that only I am having this issue.

Thanks for the help!

Peter

Currently, I am also working on a project which involves extracting audio samples from the iTunes library into an AudioUnit.

The AudioUnit render callback is included for your reference. The input format is set to SInt16StereoStreamFormat.

I have made use of Michael Tyson's circular buffer implementation – TPCircularBuffer – as the buffer storage. Very easy to use and understand! Thanks Michael!
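If you haven't seen it before, the TPCircularBuffer API used in the full listing below boils down to a handful of calls. A minimal sketch of the producer/consumer round trip (the buffer size and sample chunk here are illustrative, not the values from my project):

    #import "TPCircularBuffer.h"

    TPCircularBuffer buffer;
    TPCircularBufferInit(&buffer, 655360);          // allocate the backing store (in bytes)

    // Producer side (e.g. the AVAssetReader thread): append bytes if there is room.
    SInt16 samples[512] = {0};
    TPCircularBufferProduceBytes(&buffer, samples, sizeof(samples));

    // Consumer side (e.g. the render callback): peek at what is available,
    // copy it out, then mark it consumed.
    int32_t availableBytes;
    SInt16 *tail = (SInt16 *)TPCircularBufferTail(&buffer, &availableBytes);
    if (tail && availableBytes > 0) {
        // ... memcpy up to availableBytes out of tail ...
        TPCircularBufferConsume(&buffer, availableBytes);
    }

    TPCircularBufferCleanup(&buffer);               // free the backing store when done

The nice property is that the producer and consumer can live on different threads without extra locking, which is exactly what you want between a file-reading operation and a real-time render callback.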

    - (void) loadBuffer:(NSURL *)assetURL_
    {
        if (nil != self.iPodAssetReader) {
            [iTunesOperationQueue cancelAllOperations];
            [self cleanUpBuffer];
        }

        NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                        [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
                                        [NSNumber numberWithFloat:44100.0], AVSampleRateKey,
                                        [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                        [NSNumber numberWithBool:NO], AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                        nil];

        AVURLAsset *asset = [AVURLAsset URLAssetWithURL:assetURL_ options:nil];
        if (asset == nil) {
            NSLog(@"asset is not defined!");
            return;
        }

        NSLog(@"Total Asset Duration: %f", CMTimeGetSeconds(asset.duration));

        NSError *assetError = nil;
        self.iPodAssetReader = [AVAssetReader assetReaderWithAsset:asset error:&assetError];
        if (assetError) {
            NSLog (@"error: %@", assetError);
            return;
        }

        AVAssetReaderOutput *readerOutput = [AVAssetReaderAudioMixOutput assetReaderAudioMixOutputWithAudioTracks:asset.tracks
                                                                                                    audioSettings:outputSettings];

        if (! [iPodAssetReader canAddOutput: readerOutput]) {
            NSLog (@"can't add reader output... die!");
            return;
        }

        // add output reader to reader
        [iPodAssetReader addOutput: readerOutput];

        if (! [iPodAssetReader startReading]) {
            NSLog(@"Unable to start reading!");
            return;
        }

        // Init circular buffer
        TPCircularBufferInit(&playbackState.circularBuffer, kTotalBufferSize);

        __block NSBlockOperation * feediPodBufferOperation = [NSBlockOperation blockOperationWithBlock:^{
            while (![feediPodBufferOperation isCancelled] && iPodAssetReader.status != AVAssetReaderStatusCompleted) {
                if (iPodAssetReader.status == AVAssetReaderStatusReading) {
                    // Check if the available buffer space is enough to hold at least one cycle of the sample data
                    if (kTotalBufferSize - playbackState.circularBuffer.fillCount >= 32768) {
                        CMSampleBufferRef nextBuffer = [readerOutput copyNextSampleBuffer];

                        if (nextBuffer) {
                            AudioBufferList abl;
                            CMBlockBufferRef blockBuffer;
                            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(nextBuffer, NULL, &abl, sizeof(abl), NULL, NULL, kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment, &blockBuffer);
                            UInt64 size = CMSampleBufferGetTotalSampleSize(nextBuffer);

                            int bytesCopied = TPCircularBufferProduceBytes(&playbackState.circularBuffer, abl.mBuffers[0].mData, size);

                            if (!playbackState.bufferIsReady && bytesCopied > 0) {
                                playbackState.bufferIsReady = YES;
                            }

                            CFRelease(nextBuffer);
                            CFRelease(blockBuffer);
                        }
                        else {
                            break;
                        }
                    }
                }
            }
            NSLog(@"iPod Buffer Reading Finished");
        }];

        [iTunesOperationQueue addOperation:feediPodBufferOperation];
    }

    static OSStatus ipodRenderCallback (
        void                        *inRefCon,      // A pointer to a struct containing the complete audio data
                                                    //    to play, as well as state information such as the
                                                    //    first sample to play on this invocation of the callback.
        AudioUnitRenderActionFlags  *ioActionFlags, // Unused here. When generating audio, use ioActionFlags to indicate silence
                                                    //    between sounds; for silence, also memset the ioData buffers to 0.
        const AudioTimeStamp        *inTimeStamp,   // Unused here.
        UInt32                      inBusNumber,    // The mixer unit input bus that is requesting some new
                                                    //    frames of audio data to play.
        UInt32                      inNumberFrames, // The number of frames of audio to provide to the buffer(s)
                                                    //    pointed to by the ioData parameter.
        AudioBufferList             *ioData         // On output, the audio data to play. The callback's primary
                                                    //    responsibility is to fill the buffer(s) in the
                                                    //    AudioBufferList.
    )
    {
        Audio* audioObject = (Audio*)inRefCon;

        AudioSampleType *outSample = (AudioSampleType *)ioData->mBuffers[0].mData;

        // Zero-out all the output samples first
        memset(outSample, 0, inNumberFrames * kUnitSize * 2);

        if ( audioObject.playingiPod && audioObject.bufferIsReady) {
            // Pull audio from circular buffer
            int32_t availableBytes;

            AudioSampleType *bufferTail = TPCircularBufferTail(&audioObject.circularBuffer, &availableBytes);

            memcpy(outSample, bufferTail, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
            TPCircularBufferConsume(&audioObject.circularBuffer, MIN(availableBytes, inNumberFrames * kUnitSize * 2) );
            audioObject.currentSampleNum += MIN(availableBytes / (kUnitSize * 2), inNumberFrames);

            if (availableBytes <= inNumberFrames * kUnitSize * 2) {
                // Buffer is running out or playback is finished
                audioObject.bufferIsReady = NO;
                audioObject.playingiPod = NO;
                audioObject.currentSampleNum = 0;

                if ([[audioObject delegate] respondsToSelector:@selector(playbackDidFinish)]) {
                    [[audioObject delegate] performSelector:@selector(playbackDidFinish)];
                }
            }
        }

        return noErr;
    }

    - (void) setupSInt16StereoStreamFormat {
        // The AudioUnitSampleType data type is the recommended type for sample data in audio
        // units. This obtains the byte size of the type for use in filling in the ASBD.
        size_t bytesPerSample = sizeof (AudioSampleType);

        // Fill the application audio format struct's fields to define a linear PCM,
        // stereo, noninterleaved stream at the hardware sample rate.
        SInt16StereoStreamFormat.mFormatID          = kAudioFormatLinearPCM;
        SInt16StereoStreamFormat.mFormatFlags       = kAudioFormatFlagsCanonical;
        SInt16StereoStreamFormat.mBytesPerPacket    = 2 * bytesPerSample;   // *** kAudioFormatFlagsCanonical <- implicit interleaved data => (left sample + right sample) per packet
        SInt16StereoStreamFormat.mFramesPerPacket   = 1;
        SInt16StereoStreamFormat.mBytesPerFrame     = SInt16StereoStreamFormat.mBytesPerPacket * SInt16StereoStreamFormat.mFramesPerPacket;
        SInt16StereoStreamFormat.mChannelsPerFrame  = 2;                    // 2 indicates stereo
        SInt16StereoStreamFormat.mBitsPerChannel    = 8 * bytesPerSample;
        SInt16StereoStreamFormat.mSampleRate        = graphSampleRate;

        NSLog (@"The stereo stream format for the \"iPod\" mixer input bus:");
        [self printASBD: SInt16StereoStreamFormat];
    }

I guess it's late, but you could try this library:

https://bitbucket.org/artgillespie/tslibraryimport

After saving the audio into a file with it, you can then process the data with the render callback from MixerHost.
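For reference, wiring a render callback to a mixer input the way the MixerHost sample does is roughly this (a sketch only; `processingGraph`, `mixerNode` and `myRenderCallback` are placeholder names, not symbols guaranteed to match the sample):

    #import <AudioToolbox/AudioToolbox.h>

    // Attach an input render callback to bus 0 of a multichannel mixer in an AUGraph.
    AURenderCallbackStruct inputCallbackStruct;
    inputCallbackStruct.inputProc       = &myRenderCallback;   // your OSStatus render callback
    inputCallbackStruct.inputProcRefCon = (void *)self;        // handed back to you as inRefCon

    OSStatus result = AUGraphSetNodeInputCallback(processingGraph,
                                                  mixerNode,
                                                  0,            // mixer input bus to feed
                                                  &inputCallbackStruct);
    if (result != noErr) {
        NSLog(@"AUGraphSetNodeInputCallback failed: %ld", (long)result);
    }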

If I were you, I would use kAudioUnitSubType_AudioFilePlayer to play the file and access its samples with the unit's render callback.

Or

Use an ExtAudioFileRef to extract the samples straight to a buffer.
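For the ExtAudioFileRef route, the shape of the code is roughly the following (a minimal sketch with error handling trimmed; note ExtAudioFile reads from an ordinary file URL, so an iPod library item would need to be exported to a file first, e.g. with the library linked above):

    #import <AudioToolbox/AudioToolbox.h>

    // Open the file and let Core Audio convert it to the client format we want.
    ExtAudioFileRef audioFile = NULL;
    OSStatus err = ExtAudioFileOpenURL((CFURLRef)fileURL, &audioFile);

    // 16-bit interleaved stereo @ 44.1 kHz, matching the format used earlier.
    AudioStreamBasicDescription clientFormat = {0};
    clientFormat.mFormatID         = kAudioFormatLinearPCM;
    clientFormat.mFormatFlags      = kAudioFormatFlagIsSignedInteger | kAudioFormatFlagIsPacked;
    clientFormat.mSampleRate       = 44100.0;
    clientFormat.mChannelsPerFrame = 2;
    clientFormat.mBitsPerChannel   = 16;
    clientFormat.mFramesPerPacket  = 1;
    clientFormat.mBytesPerFrame    = 4;   // 2 channels * 2 bytes, interleaved
    clientFormat.mBytesPerPacket   = 4;
    err = ExtAudioFileSetProperty(audioFile, kExtAudioFileProperty_ClientDataFormat,
                                  sizeof(clientFormat), &clientFormat);

    // Read one chunk of decoded frames into our own buffer.
    SInt16 data[4096 * 2];
    AudioBufferList bufferList;
    bufferList.mNumberBuffers              = 1;
    bufferList.mBuffers[0].mNumberChannels = 2;
    bufferList.mBuffers[0].mDataByteSize   = sizeof(data);
    bufferList.mBuffers[0].mData           = data;

    UInt32 framesToRead = 4096;          // on return: frames actually read (0 == end of file)
    err = ExtAudioFileRead(audioFile, &framesToRead, &bufferList);

    // ... push framesToRead frames from data into your sample buffer, loop until 0 ...
    ExtAudioFileDispose(audioFile);

The advantage over AVAssetReader here is that ExtAudioFileRead gives you decoded samples in exactly the client format you asked for, with no sample-buffer/block-buffer bookkeeping in between.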