Extracting amplitude data from linear PCM on the iPhone

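I am having difficulty extracting amplitude data from linear PCM stored in an audio.caf file on the iPhone.

My questions are:

1) Linear PCM stores amplitude samples as 16-bit values. Is this correct?
2) How is amplitude stored in the packets returned by AudioFileReadPacketData()? When recording mono linear PCM, isn't each sample (in one frame, in one packet) just an array of SInt16? What is the byte order (big-endian vs. little-endian)?
3) What does each step in linear PCM amplitude mean physically?
4) When linear PCM is recorded on the iPhone, is the center point 0 (SInt16) or 32768 (UInt16)? What do the maximum and minimum values mean in terms of the physical waveform or air pressure?

And a bonus question: are there sound/air pressure waveforms that the iPhone microphone cannot measure?

My code is as follows: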

    // get the audio file proxy object for the audio
    AudioFileID fileID;
    AudioFileOpenURL((CFURLRef)audioURL, kAudioFileReadPermission, kAudioFileCAFType, &fileID);

    // get the number of packets of audio data contained in the file
    UInt64 totalPacketCount = [self packetCountForAudioFile:fileID];

    // get the size of each packet for this audio file
    UInt32 maxPacketSizeInBytes = [self packetSizeForAudioFile:fileID];

    // setup to extract the audio data
    Boolean inUseCache = false;
    UInt32 numberOfPacketsToRead = 4410; // 0.1 seconds of data
    UInt32 ioNumPackets = numberOfPacketsToRead;
    UInt32 ioNumBytes = maxPacketSizeInBytes * ioNumPackets;
    char *outBuffer = malloc(ioNumBytes);
    memset(outBuffer, 0, ioNumBytes);

    SInt16 signedMinAmplitude = -32768;
    SInt16 signedCenterpoint = 0;
    SInt16 signedMaxAmplitude = 32767;

    SInt16 minAmplitude = signedMaxAmplitude;
    SInt16 maxAmplitude = signedMinAmplitude;

    // process each and every packet
    for (UInt64 packetIndex = 0; packetIndex < totalPacketCount; packetIndex = packetIndex + ioNumPackets)
    {
        // reset the number of packets to get
        ioNumPackets = numberOfPacketsToRead;

        AudioFileReadPacketData(fileID, inUseCache, &ioNumBytes, NULL, packetIndex, &ioNumPackets, outBuffer);

        for (UInt32 batchPacketIndex = 0; batchPacketIndex < ioNumPackets; batchPacketIndex++)
        {
            SInt16 packetData = outBuffer[batchPacketIndex * maxPacketSizeInBytes];
            SInt16 absoluteValue = abs(packetData);

            if (absoluteValue < minAmplitude)
            {
                minAmplitude = absoluteValue;
            }

            if (absoluteValue > maxAmplitude)
            {
                maxAmplitude = absoluteValue;
            }
        }
    }

    NSLog(@"minAmplitude: %hi", minAmplitude);
    NSLog(@"maxAmplitude: %hi", maxAmplitude);

With this code I almost always get a minimum of 0 and a maximum of 128! That makes no sense to me.

I am recording the audio with AVAudioRecorder as follows:

    // specify mono, 44.1 kHz, Linear PCM with Max Quality as recording format
    NSDictionary *recordSettings = [[NSDictionary alloc] initWithObjectsAndKeys:
        [NSNumber numberWithFloat: 44100.0], AVSampleRateKey,
        [NSNumber numberWithInt: kAudioFormatLinearPCM], AVFormatIDKey,
        [NSNumber numberWithInt: 1], AVNumberOfChannelsKey,
        [NSNumber numberWithInt: AVAudioQualityMax], AVEncoderAudioQualityKey,
        nil];

    // store the sound file in the app doc folder as calibration.caf
    NSString *documentsDir = [NSSearchPathForDirectoriesInDomains(NSDocumentDirectory, NSUserDomainMask, YES) lastObject];
    NSURL *audioFileURL = [NSURL fileURLWithPath:[documentsDir stringByAppendingPathComponent: @"audio.caf"]];

    // create the audio recorder
    NSError *createAudioRecorderError = nil;
    AVAudioRecorder *newAudioRecorder = [[AVAudioRecorder alloc] initWithURL:audioFileURL settings:recordSettings error:&createAudioRecorderError];
    [recordSettings release];

    if (newAudioRecorder)
    {
        // record the audio
        self.recorder = newAudioRecorder;
        [newAudioRecorder release];

        self.recorder.delegate = self;
        [self.recorder prepareToRecord];
        [self.recorder record];
    }
    else
    {
        NSLog(@"%@", [createAudioRecorderError localizedDescription]);
    }

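Thanks for any insight you can offer. This is my first project using Core Audio, so feel free to tear apart my approach!

P.S. I tried to search the Core Audio mailing list archive, but the request keeps returning an error: ( http://search.lists.apple.com/?q=linear+pcm+amplitude&cmd=Search%21&ul=coreaudio-api )

P.P.S. I have looked at: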

http://en.wikipedia.org/wiki/Sound_pressure

http://en.wikipedia.org/wiki/Linear_PCM

http://wiki.multimedia.cx/index.php?title=PCM

Get the amplitude at a given time within a sound file?

http://music.columbia.edu/pipermail/music-dsp/2002-April/048341.html

I have also read the entire Core Audio Overview and most of the Audio Session Programming Guide, but my questions remain.

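1) The OS X / iPhone file reading routines let you determine the sample format. For LPCM it is typically one of SInt8, SInt16, SInt32, Float32, Float64, or contiguous 24-bit signed int.

2) For int formats, MIN_FOR_TYPE represents the maximum amplitude in the negative phase, and MAX_FOR_TYPE represents the maximum amplitude in the positive phase. 0 equals silence. Floating-point formats modulate within [-1 … 1], with zero again meaning silence. When reading, writing, recording, or working with a specific format, endianness matters: a file may require a specific byte order, and you typically want to manipulate the data in the native byte order. Some routines in Apple's audio file libraries let you pass a flag denoting the source endianness instead of converting it manually. CAF is a bit more complicated: it acts like a meta wrapper for one or more audio files and supports many types.

3) The amplitude representation for LPCM is just a brute-force linear amplitude representation (no conversion or decoding is required for playback, and the amplitude steps are equal).

4) See #2. The values are not related to air pressure; they are relative to 0 dBFS. For example, if you output the stream straight to a DAC, then the int max (or -1/1 for floating point) represents the level at which an individual sample will clip.

Bonus) Like every ADC and component chain, it has limits on the input voltage it can handle. In addition, the sampling rate defines the highest frequency that can be captured (the highest being half the sampling rate). The ADC may use a fixed or selectable bit depth, but the maximum input voltage generally does not change when another bit depth is chosen.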
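
Points 2) and 4) above can be illustrated in plain C. This is only a minimal sketch under stated assumptions: the samples are 16-bit signed LPCM, and the helper names (`decode_le16`, `decode_be16`, `to_float`, `at_full_scale`) are hypothetical, not Core Audio calls. Which byte order a real file uses is reported by its format description (for example the `kAudioFormatFlagIsBigEndian` bit in `mFormatFlags`), so check it rather than guessing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these are Core Audio calls.
   They assume 16-bit signed LPCM samples. */

/* Assemble one sample from two bytes stored little-endian (low byte first). */
static int16_t decode_le16(const uint8_t *p) {
    assert(p != NULL);
    return (int16_t)(uint16_t)(p[0] | (p[1] << 8));
}

/* Same sample width, but stored big-endian (high byte first). */
static int16_t decode_be16(const uint8_t *p) {
    assert(p != NULL);
    return (int16_t)(uint16_t)((p[0] << 8) | p[1]);
}

/* Map the integer range [-32768, 32767] onto the float range [-1.0, 1.0). */
static float to_float(int16_t s) {
    return (float)s / 32768.0f;
}

/* A sample at either end of the integer range sits at 0 dBFS,
   the level beyond which a DAC would clip. */
static int at_full_scale(int16_t s) {
    return s == INT16_MIN || s == INT16_MAX;
}
```

For example, the bytes 0x00 0x40 decode to 16384 when read as little-endian but to 64 when read as big-endian, and 16384 maps to 0.5, which is roughly 6 dB below full scale.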

One mistake you are making at the code level: you are manipulating `outBuffer` as chars, not as SInt16.
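
The effect of that mistake can be reproduced in plain C. The sketch below is illustrative only: both function names are hypothetical, and it assumes mono 16-bit LPCM that is already in the host's byte order. Reading one byte per sample caps every absolute value at 128, which matches the maximum reported in the question; reading both bytes of each sample gives the real peak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration, assuming mono 16-bit LPCM
   that is already in the host's byte order. */

/* The bug: indexing the raw buffer reads ONE byte per sample, so every
   value is limited to the range of a single signed byte. */
static int peak_reading_bytes(const char *buf, size_t sample_count) {
    assert(buf != NULL);
    int peak = 0;
    for (size_t i = 0; i < sample_count; i++) {
        int v = (signed char)buf[i * sizeof(int16_t)];
        if (v < 0) v = -v;
        if (v > peak) peak = v;
    }
    return peak;
}

/* The fix: read both bytes of every sample as one 16-bit value. */
static int peak_reading_samples(const char *buf, size_t sample_count) {
    assert(buf != NULL);
    int peak = 0;
    for (size_t i = 0; i < sample_count; i++) {
        int16_t s;
        memcpy(&s, buf + i * sizeof(int16_t), sizeof s);
        /* Widen before negating: |-32768| does not fit in an int16_t. */
        int v = s < 0 ? -(int)s : (int)s;
        if (v > peak) peak = v;
    }
    return peak;
}
```

In the original loop, the equivalent change would be to treat the buffer as an array of SInt16 (or copy each sample out as above) and to hold the absolute value in a wider type than SInt16.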

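1) If you ask for 16-bit samples in your recording format, then you get 16-bit samples. But other formats do exist in many Core Audio record/play APIs, and in possible CAF file formats.
2) In mono, you just get an array of signed 16-bit ints. In some of the Core Audio recording APIs you can specifically ask for big-endian or little-endian.
3) Unless you want to calibrate for your particular device model's microphone or your external microphone (and make sure audio processing/AGC is turned off), you may want to consider the audio levels to be arbitrarily scaled. The response also varies with microphone directionality and audio frequency.
4) The center point for 16-bit audio samples is commonly 0 (range about -32k to 32k). There is no bias.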
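
To make the "center point is 0" remark concrete, here is a small C sketch with two hypothetical helpers. `mean_level` averages a run of signed samples; for unbiased audio the result should sit near 0. `recenter_unsigned` shows what the conversion would look like if data ever did arrive as unsigned values centered on 32768, which is an assumption for illustration and not something the recording path in the question is known to produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of Core Audio. */

/* Shift an unsigned sample centered on 32768 to a signed one centered on 0. */
static int16_t recenter_unsigned(uint16_t u) {
    return (int16_t)((int32_t)u - 32768);
}

/* Average of signed samples; a value far from 0 would indicate a DC bias. */
static double mean_level(const int16_t *samples, size_t count) {
    assert(samples != NULL && count > 0);
    int64_t sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += samples[i];
    }
    return (double)sum / (double)count;
}
```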
