Swift FFT – Complex split issue

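I am trying to perform an FFT on an audio file to find its frequencies, using the Accelerate framework. I have adapted code (probably wrongly) from this question: Spectrogram from AVAudioPCMBuffer using Accelerate framework in Swift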

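However, the magnitudes in 'spectrum' are all either '0', 'inf', or 'nan', and the 'real' and 'imag' components of the complex split show similar values. That suggests they are the cause of the problem, since 'magnitude = sqrt(pow(real,2)+pow(imag,2))'. Correct me if I'm wrong, but I think the rest of the code is OK.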

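Why am I getting these results, how can I fix it (what should the split components be), and what am I doing wrong? Please bear in mind that I am very new to FFT and sampling and have no idea how to set this up for an audio file, so any help would be greatly appreciated. Thank you.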

Here is the code I'm using:

    // get audio file
    let fileURL:NSURL = NSBundle.mainBundle().URLForResource("foo", withExtension: "mp3")!
    let audioFile = try! AVAudioFile(forReading: fileURL)
    let fileFormat = audioFile.processingFormat
    let frameCount = UInt32(audioFile.length)
    let buffer = AVAudioPCMBuffer(PCMFormat: fileFormat, frameCapacity: frameCount)

    let audioEngine = AVAudioEngine()
    let playerNode = AVAudioPlayerNode()
    audioMixerNode = audioEngine.mainMixerNode

    let bufferSize = Int(frameCount)
    let channels: NSArray = [Int(buffer.format.channelCount)]
    let channelCount = channels.count
    let floats1 = [Int(buffer.frameLength)]
    for var i=0; i<channelCount; ++i {
        channelSamples.append([])
        let firstSample = buffer.format.interleaved ? i : i*bufferSize
        for var j=firstSample; j<bufferSize; j+=buffer.stride*2 {
            channelSamples[i].append(DSPComplex(real: buffer.floatChannelData.memory[j], imag: buffer.floatChannelData.memory[j+buffer.stride]))
        }
    }

    // connect node
    audioEngine.attachNode(playerNode)
    audioEngine.connect(playerNode, to: audioMixerNode, format: playerNode.outputFormatForBus(0))

    // Set up the transform
    let log2n = UInt(round(log2(Double(bufferSize))))
    let fftSetup = vDSP_create_fftsetup(log2n, Int32(kFFTRadix2))

    // Create the complex split value to hold the output of the transform
    // why doesn't this work?
    var realp = [Float](count: bufferSize/2, repeatedValue: 0)
    var imagp = [Float](count: bufferSize/2, repeatedValue: 0)
    var output = DSPSplitComplex(realp: &realp, imagp: &imagp)
    vDSP_ctoz(UnsafePointer(channelSamples), 2, &output, 1, UInt(bufferSize / 2))

    // Do the fast Fourier forward transform
    vDSP_fft_zrip(fftSetup, &output, 1, log2n, Int32(FFT_FORWARD))

    // Convert the complex output to magnitude
    var fft = [Float](count:Int(bufferSize / 2), repeatedValue:0.0)
    let bufferOver2: vDSP_Length = vDSP_Length(bufferSize / 2)
    vDSP_zvmags(&output, 1, &fft, 1, bufferOver2)

    var spectrum = [Float]()
    for var i=0; i<bufferSize/2; ++i {
        let imag = output.imagp[i]
        let real = output.realp[i]
        let magnitude = sqrt(pow(real,2)+pow(imag,2))
        spectrum.append(magnitude)
    }

    // Release the setup
    vDSP_destroy_fftsetup(fftSetup)

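Your code has several problems:

  1. You never read the samples from the audio file into the buffer
  2. channelSamples is packed incorrectly
  3. vDSP_fft_zrip is reading beyond the end of the array. It expects 2^log2n samples
  4. The output of vDSP_fft_zrip is packed, while your calculations expect it unpacked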
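
Point 3 comes down to choosing a transform length that is a power of two and no larger than the number of samples you actually have. Here is a minimal sketch of that calculation in plain Swift (current syntax, not the Swift 2 syntax of the question). The helper name `fftLength(forSampleCount:)` is hypothetical and is not part of Accelerate:

```swift
// Hypothetical helper: largest power-of-two FFT length that fits in `count`
// samples, together with its base-2 logarithm (the log2n value vDSP expects).
func fftLength(forSampleCount count: Int) -> (log2n: Int, length: Int)? {
    // A radix-2 real FFT needs at least two samples.
    guard count >= 2 else { return nil }
    // Index of the highest set bit, i.e. floor(log2(count)).
    let log2n = Int.bitWidth - 1 - count.leadingZeroBitCount
    return (log2n, 1 << log2n)
}
```

For example, 1000 samples give `log2n = 9` and a length of 512, so the transform never reads past the end of the buffer. Rounding log2 to the nearest integer would instead give 1024, which is more samples than the buffer holds.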

    let fileURL:NSURL = NSBundle.mainBundle().URLForResource("foo", withExtension: "mp3")!
    let audioFile = try! AVAudioFile(forReading: fileURL)
    let frameCount = UInt32(audioFile.length)
    let buffer = AVAudioPCMBuffer(PCMFormat: audioFile.processingFormat, frameCapacity: frameCount)

    // actually read the samples from the file into the buffer
    do {
        try audioFile.readIntoBuffer(buffer, frameCount:frameCount)
    } catch {
        // a read error is silently ignored here
    }

    // floor rather than round: rounding up could give a power of two larger
    // than frameCount, and the FFT would again read past the end of the buffer
    let log2n = UInt(floor(log2(Double(frameCount))))
    let bufferSizePOT = Int(1 << log2n)

    // Set up the transform
    let fftSetup = vDSP_create_fftsetup(log2n, Int32(kFFTRadix2))

    // create packed real input (first channel only)
    var realp = [Float](count: bufferSizePOT/2, repeatedValue: 0)
    var imagp = [Float](count: bufferSizePOT/2, repeatedValue: 0)
    var output = DSPSplitComplex(realp: &realp, imagp: &imagp)
    vDSP_ctoz(UnsafePointer(buffer.floatChannelData.memory), 2, &output, 1, UInt(bufferSizePOT / 2))

    // Do the fast Fourier forward transform, packed input to packed output
    vDSP_fft_zrip(fftSetup, &output, 1, log2n, Int32(FFT_FORWARD))

    // you can calculate magnitude squared here, with care,
    // as the first result is wrong: in the packed format realp[0] holds the
    // DC term and imagp[0] holds the Nyquist term. Read up on packed formats
    var fft = [Float](count:Int(bufferSizePOT / 2), repeatedValue:0.0)
    let bufferOver2: vDSP_Length = vDSP_Length(bufferSizePOT / 2)
    vDSP_zvmags(&output, 1, &fft, 1, bufferOver2)

    // Release the setup
    vDSP_destroy_fftsetup(fftSetup)
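
To make the "packed" output from point 4 concrete, here is a minimal sketch of unpacking it into magnitudes. It is plain Swift in current syntax, and it assumes `realp` and `imagp` are copies of the two halves of the split-complex buffer after `vDSP_fft_zrip` has run. The function name `unpackedMagnitudes` is hypothetical. Apple's documentation also notes that the real forward transform is scaled by a factor of 2 relative to the textbook DFT, and this sketch leaves that scaling untouched.

```swift
// Hypothetical helper: convert the packed split-complex output of a real FFT
// of length n (realp and imagp each hold n/2 values) into n/2 + 1 magnitudes.
func unpackedMagnitudes(realp: [Float], imagp: [Float]) -> [Float] {
    precondition(realp.count == imagp.count && !realp.isEmpty)
    let half = realp.count
    var magnitudes = [Float](repeating: 0, count: half + 1)
    // Packed format: realp[0] is the DC term and imagp[0] is the Nyquist term.
    // Both are purely real, so they must not be combined into a single bin.
    magnitudes[0] = abs(realp[0])
    magnitudes[half] = abs(imagp[0])
    // The remaining bins are ordinary complex values.
    for k in 1..<half {
        magnitudes[k] = (realp[k] * realp[k] + imagp[k] * imagp[k]).squareRoot()
    }
    return magnitudes
}

let mags = unpackedMagnitudes(realp: [3, 3], imagp: [-1, 4])
// mags is [3.0, 5.0, 1.0]: DC, one complex bin, Nyquist
```

Treating `realp[0]` and `imagp[0]` as one complex number, as `vDSP_zvmags` does on the raw packed buffer, is why the first value it returns cannot be used as is.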