Converting an image to a CVPixelBuffer for machine learning in Swift
I'm trying to get the Apple sample Core ML models that were demonstrated at WWDC 2017 working. I'm using GoogLeNet to try to classify an image (see the Apple Machine Learning page). The model takes a CVPixelBuffer as input. I have an image named imageSample.jpg that I'm using for this demo. My code is below:
var sample = UIImage(named: "imageSample")?.cgImage
let bufferThree = getCVPixelBuffer(sample!)

let model = GoogLeNetPlaces()
guard let output = try? model.prediction(input: GoogLeNetPlacesInput.init(sceneImage: bufferThree!)) else {
    fatalError("Unexpected runtime error.")
}

print(output.sceneLabel)
I always get the unexpected runtime error in the output rather than an image classification. My code to convert the image is below:
func getCVPixelBuffer(_ image: CGImage) -> CVPixelBuffer? {
    let imageWidth = Int(image.width)
    let imageHeight = Int(image.height)

    let attributes: [NSObject: AnyObject] = [
        kCVPixelBufferCGImageCompatibilityKey: true as AnyObject,
        kCVPixelBufferCGBitmapContextCompatibilityKey: true as AnyObject
    ]

    var pxbuffer: CVPixelBuffer? = nil
    CVPixelBufferCreate(kCFAllocatorDefault,
                        imageWidth,
                        imageHeight,
                        kCVPixelFormatType_32ARGB,
                        attributes as CFDictionary?,
                        &pxbuffer)

    if let _pxbuffer = pxbuffer {
        let flags = CVPixelBufferLockFlags(rawValue: 0)
        CVPixelBufferLockBaseAddress(_pxbuffer, flags)
        let pxdata = CVPixelBufferGetBaseAddress(_pxbuffer)

        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        let context = CGContext(data: pxdata,
                                width: imageWidth,
                                height: imageHeight,
                                bitsPerComponent: 8,
                                bytesPerRow: CVPixelBufferGetBytesPerRow(_pxbuffer),
                                space: rgbColorSpace,
                                bitmapInfo: CGImageAlphaInfo.premultipliedFirst.rawValue)

        if let _context = context {
            _context.draw(image, in: CGRect(x: 0, y: 0, width: imageWidth, height: imageHeight))
        } else {
            CVPixelBufferUnlockBaseAddress(_pxbuffer, flags)
            return nil
        }

        CVPixelBufferUnlockBaseAddress(_pxbuffer, flags)
        return _pxbuffer
    }

    return nil
}
I got this code from a previous Stack Overflow post (last answer here). I realize the code may not be correct, but I don't know how to do this myself. I believe this is the part that contains the error. The model expects input of the following type: Image<RGB, 224, 224>
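One likely cause of the runtime error is that the source image is not the 224 x 224 size the model declares for its input. As a rough check, a minimal sketch like the one below scales the CGImage before it is passed to getCVPixelBuffer (resize224(_:) is a hypothetical helper added here for illustration, not part of the original code):

import CoreGraphics

// A minimal sketch, assuming the source image should simply be scaled to the
// model's 224 x 224 input before conversion. resize224(_:) is a hypothetical
// helper name, not part of the question's code.
func resize224(_ image: CGImage) -> CGImage? {
    let side = 224
    guard let context = CGContext(data: nil,
                                  width: side,
                                  height: side,
                                  bitsPerComponent: 8,
                                  bytesPerRow: 0,
                                  space: CGColorSpaceCreateDeviceRGB(),
                                  bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
        return nil
    }
    context.interpolationQuality = .high
    // Drawing into a 224 x 224 context rescales (and stretches) the image.
    context.draw(image, in: CGRect(x: 0, y: 0, width: side, height: side))
    return context.makeImage()
}

// Usage: scale first, then convert.
// let resized = resize224(sample!)
// let bufferThree = getCVPixelBuffer(resized!)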
You don't need to do a bunch of image mangling yourself to use a Core ML model with an image; the new Vision framework can do that for you.
import Vision
import CoreML

let model = try VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
let request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
let handler = VNImageRequestHandler(url: myImageURL)
try handler.perform([request])

func myResultsMethod(request: VNRequest, error: Error?) {
    guard let results = request.results as? [VNClassificationObservation] else {
        fatalError("huh")
    }
    for classification in results {
        print(classification.identifier, // the scene label
              classification.confidence)
    }
}
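If the image is already in memory rather than at a URL, VNImageRequestHandler can also be created from a CGImage or a CVPixelBuffer, so no manual pixel-buffer conversion is needed. A minimal sketch, reusing the model class and myResultsMethod from above (classify(_:) is just an illustrative wrapper name):

import UIKit
import Vision
import CoreML

// Sketch: run the same VNCoreMLRequest against an in-memory UIImage.
func classify(_ image: UIImage) throws {
    guard let cgImage = image.cgImage else { return }
    // Vision accepts a CGImage directly and handles scaling/cropping itself.
    let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
    let model = try VNCoreMLModel(for: MyCoreMLGeneratedModelClass().model)
    let request = VNCoreMLRequest(model: model, completionHandler: myResultsMethod)
    try handler.perform([request])
}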
The WWDC17 session on Vision, coming tomorrow afternoon, should have more info.
You can use pure Core ML, but you should resize the image to (224, 224) first:
DispatchQueue.global(qos: .userInitiated).async {
    // Resnet50 expects an image 224 x 224, so we should resize and crop the source image
    let inputImageSize: CGFloat = 224.0
    let minLen = min(image.size.width, image.size.height)
    let resizedImage = image.resize(to: CGSize(width: inputImageSize * image.size.width / minLen,
                                               height: inputImageSize * image.size.height / minLen))
    let cropedToSquareImage = resizedImage.cropToSquare()

    guard let pixelBuffer = cropedToSquareImage?.pixelBuffer() else {
        fatalError()
    }

    guard let classifierOutput = try? self.classifier.prediction(image: pixelBuffer) else {
        fatalError()
    }

    DispatchQueue.main.async {
        self.title = classifierOutput.classLabel
    }
}

// ...

extension UIImage {

    func resize(to newSize: CGSize) -> UIImage {
        UIGraphicsBeginImageContextWithOptions(CGSize(width: newSize.width, height: newSize.height), true, 1.0)
        self.draw(in: CGRect(x: 0, y: 0, width: newSize.width, height: newSize.height))
        let resizedImage = UIGraphicsGetImageFromCurrentImageContext()!
        UIGraphicsEndImageContext()

        return resizedImage
    }

    func cropToSquare() -> UIImage? {
        guard let cgImage = self.cgImage else {
            return nil
        }
        var imageHeight = self.size.height
        var imageWidth = self.size.width

        if imageHeight > imageWidth {
            imageHeight = imageWidth
        } else {
            imageWidth = imageHeight
        }

        let size = CGSize(width: imageWidth, height: imageHeight)

        let x = ((CGFloat(cgImage.width) - size.width) / 2).rounded()
        let y = ((CGFloat(cgImage.height) - size.height) / 2).rounded()

        let cropRect = CGRect(x: x, y: y, width: size.height, height: size.width)
        if let croppedCgImage = cgImage.cropping(to: cropRect) {
            return UIImage(cgImage: croppedCgImage, scale: 0, orientation: self.imageOrientation)
        }

        return nil
    }

    func pixelBuffer() -> CVPixelBuffer? {
        let width = self.size.width
        let height = self.size.height

        let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
                     kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] as CFDictionary
        var pixelBuffer: CVPixelBuffer?
        let status = CVPixelBufferCreate(kCFAllocatorDefault,
                                         Int(width),
                                         Int(height),
                                         kCVPixelFormatType_32ARGB,
                                         attrs,
                                         &pixelBuffer)

        guard let resultPixelBuffer = pixelBuffer, status == kCVReturnSuccess else {
            return nil
        }

        CVPixelBufferLockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
        let pixelData = CVPixelBufferGetBaseAddress(resultPixelBuffer)

        let rgbColorSpace = CGColorSpaceCreateDeviceRGB()
        guard let context = CGContext(data: pixelData,
                                      width: Int(width),
                                      height: Int(height),
                                      bitsPerComponent: 8,
                                      bytesPerRow: CVPixelBufferGetBytesPerRow(resultPixelBuffer),
                                      space: rgbColorSpace,
                                      bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else {
            return nil
        }

        context.translateBy(x: 0, y: height)
        context.scaleBy(x: 1.0, y: -1.0)

        UIGraphicsPushContext(context)
        self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
        UIGraphicsPopContext()

        CVPixelBufferUnlockBaseAddress(resultPixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

        return resultPixelBuffer
    }
}
You can find the expected input image size in the .mlmodel file.
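If you prefer to read that size in code rather than in Xcode, the MLModel's modelDescription exposes the image input constraint. A minimal sketch, assuming the auto-generated Resnet50 class and an input feature named "image" (adjust both for your own model):

import CoreML

// Sketch: inspect the model's expected input image size at runtime.
// "Resnet50" and the feature name "image" are assumptions based on the
// Resnet50-style model used above; substitute your own model's names.
let mlModel = Resnet50().model
if let imageConstraint = mlModel.modelDescription
    .inputDescriptionsByName["image"]?
    .imageConstraint {
    print("Expected input size:",
          imageConstraint.pixelsWide, "x",
          imageConstraint.pixelsHigh)
}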
A demo project that uses both the pure Core ML and the Vision variants can be found here: https://github.com/handsomecode/iOS11-Demos/tree/coreml_vision/CoreML/CoreMLDemo