coreml by dpearson2699/swift-ios-skills
npx skills add https://github.com/dpearson2699/swift-ios-skills --skill coreml
Load, configure, and run Core ML models in iOS apps. This skill covers the Swift side: model loading, prediction, MLTensor, profiling, and deployment. Target iOS 26+ with Swift 6.2, backward-compatible to iOS 14 unless noted.
Scope boundary: Python-side model conversion, optimization (quantization, palettization, pruning), and framework selection live in the apple-on-device-ai skill. This skill owns Swift integration only.
See references/coreml-swift-integration.md for complete code patterns including actor-based caching, batch inference, image preprocessing, and testing.
When you drag a .mlpackage or .mlmodelc into Xcode, it generates a Swift class with typed input/output. Use this whenever possible.
import CoreML
let config = MLModelConfiguration()
config.computeUnits = .all
let model = try MyImageClassifier(configuration: config)
Load from a URL when the model is downloaded at runtime or stored outside the bundle.
let modelURL = Bundle.main.url(
forResource: "MyModel", withExtension: "mlmodelc"
)!
let model = try MLModel(contentsOf: modelURL, configuration: config)
Load models without blocking the main thread. Prefer this for large models.
let model = try await MLModel.load(
contentsOf: modelURL,
configuration: config
)
Compile a .mlpackage or .mlmodel to .mlmodelc on device. Useful for models downloaded from a server.
let compiledURL = try await MLModel.compileModel(at: packageURL)
let model = try MLModel(contentsOf: compiledURL, configuration: config)
Cache the compiled URL -- recompiling on every launch wastes time. Copy compiledURL to a persistent location (e.g., Application Support).
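The compile-and-cache step above can be sketched as a small helper. This is a minimal sketch under assumptions: the model name "MyModel", the function name, and the Application Support layout are illustrative, not part of the skill.

```swift
import CoreML
import Foundation

// Compile once, then reuse the cached .mlmodelc on later launches.
func loadCachedModel(packageURL: URL,
                     configuration: MLModelConfiguration) async throws -> MLModel {
    let support = try FileManager.default.url(for: .applicationSupportDirectory,
                                              in: .userDomainMask,
                                              appropriateFor: nil,
                                              create: true)
    let cachedURL = support.appendingPathComponent("MyModel.mlmodelc")
    if !FileManager.default.fileExists(atPath: cachedURL.path) {
        // compileModel(at:) writes to a temporary directory, so copy
        // the result somewhere durable before the system reclaims it.
        let compiledURL = try await MLModel.compileModel(at: packageURL)
        try FileManager.default.copyItem(at: compiledURL, to: cachedURL)
    }
    return try await MLModel.load(contentsOf: cachedURL, configuration: configuration)
}
```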
MLModelConfiguration controls compute units, GPU access, and model parameters.
| Value | Uses | When to Choose |
|---|---|---|
| .all | CPU + GPU + Neural Engine | Default. Let the system decide. |
| .cpuOnly | CPU | Background tasks, audio sessions, or when the GPU is busy. |
| .cpuAndGPU | CPU + GPU | Need the GPU but the model has ops unsupported by the ANE. |
| .cpuAndNeuralEngine | CPU + Neural Engine | Best energy efficiency for compatible models. |
let config = MLModelConfiguration()
config.computeUnits = .cpuAndNeuralEngine

// For low-priority background inference, restrict to CPU
let backgroundConfig = MLModelConfiguration()
backgroundConfig.computeUnits = .cpuOnly
let config = MLModelConfiguration()
config.computeUnits = .all
config.allowLowPrecisionAccumulationOnGPU = true // faster, slight precision loss
The generated class provides typed input/output structs.
let model = try MyImageClassifier(configuration: config)
let input = MyImageClassifierInput(image: pixelBuffer)
let output = try model.prediction(input: input)
print(output.classLabel) // "golden_retriever"
print(output.classLabelProbs) // ["golden_retriever": 0.95, ...]
Use when inputs are dynamic or not known at compile time.
let inputFeatures = try MLDictionaryFeatureProvider(dictionary: [
"image": MLFeatureValue(pixelBuffer: pixelBuffer),
"confidence_threshold": MLFeatureValue(double: 0.5),
])
let output = try model.prediction(from: inputFeatures)
let label = output.featureValue(for: "classLabel")?.stringValue
let output = try await model.prediction(from: inputFeatures)
Process multiple inputs in one call for better throughput.
let batchInputs = try MLArrayBatchProvider(array: inputs.map { input in
try MLDictionaryFeatureProvider(dictionary: ["image": MLFeatureValue(pixelBuffer: input)])
})
let batchOutput = try model.predictions(fromBatch: batchInputs)
for i in 0..<batchOutput.count {
let result = batchOutput.features(at: i)
print(result.featureValue(for: "classLabel")?.stringValue ?? "unknown")
}
Use MLState for models that maintain state across predictions (sequence models, LLMs, audio accumulators). Create state once and pass it to each prediction call.
let state = model.makeState()
// Each prediction carries forward the internal model state
for frame in audioFrames {
let input = try MLDictionaryFeatureProvider(dictionary: [
"audio_features": MLFeatureValue(multiArray: frame)
])
let output = try await model.prediction(from: input, using: state)
let classification = output.featureValue(for: "label")?.stringValue
}
State is not Sendable -- use it from a single actor or task. Call model.makeState() to create independent state for concurrent streams.
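The per-stream isolation rule can be sketched as follows; the model, feature names, and function name mirror the audio example above and are assumptions for illustration.

```swift
import CoreML

// Each call creates its own MLState, so two concurrent streams never
// share the non-Sendable state across isolation boundaries.
func classifyStream(_ frames: [MLMultiArray],
                    with model: MLModel) async throws -> [String] {
    let state = model.makeState()   // fresh, stream-local state
    var labels: [String] = []
    for frame in frames {
        let input = try MLDictionaryFeatureProvider(dictionary: [
            "audio_features": MLFeatureValue(multiArray: frame)
        ])
        let output = try await model.prediction(from: input, using: state)
        labels.append(output.featureValue(for: "label")?.stringValue ?? "unknown")
    }
    return labels
}
```

Launching `classifyStream` once per audio source (e.g., via two child tasks) gives each stream an independent state with no synchronization needed.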
MLTensor is a Swift-native multidimensional array for pre/post-processing. Operations run lazily -- call .shapedArray(of:) to materialize results.
import CoreML
// Creation
let tensor = MLTensor([1.0, 2.0, 3.0, 4.0])
let zeros = MLTensor(zeros: [3, 224, 224], scalarType: Float.self)
// Reshaping
let reshaped = tensor.reshaped(to: [2, 2])
// Math operations
let softmaxed = tensor.softmax()
let normalized = (tensor - tensor.mean()) / tensor.standardDeviation()
// Interop with MLMultiArray
let multiArray = try MLMultiArray([1.0, 2.0, 3.0, 4.0])
let fromMultiArray = MLTensor(multiArray)
let backToArray = await tensor.shapedArray(of: Float.self) // async: materializes the lazy ops
MLMultiArray is the primary data exchange type for non-image model inputs and outputs. Use it when the auto-generated class expects array-type features.
// Create a 3D array: [batch, sequence, features]
let array = try MLMultiArray(shape: [1, 128, 768], dataType: .float32)
// Write values
for i in 0..<128 {
array[[0, i, 0] as [NSNumber]] = NSNumber(value: Float(i))
}
// Read values
let value = array[[0, 0, 0] as [NSNumber]].floatValue
// Create from a data pointer for zero-copy interop. Allocate the buffer
// manually so it outlives the array; the deallocator frees it.
let values: [Float] = [1.0, 2.0, 3.0]
let pointer = UnsafeMutableRawPointer.allocate(
    byteCount: values.count * MemoryLayout<Float>.stride,
    alignment: MemoryLayout<Float>.alignment
)
pointer.initializeMemory(as: Float.self, from: values, count: values.count)
let fromData = try MLMultiArray(dataPointer: pointer,
                                shape: [3],
                                dataType: .float32,
                                strides: [1],
                                deallocator: { $0.deallocate() })
See references/coreml-swift-integration.md for advanced MLMultiArray patterns including NLP tokenization and audio feature extraction.
Image models expect CVPixelBuffer input. Use CGImage conversion for photos from the camera or photo library. Vision's VNCoreMLRequest handles this automatically; manual conversion is needed only for direct MLModel prediction.
import CoreVideo
func createPixelBuffer(from cgImage: CGImage, width: Int, height: Int) -> CVPixelBuffer? {
var pixelBuffer: CVPixelBuffer?
let attrs: [CFString: Any] = [
kCVPixelBufferCGImageCompatibilityKey: true,
kCVPixelBufferCGBitmapContextCompatibilityKey: true,
]
CVPixelBufferCreate(kCFAllocatorDefault, width, height,
kCVPixelFormatType_32ARGB, attrs as CFDictionary, &pixelBuffer)
guard let buffer = pixelBuffer else { return nil }
CVPixelBufferLockBaseAddress(buffer, [])
let context = CGContext(
data: CVPixelBufferGetBaseAddress(buffer),
width: width, height: height,
bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(buffer),
space: CGColorSpaceCreateDeviceRGB(),
bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue
)
context?.draw(cgImage, in: CGRect(x: 0, y: 0, width: width, height: height))
CVPixelBufferUnlockBaseAddress(buffer, [])
return buffer
}
For additional preprocessing patterns (normalization, center-cropping), see references/coreml-swift-integration.md.
Chain models when preprocessing or postprocessing requires a separate model.
// Sequential inference: preprocessor -> main model -> postprocessor
let preprocessed = try preprocessor.prediction(from: rawInput)
let mainOutput = try mainModel.prediction(from: preprocessed)
let finalOutput = try postprocessor.prediction(from: mainOutput)
For Xcode-managed pipelines, use the pipeline model type in the .mlpackage. Each sub-model runs on its optimal compute unit.
Use Vision to run Core ML image models with automatic image preprocessing (resizing, normalization, color space, orientation).
import Vision
import CoreML
let model = try MLModel(contentsOf: modelURL, configuration: config)
let container = try CoreMLModelContainer(model: model)
let request = CoreMLRequest(model: container)
let results = try await request.perform(on: cgImage)
if let classification = results.first as? ClassificationObservation {
print("\(classification.identifier): \(classification.confidence)")
}
let vnModel = try VNCoreMLModel(for: model)
let request = VNCoreMLRequest(model: vnModel) { request, error in
guard let results = request.results as? [VNRecognizedObjectObservation] else { return }
for observation in results {
let label = observation.labels.first?.identifier ?? "unknown"
let confidence = observation.labels.first?.confidence ?? 0
let boundingBox = observation.boundingBox // normalized coordinates
print("\(label): \(confidence) at \(boundingBox)")
}
}
request.imageCropAndScaleOption = .scaleFill
let handler = VNImageRequestHandler(cvPixelBuffer: pixelBuffer)
try handler.perform([request])
For complete Vision framework patterns (text recognition, barcode detection, document scanning), see the vision-framework skill.
Inspect which compute device each operation will use before running predictions.
let computePlan = try await MLComputePlan.load(
contentsOf: modelURL, configuration: config
)
guard case let .program(program) = computePlan.modelStructure else { return }
guard let mainFunction = program.functions["main"] else { return }
for operation in mainFunction.block.operations {
    let deviceUsage = computePlan.deviceUsage(for: operation)
    let cost = computePlan.estimatedCost(of: operation)
    let device = deviceUsage.map { "\($0.preferredComputeDevice)" } ?? "unknown"
    print("\(operation.operatorName): \(device), cost: \(cost?.weight ?? 0)")
}
Use the Core ML instrument template in Instruments to profile model load time, prediction latency, and per-compute-unit activity.
Run outside the debugger for accurate results (Xcode: Product > Profile).
| Strategy | Pros | Cons |
|---|---|---|
| Bundle in app | Instant availability, works offline | Increases app download size |
| On-demand resources | Smaller initial download | Requires download before first use |
| Background Assets (iOS 16+) | Downloads ahead of time | More complex setup |
| CloudKit / server | Maximum flexibility | Requires network, longer setup |
- App Store limit: 4 GB for the app bundle
- Cellular download limit: 200 MB (can request an exception)
- Use ODR tags for models > 50 MB
- Pre-compile to .mlmodelc to skip on-device compilation
// On-demand resource loading
let request = NSBundleResourceRequest(tags: ["ml-model-v2"])
try await request.beginAccessingResources()
let modelURL = Bundle.main.url(forResource: "LargeModel", withExtension: "mlmodelc")!
let model = try await MLModel.load(contentsOf: modelURL, configuration: config)
// Call request.endAccessingResources() when done
- Use .cpuOnly for background tasks: background processing cannot use the GPU or ANE, and setting .cpuOnly avoids silent fallback and resource contention.
- Reuse MLModel instances from the same compiled model. Use an actor to provide shared access.
- Observe UIApplication.didReceiveMemoryWarningNotification and release cached models when under memory pressure.
- See references/coreml-swift-integration.md for an actor-based model manager with lifecycle-aware loading and cache eviction.
DON'T: Load models on the main thread. DO: Use MLModel.load(contentsOf:configuration:) async API or load on a background actor. Why: Large models can take seconds to load, freezing the UI.
DON'T: Recompile .mlpackage to .mlmodelc on every app launch. DO: Compile once with MLModel.compileModel(at:) and cache the compiled URL persistently. Why: Compilation is expensive. Cache the .mlmodelc in Application Support.
DON'T: Hardcode .cpuOnly unless you have a specific reason. DO: Use .all and let the system choose the optimal compute unit. Why: .all enables Neural Engine and GPU, which are faster and more energy-efficient.
DON'T: Ignore MLFeatureValue type mismatches between input and model expectations. DO: Match types exactly -- use MLFeatureValue(pixelBuffer:) for images, not raw data. Why: Type mismatches cause cryptic runtime crashes or silent incorrect results.
DON'T: Create a new MLModel instance for every prediction. DO: Load once and reuse. Use an actor to manage the model lifecycle. Why: Model loading allocates significant memory and compute resources.
DON'T: Skip error handling for model loading and prediction. DO: Catch errors and provide fallback behavior when the model fails. Why: Models can fail to load on older devices or when resources are constrained.
DON'T: Assume all operations run on the Neural Engine. DO: Use MLComputePlan (iOS 17.4+) to verify device dispatch per operation. Why: Unsupported operations fall back to CPU, which may bottleneck the pipeline.
DON'T: Process images manually before passing to Vision + Core ML. DO: Use CoreMLRequest (iOS 18+) or VNCoreMLRequest (legacy) to let Vision handle preprocessing. Why: Vision handles orientation, scaling, and pixel format conversion correctly.
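The error-handling rule above can be sketched as a load-with-fallback helper. The model name "MyModel" and the nil-return fallback are illustrative assumptions, not part of the skill's API.

```swift
import CoreML

// Return nil instead of crashing when the model is missing or fails to load.
func loadClassifierIfAvailable(configuration: MLModelConfiguration) async -> MLModel? {
    guard let url = Bundle.main.url(forResource: "MyModel",
                                    withExtension: "mlmodelc") else {
        return nil   // model missing from the bundle
    }
    do {
        return try await MLModel.load(contentsOf: url, configuration: configuration)
    } catch {
        // Loading can fail on older devices or under memory pressure;
        // degrade gracefully (e.g., disable the ML-powered feature).
        print("Model load failed: \(error)")
        return nil
    }
}
```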
Checklist:
- MLModelConfiguration.computeUnits set appropriately for the use case
- Vision (CoreMLRequest on iOS 18+, or VNCoreMLRequest) used for correct preprocessing
- MLComputePlan checked to verify compute device dispatch (iOS 17.4+)
- Complete patterns reviewed in references/coreml-swift-integration.md
- Conversion and optimization delegated to the apple-on-device-ai skill