CoreML

These are my notes from watching Apple’s WWDC 2017 Session 703, “Introducing Core ML,” and Session 710, “Core ML in Depth.”

Introducing CoreML

Apple Uses

Photos: People & Scene Recognition
Keyboard: Next word prediction, smart responses
Watch: Smart responses & handwriting recognition

Examples:

Real Time Image recognition
Text prediction
Entity recognition
Handwriting recognition
Style transfer
Sentiment analysis
Search ranking
Machine translation
Image captioning
Personalization
Face detection
Emotion detection
Speaker identification
Music tagging
Text summarization

Whoa, that’s a lot of examples.

Example

Recognizing a rose:

Start by color
Try shape
… It gets complicated fast

Rather than describing what a rose looks like programmatically, we will describe a rose empirically.

– Gaurav Kapoor

Two Steps to Machine Learning

Training
Inference

Training a Model

Collect sample data (“roses, sunflowers, lilies”)
Pass through a learning algorithm
Generate a model

Inference

Pass an image into the model
Get back a result and confidence level.
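
To make the two steps concrete for myself, here is a minimal Python sketch. It is not how Core ML trains anything: the learning algorithm is a toy nearest-centroid classifier, and the two-number features (redness, petal roundness), the function names train and predict, and the sample values are all invented for illustration.

```python
import math

def train(samples):
    """Training: average each label's feature vectors into one centroid."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    # The "model" is just the learned centroids.
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(model, features):
    """Inference: return the closest label and a confidence in (0, 1]."""
    # Softmax over negative distance: closer centroids get larger weights.
    weights = {
        label: math.exp(-math.dist(features, centroid))
        for label, centroid in model.items()
    }
    total = sum(weights.values())
    label = max(weights, key=weights.get)
    return label, weights[label] / total

# Hypothetical sample data: (redness, petal roundness) -> flower name.
samples = [
    ((0.9, 0.8), "rose"), ((0.8, 0.7), "rose"),
    ((0.2, 0.3), "sunflower"), ((0.1, 0.2), "sunflower"),
    ((0.5, 0.1), "lily"), ((0.6, 0.2), "lily"),
]

model = train(samples)                            # step 1: training
label, confidence = predict(model, (0.85, 0.75))  # step 2: inference
print(label, round(confidence, 2))
```

Training runs once, ahead of time. Inference is the part that would run on the device for every new input.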

Challenges:

Prove correctness
Performance
Energy Efficiency

Three frameworks: Vision & NLP sit on top of CoreML.

CoreML is built on top of Accelerate and Metal Performance Shaders (MPS).

Domain agnostic
Inputs: Images, text, dictionaries, raw numbers.
Accelerate is good for math functionality.

Advantages to Running Locally

Privacy
Data Cost
No server cost
Always available

Real Time Image Recognition

No latency

Overview

Xcode integrations

Models

A model is a function that “happens to be learned from data.” Each takes an input and gives an output.

Model Types:

Feed Forward Neural Networks (image/video)
Convolutional Neural Networks
Recurrent Neural Networks (text based applications)
Tree Ensembles
Support Vector Machines
Generalized Linear Models

Focus on the use-case and let CoreML handle the details. Models are single documents.

Inputs, types, outputs
Structure of neural network
Training parameters

Where do models come from?

Developer.apple.com has some ready-to-use models.
The machine learning community, through tools such as Caffe, Keras, XGBoost (dmlc), scikit-learn, Turi, and LIBSVM.

To convert models from those tools to the CoreML format, use Apple’s Core ML Tools Python package.

Development Flow

Collect data
Train model
Drag model into Xcode.

Xcode shows the model’s name, file size, author, license, inputs, and outputs. It also automatically generates Swift code for loading the model and making predictions with it.

Model sizes: How does compression work?
Type of file is abstracted.
Strongly typed inputs.

Generated Source

Input, output, and classifier classes.
Offers access to the underlying MLModel for programmatic access.
MLModel has an MLModelDescription and a protocol-based prediction method: it accepts any input that conforms to MLFeatureProvider.
The .mlmodel file format is defined using Protocol Buffers.

Core ML in Depth

CoreML provides a functional abstraction for machine learning models.

Types of CoreML Inputs

Numeric: Double, Int64
Categories: String, Int64
Images: CVPixelBuffer
Arrays: MLMultiArray (New type - why?)
Dictionaries: [String: Double], [Int64: Double]
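
MLMultiArray describes a multi-dimensional array through a shape and a set of strides over one flat buffer. Here is a small Python sketch of that indexing idea. It only illustrates the concept and is not Core ML's implementation; the helper names row_major_strides and flat_offset are my own, and the row-major layout is an assumption.

```python
def row_major_strides(shape):
    """Number of flat-buffer elements to skip per step along each axis."""
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return strides

def flat_offset(index, strides):
    """Map a multi-dimensional index to a position in the flat buffer."""
    return sum(i * s for i, s in zip(index, strides))

shape = [3, 4, 5]                    # for example: channels, height, width
strides = row_major_strides(shape)   # [20, 5, 1]
buffer = [float(n) for n in range(3 * 4 * 5)]

# Element [2, 1, 3] sits at offset 2*20 + 1*5 + 3*1 = 48.
value = buffer[flat_offset([2, 1, 3], strides)]
print(strides, value)
```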

Working With Text

Sentiment analysis example: Takes text, passes it to the model, and the model returns an emoji (happy/ok/sad).

Approach: The model operates on word counts.
NSLinguisticTagger to tokenize and count words
A “Pipeline Classifier” does a few things before returning a prediction. Takes a dictionary, returns a sentiment label, and sentiment scores between 0 and 1.
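
A rough Python sketch of the word-count approach, to check my understanding. It swaps NSLinguisticTagger for a simple regular-expression tokenizer, and the weight table, the label names, and the softmax scoring are made up for illustration. A real pipeline classifier would learn its parameters from data.

```python
import math
import re
from collections import Counter

def word_counts(text):
    """Tokenize into lowercase words and count them (the dictionary input)."""
    return dict(Counter(re.findall(r"[a-z']+", text.lower())))

# Hypothetical per-word weights for each sentiment label.
WEIGHTS = {
    "happy": {"love": 2.0, "great": 1.5, "hate": -2.0},
    "ok": {"fine": 1.0, "okay": 1.0},
    "sad": {"hate": 2.0, "awful": 1.5, "love": -2.0},
}

def classify(counts):
    """Take a word-count dictionary, return (label, scores between 0 and 1)."""
    raw = {
        label: sum(weights.get(word, 0.0) * n for word, n in counts.items())
        for label, weights in WEIGHTS.items()
    }
    # Softmax turns the raw sums into scores that add up to 1.
    total = sum(math.exp(v) for v in raw.values())
    scores = {label: math.exp(v) / total for label, v in raw.items()}
    return max(scores, key=scores.get), scores

counts = word_counts("I love this great, great app")
label, scores = classify(counts)
print(counts, label)
```

The app only deals with the two ends: text goes in as a dictionary of counts, and a label plus per-label scores come out.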