Learning Machine Learning with CoreML
This post is my notes collected while watching Apple’s WWDC 2017 Session 703, “Introducing Core ML” and Session 710, “Core ML in Depth.”
Introducing CoreML
Apple Uses
- Photos: People & Scene Recognition
- Keyboard: Next word prediction, smart responses
- Watch: Smart responses & handwriting recognition
Examples:
- Real Time Image recognition
- Text prediction
- Entity recognition
- Handwriting recognition
- Style transfer
- Sentiment analysis
- Search ranking
- Machine translation
- Image captioning
- Personalization
- Face detection
- Emotion detection
- Speaker identification
- Music tagging
- Text summarization
Whoa, that’s a lot of examples.
Example
Recognizing a rose:
- Start by color
- Try shape
- … It gets complicated fast
Rather than describing what a rose looks like programmatically, we will describe a rose empirically.
– Gaurav Kapoor
Two Steps to Machine Learning
- Training
- Inference
Training a Model
- Collect sample data (“roses, sunflowers, lilies”)
- Pass through a learning algorithm
- Generate a model
Inference
- Pass an image into the model
- Get back a result and confidence level.
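In Swift terms, a minimal sketch of that inference step, assuming a hypothetical Xcode-generated class `FlowerClassifier` (model code generation comes up later in these notes):

```swift
import CoreML
import CoreVideo

// A minimal sketch, assuming a hypothetical Xcode-generated class
// `FlowerClassifier` for a model that takes an image and outputs a
// class label plus per-label probabilities.
func classify(_ image: CVPixelBuffer) throws -> (label: String, confidence: Double) {
    let model = FlowerClassifier()
    let output = try model.prediction(image: image)
    // Generated classifier outputs expose the top label and a
    // label-to-probability dictionary.
    let confidence = output.classLabelProbs[output.classLabel] ?? 0
    return (output.classLabel, confidence)
}
```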
Challenges:
- Prove correctness
- Performance
- Energy Efficiency
Three frameworks: Vision & NLP sit on top of CoreML.
CoreML is built on top of Accelerate and MPS.
- Domain agnostic
- Inputs: Images, text, dictionaries, raw numbers.
- Accelerate is good for math functionality.
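Since Vision sits on top of CoreML, here is a hedged sketch of what that layering looks like in code: Vision wraps a Core ML model and handles scaling and cropping the image before CoreML runs inference.

```swift
import CoreML
import Vision

// A minimal sketch: wrap a Core ML image model in Vision, which
// resizes and crops inputs to the model's expected size before inference.
func makeClassificationRequest(for mlModel: MLModel) throws -> VNCoreMLRequest {
    let visionModel = try VNCoreMLModel(for: mlModel)
    return VNCoreMLRequest(model: visionModel) { request, _ in
        guard let observations = request.results as? [VNClassificationObservation],
              let top = observations.first else { return }
        print("\(top.identifier): \(top.confidence)")
    }
}

// Usage against a still image:
// try VNImageRequestHandler(cgImage: cgImage, options: [:]).perform([request])
```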
Advantages to Running Locally
- Privacy
- Data Cost
- No server cost
- Always available
Real Time Image Recognition
- No network latency
Overview
- Xcode integrations
Models
A model is a function that “happens to be learned from data.” Each takes an input and produces an output.
Model Types:
- Feed Forward Neural Networks
- Convolutional Neural Networks (image/video)
- Recurrent Neural Networks (text-based applications)
- Tree Ensembles
- Support Vector Machines
- Generalized Linear Models
Focus on the use case and let CoreML handle the details. A model is a single document (an .mlmodel file) that captures:
- Inputs and outputs, with their types
- Structure of the neural network
- Training parameters
Where do models come from?
- Developer.apple.com has some ready-to-use models.
- The machine learning community:
  - Caffe
  - Keras
  - dmlc XGBoost
  - scikit-learn
  - Turi
  - LIBSVM
To convert trained models from these tools into the Core ML format, use Apple’s coremltools Python package.
Development Flow
- Collect data
- Train model
- Drag model into Xcode.
Xcode shows the model’s name, file size, author, license, inputs, and outputs. It also automatically generates Swift code for loading the model and making predictions against it.
Model sizes: how does compression work?
- The underlying file type is abstracted away.
- Inputs are strongly typed.
Generated Source
- Input, output, and classifier classes.
- Offers access to the underlying `MLModel` for programmatic access.
- `MLModel` has an `MLModelDescription` and a prediction method that accepts any type conforming to the `MLFeatureProvider` protocol.
- The .mlmodel file format itself is based on protocol buffers.
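A minimal sketch of that programmatic path (the model name “Sentiment” and the “wordCounts” feature are hypothetical; `MLModel(contentsOf:)` takes a compiled .mlmodelc URL):

```swift
import CoreML

// Load a compiled model from the app bundle and inspect its description.
// "Sentiment" and "wordCounts" are hypothetical names for illustration.
func loadAndPredict() throws {
    guard let url = Bundle.main.url(forResource: "Sentiment", withExtension: "mlmodelc") else { return }
    let model = try MLModel(contentsOf: url)

    // MLModelDescription lists the strongly typed inputs and outputs.
    for (name, feature) in model.modelDescription.inputDescriptionsByName {
        print("input \(name): \(feature.type)")
    }

    // prediction(from:) accepts anything conforming to MLFeatureProvider;
    // MLDictionaryFeatureProvider is a ready-made concrete implementation.
    let counts = try MLFeatureValue(dictionary: ["great": 2.0, "boring": 1.0])
    let input = try MLDictionaryFeatureProvider(dictionary: ["wordCounts": counts])
    let output = try model.prediction(from: input)
    print(output.featureNames)
}
```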
Core ML in Depth
- CoreML provides a functional abstraction for machine learning models.
Types of CoreML Inputs
- Numeric: `Double`, `Int64`
- Categories: `String`, `Int64`
- Images: `CVPixelBuffer`
- Arrays: `MLMultiArray` (a new type; see the sketch below)
- Dictionaries: `[String: Double]`, `[Int64: Double]`
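`MLMultiArray` appears to be CoreML’s own container for shaped numeric data, i.e. a tensor, rather than nested Swift arrays. A quick sketch:

```swift
import CoreML

// Create a 1x3 array of doubles and fill it by flat index.
func makeMultiArray() throws -> MLMultiArray {
    let array = try MLMultiArray(shape: [1, 3], dataType: .double)
    for i in 0..<array.count {
        array[i] = NSNumber(value: Double(i) * 0.5)
    }
    return array
}
```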
Working With Text
Sentiment analysis example: text is passed to the model, and the model returns an emoji (happy/OK/sad).
- Approach: operate on word counts.
- Use `NSLinguisticTagger` to tokenize and count words (see the sketch below).
- A “Pipeline Classifier” does a few things before returning a prediction: it takes a dictionary of word counts and returns a sentiment label plus sentiment scores between 0 and 1.
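A minimal sketch of the tokenize-and-count step with `NSLinguisticTagger`; the output dictionary matches the `[String: Double]` input type listed above:

```swift
import Foundation

// Tokenize text into words and build the [String: Double] word-count
// dictionary a bag-of-words sentiment model would take as input.
func wordCounts(in text: String) -> [String: Double] {
    let tagger = NSLinguisticTagger(tagSchemes: [.tokenType], options: 0)
    tagger.string = text
    var counts: [String: Double] = [:]
    let range = NSRange(text.startIndex..<text.endIndex, in: text)
    tagger.enumerateTags(in: range,
                         unit: .word,
                         scheme: .tokenType,
                         options: [.omitWhitespace, .omitPunctuation]) { _, tokenRange, _ in
        let word = (text as NSString).substring(with: tokenRange).lowercased()
        counts[word, default: 0] += 1
    }
    return counts
}

// wordCounts(in: "Best. Day. Ever!") == ["best": 1, "day": 1, "ever": 1]
```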