A key avenue for deploying deep learning models is the mobile device. Running a model inside the mobile app, rather than sending data to the cloud, reduces latency and helps ensure data privacy for users. Despite the variety of deep learning libraries and AI tools, successfully embedding a deep learning model into a mobile app can be challenging. Anirudh Koul, head of AI and research at Aira, explained the step-by-step process for deploying convolutional neural networks (ConvNets) in mobile applications during his lecture at ODSC West 2018.
[Related Article: Machine Learning Approaches to Mobile Sensing Data to Make Self-Driving Cars Safer]
Anirudh walked through the process of deploying a ConvNet application under several time budgets. With minimal time, say an hour, the simplest option is a cloud API hosted by Google, Amazon, or Microsoft. Such an API lets a user upload a photo and get predictions back; no modeling is required. Anirudh breaks down the performance of each company's cloud API in the table below.
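To make the cloud-API route concrete, here is a minimal sketch of a label-detection request against the Google Cloud Vision REST endpoint; the API key and image file name are placeholders, and the Amazon and Microsoft APIs follow a broadly similar request/response pattern.

```python
import base64
import requests

API_KEY = "YOUR_API_KEY"  # placeholder -- obtain a key from Google Cloud
ENDPOINT = f"https://vision.googleapis.com/v1/images:annotate?key={API_KEY}"

# Base64-encode a local photo (placeholder file name).
with open("photo.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

# Ask the Vision API for the top five labels in the image.
body = {
    "requests": [{
        "image": {"content": image_b64},
        "features": [{"type": "LABEL_DETECTION", "maxResults": 5}],
    }]
}

response = requests.post(ENDPOINT, json=body)
for label in response.json()["responses"][0]["labelAnnotations"]:
    print(label["description"], round(label["score"], 3))
```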
As mentioned earlier, cloud APIs suffer from the latency of sending data to the cloud. With a day to deploy the app, one can instead use a pre-trained model like ResNet50 or Inception with no training necessary. This approach, however, is restricted to the 1,000 ImageNet classes those pre-trained models were designed to predict. Because all predictions are made locally, one needs a platform that can run the pre-trained model on a mobile phone. Anirudh recommends CoreML, TensorFlow Lite, and Caffe2 for this purpose; in general, he suggests training a Keras model and deploying it via CoreML for iOS apps or TensorFlow Lite for Android apps.
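As a sketch of this route, the snippet below loads a pre-trained ResNet50 in Keras, classifies a local image (the file name is a placeholder), and converts the model for on-device use with the TensorFlow Lite converter; coremltools plays the equivalent role for CoreML on iOS.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import (
    ResNet50, preprocess_input, decode_predictions)
from tensorflow.keras.preprocessing import image

# Load ResNet50 with ImageNet weights -- no training required.
model = ResNet50(weights="imagenet")

# Classify a local photo (placeholder file name).
img = image.load_img("photo.jpg", target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
print(decode_predictions(model.predict(x), top=3)[0])

# Convert the Keras model to TensorFlow Lite for Android deployment.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open("resnet50.tflite", "wb") as f:
    f.write(converter.convert())
```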
To make predictions specific to one's own task, say distinguishing between dog breeds, one needs to train a ConvNet on images from that task. Training ConvNets requires a substantial amount of computation time and experimental trials. A ConvNet can be built with varying degrees of customization, from retraining only the last layer of a pre-trained model to training a full ConvNet with a custom architecture (i.e., model structure); the sketch below illustrates the lightest of these options. Anirudh indicated that the amount of training necessary depends on how similar one's dataset is to the dataset the pre-trained model was trained on, and on the size of one's dataset (details below).
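Here is a minimal transfer-learning sketch of that lightest option, retraining only the classification head on top of a frozen pre-trained base; the choice of MobileNetV2, the class count, and the dataset names are illustrative assumptions, not from the talk.

```python
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

NUM_CLASSES = 120  # e.g., number of dog breeds -- illustrative

# Pre-trained feature extractor with its ImageNet top removed.
base = MobileNetV2(weights="imagenet", include_top=False,
                   input_shape=(224, 224, 3))
base.trainable = False  # freeze all pre-trained layers

# New classification head -- the only part that gets trained.
x = GlobalAveragePooling2D()(base.output)
outputs = Dense(NUM_CLASSES, activation="softmax")(x)
model = Model(base.input, outputs)

model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=5)  # placeholder datasets
```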
Processing speed is a key factor in deploying ConvNets in real time on mobile devices, and some ConvNet architectures classify far more efficiently than others. For example, Google's Inception-V3 offers higher accuracy on ImageNet than older architectures such as VGG while requiring substantially less computation. It is worth shopping around for a state-of-the-art model whose training task resembles one's own. Anirudh also mentioned a compelling technique for automating the search for efficient ConvNet architectures, called Platform-Aware Neural Architecture Search, which uses reinforcement learning with a reward that penalizes models for high computation time.
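Since on-device latency ultimately decides whether a model is usable in real time, a quick way to compare candidate architectures is to benchmark their converted models directly. Below is a minimal sketch using the TensorFlow Lite interpreter on a desktop as a rough proxy (the model file name is a placeholder); true latency should still be measured on the target phone.

```python
import time
import numpy as np
import tensorflow as tf

# Load a converted model (placeholder file name).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_info = interpreter.get_input_details()[0]

# Random input matching the model's expected shape and dtype.
dummy = np.random.random_sample(tuple(input_info["shape"])).astype(
    input_info["dtype"])
interpreter.set_tensor(input_info["index"], dummy)
interpreter.invoke()  # warm-up run

runs = 50
start = time.perf_counter()
for _ in range(runs):
    interpreter.invoke()
elapsed = time.perf_counter() - start
print(f"Mean inference time: {1000 * elapsed / runs:.1f} ms")
```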
[Related Article: How to Leverage Pre-Trained Layers in Image Classification]
Key takeaways:
- Two major components are necessary for a given deep learning application: the deep learning model and a platform for deployment.
- CoreML and TensorFlow Lite offer platforms that require minimal programming and directly support common machine learning libraries.
- Careful engineering of the model architecture is required for efficient predictions with minimal latency.
- The amount of training required depends on the similarity of one’s task to that of pre-trained models, and the size of one’s dataset.