Onnx Machine Learning in Production
By Justin

I recently had a project where I needed to use PyInstaller along with a Keras-trained model. Unfortunately, PyInstaller and Keras only work together some of the time, which makes for an unreliable build.
What to do?
Onnx was the solution.
My project didn't need to run training; it just needed to run inference (prediction), and that's something onnx really excels at.
So, what is onnx?
On onnx.ai, it says: "ONNX is an open format built to represent machine learning models." I read this as: onnx is essentially a file format for ML models, and a runtime such as onnxruntime can run those models.
A model from almost any modern ML framework can be exported to onnx. PyTorch -- yep. Tensorflow -- yep. Keras -- yep. Caffe2 -- yep. Scikit-learn -- yep.
Why is this important? Interoperability.
Let's say I have a Flask web app in production serving an ML model, which is easy enough to do. Let's say I originally created the model in Keras. Then a new member who is a pro at PyTorch joins my team. How do I deploy both models in the same project?
Well, you could package up both tensorflow and PyTorch for running inference, but that starts to turn our simple web app into a bulky one and, more importantly, one that's significantly more difficult to manage. With onnx, both models can be exported to the same format and served by a single runtime.
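As a rough sketch of that idea, a small registry can hand out one inference session per exported model, regardless of which framework trained it. The class name `ModelRegistry` and both file names are hypothetical; the only onnxruntime call assumed is `onnxruntime.InferenceSession(path)`.

```python
class ModelRegistry:
    """Lazily creates one inference session per model file."""

    def __init__(self, paths, loader=None):
        self._paths = dict(paths)  # model name -> path to an .onnx file
        self._loader = loader or self._default_loader
        self._sessions = {}

    @staticmethod
    def _default_loader(path):
        # Imported lazily so nothing is loaded until a model is requested.
        import onnxruntime
        return onnxruntime.InferenceSession(path)

    def get(self, name):
        # Create the session on first use, then reuse it.
        if name not in self._sessions:
            self._sessions[name] = self._loader(self._paths[name])
        return self._sessions[name]


# Hypothetical file names: one model exported from Keras, one from PyTorch.
registry = ModelRegistry({
    "keras_classifier": "keras_model.onnx",
    "pytorch_classifier": "pytorch_model.onnx",
})
```

The web app then only depends on onnxruntime, and calling `registry.get("keras_classifier")` or `registry.get("pytorch_classifier")` looks the same from the caller's side.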
A real-world example
Recently, I was working on a Python project that needed to be compiled into a single executable. For this, I used PyInstaller since it's a very reliable way to turn Python into an executable.
Unfortunately, PyInstaller doesn't play nice with all packages and package types, and containers (like Docker) cannot (as far as I know) be compiled into a single binary.
I was having all kinds of trouble getting PyInstaller and tensorflow to compile correctly so I decided to give onnx a try. Not only did it work, but it worked incredibly reliably.
The post below will show you exactly how to convert a Keras model into an onnx one and then put it into production.
Step 1. Virtual Environment
Whenever you build a Python project, use a virtual environment of some kind. For this guide, I'm solving a real-world problem I had with PyInstaller so I'm going to use the following:
- Python 3.7
- venv (and not my preferred pipenv)
bash
$ cd path/to/your/dev/folder
bash
$ mkdir cfe_onnx
$ cd cfe_onnx
$ python3.7 -m venv .
Activate
Mac/Linux
bash
source bin/activate
Windows
bash
.\Scripts\activate
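To confirm the activation worked, a minimal sketch (assuming the standard-library venv used above) can compare `sys.prefix` with `sys.base_prefix` from inside Python. The helper name `in_virtualenv` is made up for this example.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

If the first line prints `False`, the packages in the next step would be installed into the global interpreter rather than the project folder.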
Step 2. Installations
bash
pip install tensorflow keras2onnx onnxruntime numpy pillow
- tensorflow: our machine learning framework (but using tf.keras which is built-in to tensorflow now)
- keras2onnx: our conversion package
- onnxruntime: how we run inference on onnx models in production
- numpy: numerical python; common for dealing with arrays and matrices in Python & ML Projects
- pillow: the maintained fork of the Python Imaging Library (PIL), which makes it easy to open images within Python.
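A quick way to check that the install succeeded is to ask Python whether each module can be found. This is only a sketch: the helper `missing_modules` is a hypothetical name, and the list uses import names, which can differ from the pip names (pillow is imported as `PIL`).

```python
import importlib.util

# Import names for the packages installed above.
REQUIRED_MODULES = ["tensorflow", "keras2onnx", "onnxruntime", "numpy", "PIL"]


def missing_modules(names):
    """Return the module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All packages found.")
```

This only checks that the modules can be located; it does not import them, so it will not catch a broken install of a package that is present on disk.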
Step 3. Export a Keras Model to an Onnx Model
As outlined in the keras2onnx docs, I'm just going to use a pre-trained Keras model for illustration purposes; change as needed:
python
# tf.keras ships with tensorflow, so no separate keras install is needed
from tensorflow.keras.applications.resnet50 import ResNet50

model = ResNet50(include_top=True, weights='imagenet')
model.save("model.h5")
Now run the conversion
python
import keras2onnx

onnx_model = keras2onnx.convert_keras(model, model.name)
keras2onnx.save_model(onnx_model, 'model.onnx')
After this is done, you will have two saved models:
- model.h5 (keras)
- model.onnx (onnx)
I recommend keeping a Keras-saved model so you can resume training later. An onnx model can be converted back, but I skip that extra step when I don't need it.
Step 4. Prepare for Production
The above model is an image classifier (trained on ImageNet), so we have to be sure to prepare our data before running inference.
Preprocessing
python
# preprocessing.py
import numpy as np
from PIL import Image


def process_image(image_path, height=150, width=150):
    '''
    This function opens an image and converts it into a normalized
    array that represents the image.
    '''
    image = Image.open(image_path)
    image = image.convert("RGB")
    new_image = image.resize((width, height))
    np_image = np.asarray(new_image)
    # named min_val / max_val to avoid shadowing the built-in min() and max()
    min_val = np_image.min()
    max_val = np_image.max()
    # normalize to the range 0-1
    np_image = np_image.astype('float32')
    np_image -= min_val
    np_image /= (max_val - min_val)
    # wrap in a list so the model receives a batch of one image
    return [np_image]
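To see what the min-max normalization in `process_image` does without opening an image file, here is a small sketch on a made-up 2x2 array. The helper `min_max_normalize` is hypothetical and adds one thing the function above does not have: a guard for a flat image, where the maximum equals the minimum and the division would otherwise be by zero.

```python
import numpy as np


def min_max_normalize(arr):
    """Scale an array to the range 0-1, returning zeros for a flat input."""
    arr = np.asarray(arr, dtype="float32")
    lo, hi = arr.min(), arr.max()
    if hi == lo:
        # A single-color image has no range to scale by.
        return np.zeros_like(arr)
    return (arr - lo) / (hi - lo)


# Made-up pixel values standing in for a real image.
pixels = np.array([[0, 64], [128, 255]], dtype="uint8")
scaled = min_max_normalize(pixels)
print(scaled.min(), scaled.max(), scaled.dtype)
```

Whether a flat image should map to zeros or be rejected is a choice for your project; the sketch just makes the edge case visible.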
Response Encoding
Below is a json encoder that converts numpy data types.
python
# encoding.py
import numpy as np
import json


class NumpyEncoder(json.JSONEncoder):
    """ Special json encoder for numpy types """
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)
Here are a couple of ways to use this encoder:
With json.dumps
python
data = {"preds": np.array([0.87, 0.13])}
json.dumps(data, cls=NumpyEncoder)
In Flask
python
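import numpy as np
from flask import Flask, jsonify
# NumpyEncoder is the class from encoding.py above

app = Flask(__name__)
# app.json_encoder works on Flask releases before 2.3, where it was removed
app.json_encoder = NumpyEncoder

@app.route('/numpy')
def get_numpy_response():
    data = {"preds": np.array([0.87, 0.13])}
    return jsonify(data)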
Step 5. Predictions with Onnx
python
# predict.py
import json
import pathlib

import onnxruntime

from .encoding import NumpyEncoder
from .preprocessing import process_image

ONNX_SESSION = None


def get_session():
    # load the model once and reuse the session on later calls
    global ONNX_SESSION
    if ONNX_SESSION is None:
        model_path = str(pathlib.Path("model.onnx"))
        sess = onnxruntime.InferenceSession(model_path)
        ONNX_SESSION = sess
    return ONNX_SESSION


def predict(img_path, use_array=False, *args, **kwargs):
    onnx_sess = get_session()
    sess_inputs = onnx_sess.get_inputs()[0]
    input_name = sess_inputs.name
    # for the Keras export above, the shape is expected to be
    # (batch, height, width, channels)
    shape = sess_inputs.shape
    im = process_image(img_path, height=shape[1], width=shape[2])
    inference_preds = onnx_sess.run(None, {input_name: im})  # this is where inference happens
    results = inference_preds[0][0]
    # map each class index to its score
    data = {str(k): v for k, v in enumerate(results)}
    return json.dumps(data, cls=NumpyEncoder)
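The JSON returned by `predict` contains one score per class, which is a lot to read for a 1000-class model. As a sketch, the hypothetical helper `top_k` below picks out the highest-scoring class indices. The example payload is made up and only stands in for real `predict` output.

```python
import json


def top_k(payload, k=5):
    """Return the k highest-scoring (class_index, score) pairs."""
    scores = json.loads(payload)  # e.g. {"0": 0.05, "1": 0.70, ...}
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(int(index), score) for index, score in ranked[:k]]


# Made-up payload in the same shape as the output of predict().
example = json.dumps({"0": 0.05, "1": 0.70, "2": 0.25})
print(top_k(example, k=2))  # prints [(1, 0.7), (2, 0.25)]
```

Turning a class index into a human-readable label needs a separate index-to-label mapping for your model, which this post does not cover.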
Next Steps
Now, you just need to take all of the above information and turn it into a webapp or add it to a local python project. I'll leave that to you.
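If the local project ends up bundled with PyInstaller, as mine was, the relative path `model.onnx` may not resolve inside the bundle. A common pattern, sketched here under the assumption that the model file was added to the bundle as a data file (for example with PyInstaller's `--add-data` option), is to look it up relative to `sys._MEIPASS`, which PyInstaller's bootloader sets at runtime. The helper name `resource_path` is hypothetical.

```python
import sys
from pathlib import Path


def resource_path(filename):
    """Resolve a data file in development and inside a PyInstaller bundle."""
    # sys._MEIPASS only exists when running from a PyInstaller bundle.
    base = getattr(sys, "_MEIPASS", None)
    if base is None:
        # Assumption: during development the file sits in the working directory.
        base = Path.cwd()
    return str(Path(base) / filename)


model_path = resource_path("model.onnx")
print(model_path)
```

The resulting string could be passed to `onnxruntime.InferenceSession` in place of the hard-coded path.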
Good luck!