Making ML Models Production-ready With “modelkit” — Our MLOps Python Library | by Cyril Le Mat | Apr, 2022


We have open-sourced modelkit, a Python MLOps framework meant to make ML models reusable, robust, performant, and easy to deploy in all kinds of environments (cf. its story here).

In this tutorial, we will illustrate the power of modelkit with a common NLP task: sentiment analysis.

Here is the plan:

  1. Implementing a Tokenizer leveraging spaCy
  2. Implementing a Vectorizer leveraging Scikit-Learn
  3. Building a Classifier leveraging Keras to predict whether a review is negative or positive
  4. Exploring modelkit’s features to help make our model production-ready

Before we start, please install the following:
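The original install commands were not captured here; a typical setup for this tutorial (exact package versions are an assumption) would be:

```shell
pip install modelkit spacy scikit-learn tensorflow
python -m spacy download en_core_web_sm
```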

A first model

In this section, let’s cover the basics of modelkit’s API and use spaCy as the tokenizer for our NLP pipeline.

Let’s build a first modelkit.Model:

The _predict method is straightforward: it implements the inference logic.

The _load method is called at object instantiation. Its goal is to load/compute any asset, artifact, or other complex object needed by the model, for which modelkit offers features such as lazy loading and dependency management.

A complete model

Now that we understand the basics, let’s write a more advanced version of this model:

Let’s look at what we added:

  • Batching: we implemented a _predict_batch method to process a list of inputs at once, in order to leverage vectorization for speedups (in this example, tokenizing batches of data is about twice as fast).
  • Testing: we added test cases alongside the Model class definition to ensure that it behaves as intended (yes, we are compatible with pytest). Tests here are meant to be simple checks and also serve as documentation.
  • Input and output specification: modelkit allows you to define the expected input and output types of your model. By subclassing Model[input_type, output_type], calls are validated, ensuring consistency between calls, dependencies, and services, and raising alerts when models are not called as expected. This also serves as documentation, makes it clear how to use a given model, and enables static type checking (e.g. with mypy) during development.

We will now create a vectorizer and illustrate modelkit’s “assets” concept. Here we train a TF-IDF vectorizer using scikit-learn and store its vocabulary locally:
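The training snippet is missing here; a runnable sketch (the toy corpus and the vocabulary.txt filename are assumptions, the full tutorial fits on the IMDB reviews) could be:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# in the full tutorial the corpus is the tokenized IMDB training set;
# a toy corpus stands in here
corpus = [
    "i love this movie",
    "i hate this movie",
    "what a great film",
]
vectorizer = TfidfVectorizer()
vectorizer.fit(corpus)

# persist the fitted vocabulary so it can be shipped as a modelkit asset
with open("vocabulary.txt", "w") as f:
    f.write("\n".join(sorted(vectorizer.vocabulary_.keys())))
```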

The output of the previous code is a vocabulary.txt file. We then create a model and define this file as an “asset”:

We see here that the model loads the asset using self.asset_path. Why is modelkit useful here? It offers several features, such as:

  • remote storage: in real-world scenarios, your assets will not be stored locally but accessed through file stores (e.g. AWS S3, GCS, etc.). modelkit abstracts this connection using environment variables and can retrieve and cache assets on the local disk before _load is called. If you keep your asset files in your current directory for this tutorial, you can simply define export MODELKIT_ASSETS_DIR=.
  • push: modelkit has a CLI to push new assets to an asset store
  • versioning: modelkit can handle the versioning of your assets
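The storage configuration is entirely environment-driven; for instance (the bucket name below is a placeholder):

```shell
# resolve and cache assets in the current directory (tutorial setup)
export MODELKIT_ASSETS_DIR=.

# or point modelkit at a remote object store instead
export MODELKIT_STORAGE_PROVIDER=s3
export MODELKIT_STORAGE_BUCKET=my-assets-bucket
```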

Now let’s train a Keras classifier using the IMDB dataset and our previous models. (As the code is a little long, I only kept the most important part; if you want to run it locally, please go here.)
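A condensed sketch of the training step follows (the architecture, the stand-in data, and the classifier.h5 filename are assumptions; the real script trains on the vectorized IMDB reviews):

```python
import os

import numpy as np
from tensorflow import keras

# stand-in for the TF-IDF matrix of the tokenized IMDB reviews and its labels
vocabulary_size = 100
x_train = np.random.rand(64, vocabulary_size)
y_train = np.random.randint(0, 2, size=(64,))

model = keras.Sequential([
    keras.layers.Input(shape=(vocabulary_size,)),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # P(review is positive)
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1, batch_size=32, verbose=0)

# persist the trained network so it can be shipped as a modelkit asset
model.save("classifier.h5")
saved_ok = os.path.exists("classifier.h5")
```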

Once the model is trained, let’s create a modelkit model that uses the previous output as an asset:

We now have a classifier which is:

  • composed: the classifier loads both the tokenizer and the vectorizer. modelkit makes sure only one instance of each model is created
  • synchronized with a file store: the store is local in this example, but environment variables such as MODELKIT_STORAGE_PROVIDER let you use any storage
  • tested: we have a first level of testing. It is clearly not enough on its own, but it is a great way to document your model
  • fully typed: this is not free, as ensuring typing takes time, but it helps a lot with code clarity and robustness
  • optimized for batches: since we implemented our own batch method, we can easily optimize our code

The goal of modelkit is to support model industrialization. Here are a few ways modelkit helps you in this direction.

Model load description

You can call a library’s describe() method to see information about model load times and memory usage:

predict profiling

A profiler is built in to measure the net duration spent in each sub-model. The following code snippet shows the usage of SimpleProfiler:

which outputs a per-model breakdown of durations and their share of the total prediction time.

caching

You can add caching to a model by adding {"cache_predictions": True} to its configuration and setting the MODELKIT_CACHE_PROVIDER environment variable (native caching via cachetools, or a Redis cache).

FastAPI serving

modelkit models can easily be added to a FastAPI app:

multi-processing

You can extend the previous example to do multi-processing with modelkit: the ModelLibrary ensures that all models are created before the workers are forked (e.g. using gunicorn --preload), which is convenient since all workers share the same model objects without increasing memory usage.
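For instance, assuming the FastAPI app above lives in a module named app.py (the module and worker count are placeholders):

```shell
# load all models once in the master process, then fork the workers,
# which share the already-loaded model objects
gunicorn --preload --workers 4 --worker-class uvicorn.workers.UvicornWorker app:app
```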

asynchronous

This is the context in which modelkit’s async support shines; be sure to use your AsyncModels here:

As we saw in this tutorial, a lot can be done with modelkit. And the more your model library grows, the more these features pay off.

As an illustration, our 10-people team has currently:

  • around 200 models, deeply interconnected (90% of the models are connected to others; our dependency tree goes up to 5 levels deep)
  • around 100 assets (some models use no assets, some use several), all versioned in AWS buckets all over the world (we deploy to many environments)
  • 3 distinct FastAPI services, each serving a subset of around 50% of our models, plus daily use of our models in Spark scripts
A (voluntarily blurred) view of our graph of models and assets

If you want to know more about the project, please try it out and join the Discord to give us your thoughts!

Thanks for the help 😉

PS: Huge thanks to the modelkit team, composed of Victor Benichoux (who wrote most of modelkit), Antoine Jeannot (who built this tutorial for the documentation), Thomas Genin, Quentin Pradet, Louis Deflandre, Lu Lin, and Mathilde Léval.
