Deploying Models§
Info
Apple users: Most models won't run on Apple Silicon (ARM64) processors. See the Apple Silicon section below for more information.
Check the list of available services to see which ones are compatible.
Deployment Options§
There are several ways to deploy a service.
- Deployment via container + compose.yaml (Recommended, when available)
  Deploy your model locally or in the cloud, using Docker or Podman compose
- Deployment via container
  Deploy your model locally or in the cloud, using Docker or Podman build
- Local deployment using a Python virtual environment
  Install requirements and manage your own environment
- Cloud deployment to Google Cloud Run (Easy setup)
  Leverage cloud computing power, using a no-code deployment
- Cloud deployment to Red Hat OpenShift
  Leverage cloud computing power
- Cloud deployment to SkyPilot on AWS
  Leverage cloud computing power
Deployment via Container + compose.yaml (Recommended)§
When available, containerizing a model using Docker Compose / Podman Compose is the easiest (and our recommended) way to deploy a model service.
Support§
To check if a service supports compose, check the service's details in our list of available services, or check whether a compose.yaml file is present in the service's GitHub repo.
If you're running on an Apple computer, make sure to check the Apple Silicon section below.
Preparation§
Create the .openad_models folder in your home directory.
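For example, from a terminal:

```shell
mkdir -p ~/.openad_models
```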
Build and Start§
Note
Choosing a port: Before you start, consider the port you want to run the service on.
By default, 8080:8080 maps host port 8080 to container port 8080.
If you will be running multiple services, you may want to change the host port in the compose.yaml, e.g. 8081:8080.
First build the container image:
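For example, with Docker Compose (use the Podman Compose equivalent if you're on Podman):

```shell
docker compose build
```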
Next, start the container:
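For example:

```shell
docker compose create
docker compose start
# or, in one step:
# docker compose up -d
```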
Note
If your device does not have a discrete GPU (as is the case for Apple Silicon devices), the start command will fail with an error.
In this case, you can simply disable the GPU section of the compose.yaml:
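The exact contents vary per service, but the GPU reservation in a compose.yaml typically looks something like the sketch below; commenting it out disables the GPU requirement.

```yaml
# Typical GPU section, shown commented out to disable it.
# The exact keys may differ in your service's compose.yaml.
# deploy:
#   resources:
#     reservations:
#       devices:
#         - driver: nvidia
#           count: all
#           capabilities: [gpu]
```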
Then run the create and start commands again.
Once the service is running, continue to Cataloging a Containerized Model.
Deployment via Container§
Prerequisites§
Make sure you have Docker and the Docker Buildx plugin installed.
Then create the .openad_models folder in your home directory.
Set up Container§
Clone the model's GitHub repository:
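For example (the URL is a placeholder; use the model's actual repository):

```shell
git clone https://github.com/<account>/<model_repo>.git
```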
Set the working directory to the downloaded repository and run the build:
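A minimal sketch, using <model_repo> for the cloned folder and <model_name> as the image tag you choose:

```shell
cd <model_repo>
docker build -t <model_name> .
```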
Note: You can choose the <model_name> yourself.
Note
Apple users: If you're running on Apple Silicon, you'll need to add --platform linux/amd64 to the build command, to force the AMD64 architecture using an emulator.
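For example:

```shell
docker build --platform linux/amd64 -t <model_name> .
```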
After the build is complete, run the container and make the server available on port 8080:
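A minimal sketch; depending on the model, you may also need to mount your ~/.openad_models folder into the container:

```shell
docker run -p 8080:8080 <model_name>
```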
Once the service is running, continue to Cataloging a Containerized Model.
Cataloging a Containerized Model§
Once the container is running, you can catalog the model in OpenAD:
First, launch the OpenAD shell:
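Assuming OpenAD is installed in your active (virtual) environment, the shell is launched with:

```shell
openad
```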
Then catalog the service, and check the status.
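A sketch of the commands involved, assuming the container from the previous section is listening on port 8080; check the help inside the OpenAD shell to verify the exact syntax for your version:

```
catalog model service from remote 'http://127.0.0.1:8080/' as '<service-name>'
model service status
```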
If all goes well, the status should say Connected:

```
Service         Status      Endpoint                 Host     API expires
--------------  ----------  -----------------------  -------  -------------
<service-name>  Connected   http://127.0.0.1:8080/   remote   No Info
```
As a reminder, to see all available model commands, run:
Local deployment using a Python virtual environment§
Info
If you are using an Apple Silicon device, deploy using Docker instead. See Apple Silicon for more info.
- Create a virtual environment running Python 3.11. We'll use pyenv.
  Note: Python 3.10.10+ or 3.11 are supported; Python 3.12 is not.
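  One possible sketch using pyenv plus Python's built-in venv module (the environment name is just an example):

  ```shell
  pyenv install 3.11            # or a specific 3.11.x version
  pyenv shell 3.11              # use Python 3.11 in this terminal session
  python -m venv ~/openad-venv
  source ~/openad-venv/bin/activate
  ```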
- Install the model requirements as described in the model's repository (not to be confused with the OpenAD service wrapper). Models not listed below should be deployed with Docker instead.

  | Model Name | GitHub                                        |
  |------------|-----------------------------------------------|
  | SMI-TED    | github.com/IBM/materials                      |
  | BMFM-SM    | github.com/BiomedSciAI/biomed-multi-view      |
  | BMFM-PM    | github.com/BiomedSciAI/biomed-multi-alignment |
  | REINVENT   | github.com/MolecularAI/REINVENT4              |

- Install the OpenAD service utilities:
Note
The models will be downloaded when you make your first request, which may take some time.
You can pre-load the models using AWS CLI.
- Clone the service repo:
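  For example (placeholder URL; use the service repository for your model):

  ```shell
  git clone https://github.com/<account>/<service_repo>.git
  ```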
- Move the working directory into the cloned service repo:
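  For example:

  ```shell
  cd <service_repo>
  ```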
- Start the service. The start command differs per model.
- Open a new terminal session and launch OpenAD:
- Within the OpenAD shell, catalog the service you just started:
- To see the available service commands, run:
Cloud deployment to Google Cloud Run§
Deploying a model to Google Cloud Run is the easiest way to deploy a model to the cloud, as Google Cloud can spin up the service directly from the GitHub repository using only the Dockerfile.
Step 1: Preparation§
- Install Google Cloud CLI
- Go to Available Models to verify if the model you want to deploy supports Google Cloud
- Fork the GitHub repo of the model you want to deploy.
Step 2: Google Cloud Run§
- Go to console.cloud.google.com/run
- Create a project
- Start your free trial if you need to (credit card required)
- Click the "Create service" button in an empty project
(or click "Deploy container" and select "Service") - Select second option: "Continuously deploy from a repository"
- Click "Setup with Cloud Build"
- Connect GitHub & install Cloud Build for your fork
- Under "Build Type" select "Dockerfile" (keep default location)
- Choose "Require authentication"
- Click "Container(s), Volumes, Networking, Security" at the bottom
- Under Resources, set Memory & CPU to 32 GB & 8 CPUs or higher
- Under Requests, set the Request timeout to the maximum of 3600 sec
- Click "Create"
- Copy the service URL on top (we'll refer to it as <service_url>). It should look something like this:
  Note: Your service will have a green checkmark next to its name, but it won't be available until "Building and deploying from repository" is done and also displays the green checkmark.
Step 3: Terminal§
- Log in to Google Cloud
  Note: The application-default clause stores your credentials on disk and is required to auto-refresh your auth tokens.
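  For example:

  ```shell
  gcloud auth application-default login
  ```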
- If you haven't done so yet, you may have to set your project.
  Note: To find your project's ID, go to the Google Cloud Console and click the button with your project's name (probably 'My First Project') next to the Google Cloud logo.
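  For example, with <project_id> as a placeholder for your project's ID:

  ```shell
  gcloud config set project <project_id>
  ```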
- Fetch your auth token
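  A sketch; depending on how you authenticated, you may need the application-default variant:

  ```shell
  gcloud auth print-access-token
  # or, for application-default credentials:
  gcloud auth application-default print-access-token
  ```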
- Copy the token
Step 4: OpenAD§
- Create an auth group for Google Cloud
  Note: You can call this group anything; we'll call it gcloud
- Catalog the service
- The service will be listed as "Not ready" until the build is done
  (~15-20 min depending on the model)
Cloud deployment to Red Hat OpenShift§
If the service you're trying to deploy has a /helm-chart folder, it's been prepared for deployment on OpenShift.
Note: You can choose the <service-name> and <build-name> yourself. We'll show an example for SMI-TED.
- Install the helm chart (example commands below)
- Start a new build
- Wait for the build to complete
- Run a request test
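Below is a rough sketch of what these steps can look like for SMI-TED, assuming you are logged in to your OpenShift cluster with oc and the chart lives in the repo's /helm-chart folder; the release name, build name, route name and test endpoint are all assumptions, so adjust them to your deployment.

```shell
# Install the helm chart (release name is your choice)
helm install smi-ted ./helm-chart

# Start a new build and follow its progress
oc start-build smi-ted-build --follow

# Wait for the build/rollout to complete
oc get pods -w

# Run a request test against the exposed route (endpoint is an assumption)
curl "https://$(oc get route smi-ted -o jsonpath='{.spec.host}')/health"
```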
Cloud deployment to SkyPilot on AWS§
Setting up SkyPilot§
1. AWS account
   - Head to aws.com
   - Click the [Create an AWS Account] button in the top right corner
   - Follow instructions, including setting up a root user
2. AWS user with correct permissions
   Starting from your AWS dashboard:
   - Search for "IAM" in the search bar
   - From your IAM dashboard, click "Users" in the lefthand sidebar
   - Click the [Create user] button in the top right hand corner
   - Leave the "Provide user access to the AWS Management Console" box unchecked
   - Up next on the "Set Permissions" screen, select the third option: "Attach policies directly"
   - In the box below, click the [Create policy] button
   - Create a new policy with minimal permissions for SkyPilot, following the SkyPilot instructions
   - On the next screen, search for the policy you just created, which would be called minimal-skypilot-policy per the instructions
   - Finish the process to attach the policy to your user
3. AWS Access key
   Starting from the IAM dashboard:
   - Click "Users" in the lefthand sidebar
   - Click on the user you created in the previous step
   - Click "Create access key" on the right side of the summary on top
   - Select the first option, "Command Line Interface (CLI)", as the use case
   - Finish the process to create the access key
   - Store the secret access key in your password manager, as you will not be able to access it after creation
4. AWS command line tool
   Starting from a terminal window:
   - Install awscli
     Note: For more nuanced instructions, please refer to pypi
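     For example:

     ```shell
     pip install awscli
     ```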
   - Add the credentials for the AWS user you set up in step 3.
     - Your user's access key can be found in your IAM dashboard > Users, however the secret access key should have been stored in your password manager or elsewhere.
     - The fields "Default region name" and "Default output format" can be left blank
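     For example; you will be prompted for the access key and secret access key from step 3:

     ```shell
     aws configure
     ```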
5. SkyPilot
   SkyPilot is a framework for running AI and batch workloads on any infrastructure. We're using AWS.
   Starting from a terminal window:
   - If you are running OpenAD in a virtual environment, make sure your virtual environment is activated. If you followed the default installation instructions, you should be able to run:
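     A sketch, assuming a virtual environment called ad-venv in your home directory (adjust the path to your own setup):

     ```shell
     source ~/ad-venv/bin/activate
     ```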
   - Install SkyPilot for AWS
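     For example:

     ```shell
     pip install "skypilot[aws]"
     ```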
   - After installation, verify if you have cloud access
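     For example:

     ```shell
     sky check
     ```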
Apple Silicon§
Apple Silicon chips (aka M1, M2, M3 etc.) utilize the ARM64 instruction set architecture (ISA), which is incompatible with the standard x86-64 ISA the models are compiled for.
Some models are able to run on Apple Silicon via an emulator (with some impact on performance); however, support is not consistent.
Also, because Apple M processors are an SoC without a discrete GPU, there is no support for GPU deployment. When using Docker or Podman Compose, make sure to disable the GPU section in the compose.yaml file, as shown under Build and Start above.
If you will be using our models in a production environment, it's recommended to deploy the container to the cloud.