Advanced MLOps Techniques with GitHub Actions
MLOps and GitHub Actions series #4 | Learn the ins and outs of MLOps
Welcome back to another article in the Introduction to MLOps with GitHub Actions series! This series aims to be beginner-friendly and low-code for those who just want to learn MLOps principles and apply them simply with GitHub Actions.
Please ensure you have read the previous articles here.
In our previous articles, we covered the basics of:
MLOps principles: when do we need them and why?
Setting up your GitHub repository for MLOps
Implementing CI/CD pipelines with GitHub Actions to automate tasks such as testing and model training.
In this article, we'll explore advanced MLOps techniques and how to leverage GitHub Actions to implement them effectively.
Conditionally Deploy the Model
It is common practice to set conditions that determine whether a model is ready for deployment. After all, we do not want to automatically deploy an ML model that has low accuracy and precision! GitHub Actions provides an if conditional that we can add to a step (or job) so it only runs when certain conditions are met.
For the simplicity of this tutorial, let's assume that after training the model, its accuracy is saved in the repository as accuracy.txt. Our condition for deployment is that the model's accuracy is above 90% (i.e., greater than 0.9). Here is how we write our deploy-model job.
Example of Conditionally Deploying the Model
jobs:
  deploy-model:
    runs-on: ubuntu-latest
    needs: test-model
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Read accuracy
        id: read-accuracy
        run: echo "accuracy=$(cat accuracy.txt)" >> "$GITHUB_OUTPUT"
      - name: Deploy model # only deploy if accuracy is above 90%
        if: steps.read-accuracy.outputs.accuracy > 0.9
        run: python deploy.py
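For context, the accuracy.txt file read above could be produced by the training or testing job earlier in the pipeline. Below is a minimal, illustrative sketch of how that might look; the dataset, model, and script name (evaluate.py) are assumptions for this tutorial, not part of the workflow itself.

# evaluate.py - hypothetical script that writes the model's accuracy to accuracy.txt
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Placeholder data and model; swap in your own training pipeline
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# Save the accuracy so the deploy-model job can read it
accuracy = accuracy_score(y_test, model.predict(X_test))
with open("accuracy.txt", "w") as f:
    f.write(f"{accuracy:.4f}")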
Periodic Model Retraining and Updating
Models deployed in production often need to be retrained and updated periodically to maintain performance and adapt to changing data distributions.
We can use GitHub Actions to automate the process of monitoring model performance and retraining models. For example, we can use the schedule trigger with a cron expression under the on key to run a workflow periodically.
Let's say we want to run the workflow twice a month, on the 1st and 15th. The cron expression would be written as: 0 0 1,15 * *
Here's a breakdown of how the cron expression works:
0 (minute): at minute 0
0 (hour): at hour 0 (midnight)
1,15 (day of month): on the 1st and 15th day of the month
* (month): every month
* (day of week): every day of the week
So, this schedule triggers the workflow at midnight UTC on the 1st and 15th of every month.
Example for Periodical Model Retraining
name: Model Retraining
on:
  schedule: # At midnight on the 1st and 15th of each month
    - cron: '0 0 1,15 * *'
jobs:
  retraining:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install Dependencies
        run: pip install -r requirements.txt
      - name: Retrain Model
        run: python src/retrain.py
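For reference, src/retrain.py can be as simple as reloading the freshest data, refitting the model, and saving the new artifact. The sketch below is only an illustration of what that script might contain; the data path, target column, and model are placeholder assumptions.

# src/retrain.py - hypothetical retraining script (paths and model are placeholders)
import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Assumed layout: latest labelled data available at data/latest.csv with a 'label' column
df = pd.read_csv("data/latest.csv")
X, y = df.drop(columns=["label"]), df["label"]

# Refit the model from scratch on the most recent data
model = RandomForestClassifier(random_state=42)
model.fit(X, y)

# Persist the retrained model so a later step can commit or deploy it
joblib.dump(model, "model.pkl")
print("Retraining complete; model saved to model.pkl")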
A/B Testing and Model Experimentation
A/B testing is a common technique used to evaluate the performance of different versions of a model in production. We can use GitHub Actions to automate the process of deploying and monitoring multiple model versions simultaneously, enabling A/B testing and experimentation to optimize model performance and user experience.
This is done by creating two jobs and letting them run in parallel (jobs without a needs dependency run concurrently by default). For example, we create a job called deploy_model_a to represent model A's deployment and deploy_model_b for model B. Below is how we can implement it.
Example for A/B Testing a Model
name: A/B Testing
on:
  push: # triggers whenever there's a push on the main branch
    branches:
      - main
jobs:
  deploy_model_a: # represents deploying model A
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install Dependencies
        run: pip install -r requirements.txt
      - name: Deploy Model A
        run: python src/deploy_model_a.py
  deploy_model_b: # represents deploying model B
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v2
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install Dependencies
        run: pip install -r requirements.txt
      - name: Deploy Model B
        run: python src/deploy_model_b.py
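What src/deploy_model_a.py and src/deploy_model_b.py actually do depends entirely on your serving setup. As one minimal sketch, assuming a simple file-based serving directory, a shared helper could tag each deployment with a variant label so traffic splitting and monitoring can tell A and B apart; the paths and the deploy_variant function below are hypothetical.

# src/deploy_model.py - hypothetical shared helper; deploy_model_a.py and
# deploy_model_b.py could each call deploy_variant with a different artifact.
import json
import shutil
from pathlib import Path

def deploy_variant(variant: str, artifact: str, target_dir: str = "serving") -> None:
    """Copy a model artifact into the serving directory and record its variant label."""
    target = Path(target_dir) / variant
    target.mkdir(parents=True, exist_ok=True)
    shutil.copy(artifact, target / "model.pkl")
    # Record metadata so monitoring can attribute predictions to the right variant
    (target / "metadata.json").write_text(json.dumps({"variant": variant, "artifact": artifact}))
    print(f"Deployed {artifact} as variant '{variant}' to {target}")

if __name__ == "__main__":
    # deploy_model_a.py would call deploy_variant("A", "models/model_a.pkl"),
    # and deploy_model_b.py would call deploy_variant("B", "models/model_b.pkl").
    deploy_variant("A", "models/model_a.pkl")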
Conclusion
By leveraging GitHub Actions, data scientists and machine learning engineers can implement advanced MLOps techniques such as conditional deployment, model retraining, and A/B testing seamlessly within their GitHub repositories.
Automation and integration are key to achieving efficiency, scalability, and reliability in machine learning workflows. I hope you have found this article helpful in exploring some MLOps techniques via GitHub Actions!
In the next article, we'll delve into best practices and optimization tips for MLOps with GitHub Actions. Stay tuned for more insights and practical examples as we continue our journey into the world of MLOps with GitHub Actions. Cheers!