Introduction to Machine Learning Operations (MLOps)

11 MIN READ

November 11, 2024

As artificial intelligence and its applications become more prevalent in digital products and services, it is crucial to develop sound practices for validating, deploying, managing, and monitoring these models in production environments. In essence, the practice of MLOps aims to minimize the “technical debt” of machine learning implementations.

Introduction to MLOps

MLOps is an operational approach designed to simplify the process of deploying an experimental model into a production environment and managing it effectively. Based on DevOps practices, it treats machine learning (ML) models as reusable software artifacts. Among all the improvements that adopting this approach can bring, I believe the most directly impactful on business value are:

    • Adaptation to Change: It can be difficult to keep up with the evolution of business requirements, and the data flowing into the models can change constantly. Having a system that is easy to maintain in such cases is therefore essential.
    • Collaboration: The inability to find common ground for communication among all stakeholders in a project, including data scientists, operations directors, and business leaders, can be a significant obstacle to AI initiatives.

Differences Between MLOps and DevOps

MLOps is based on DevOps practices, so the two share many similarities across the stages of their structures. However, there are some important differences:

    • Experimentation: ML teams need to adjust hyperparameters, data, and models while tracking their experiments to ensure reproducible results.
    • Testing: In an MLOps system, models must be trained, tested, and evaluated to measure their performance. This makes testing more complex than in DevOps, as each step must be validated and meticulously documented.

Machine Learning (ML) systems require extensive testing and monitoring.
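As a minimal illustration of such a test (all names are hypothetical and not tied to any particular framework), a pipeline might assert that a candidate model clears a minimum accuracy threshold before it is promoted:

```python
def accuracy(model, examples):
    """Fraction of labeled examples the model classifies correctly."""
    correct = sum(1 for features, label in examples if model(features) == label)
    return correct / len(examples)

def test_model_meets_baseline():
    # Hypothetical rule-based "model" and a small held-out evaluation set.
    model = lambda x: 1 if x >= 0.5 else 0
    held_out = [(0.9, 1), (0.1, 0), (0.7, 1), (0.3, 0), (0.6, 1)]
    assert accuracy(model, held_out) >= 0.8, "model below promotion threshold"

test_model_meets_baseline()
```

In a real pipeline this kind of check would run automatically on every candidate model, alongside data-validation and integration tests.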

    • Automated Deployment: ML models require more coordination and automated processes for deployment. This calls for a multi-step pipeline that can retrain the model, evaluate it, and monitor its performance in production.
    • Production Monitoring: ML models are expected to experience performance degradation: they often perform worse in production than during training because new incoming data differs from the training data.
    • CI/CD/CT: In ML, the pipeline must address additional concerns, including data validation, data schemas, models, and their performance. Continuous training (CT) is thus a new element in the MLOps pipeline: it automatically retrains and delivers models based on inputs from the production environment.
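A continuous-training trigger can be sketched very simply, assuming a single scalar metric (e.g., accuracy) is tracked both at training time and in production; the names and tolerance value below are illustrative, not a standard API:

```python
def should_retrain(training_metric, production_metric, tolerance=0.05):
    """Flag retraining when production performance falls below training
    performance by more than the allowed tolerance."""
    return (training_metric - production_metric) > tolerance

# A monitoring job could evaluate this on a schedule and, when it fires,
# kick off the automated training pipeline described above.
needs_retraining = should_retrain(training_metric=0.92, production_metric=0.81)
```

Real systems typically use windowed metrics and statistical tests rather than a fixed tolerance, but the control flow is the same.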

Basic Principles of Machine Learning Operations

Thus, when developing a draft architecture for an MLOps system, we can define some of its main characteristics:

Versioning

The ability to replicate the same code base so that multiple people can work on the same project simultaneously is a great benefit.

“To ensure that the experimentation of the data science team leads to a production model for the project, it is important that key factors are documented and reusable.”
(“Introducing MLOps,” Treveil and Dataiku Team)

Therefore, the following should be properly versioned:

    • Assumptions: The decisions and assumptions of the Data Scientist must be explicit.
    • Randomness: It needs to be under some type of control to ensure reproducibility, for example, by using a “seed.”
    • Data: The same data from the experiment must be available.
    • Configurations: Repeat and reproduce experiments with the same configurations as the original.
    • Environment: It is crucial to have the same runtime configurations among all data scientists.
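For example, controlling randomness can be as simple as using a seeded, local random generator so that a data split is reproducible across runs and machines. This is a generic sketch, not tied to any specific ML library:

```python
import random

def reproducible_split(items, train_fraction=0.8, seed=42):
    """Deterministic shuffle-and-split: the same seed always yields the
    same train/test partition, regardless of global random state."""
    rng = random.Random(seed)          # local RNG; global state untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Two calls with the same seed produce identical partitions.
train_a, test_a = reproducible_split(range(10))
train_b, test_b = reproducible_split(range(10))
```

The seed itself should then be versioned alongside the code and configurations, so any past experiment can be replayed exactly.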

Feature Store

The goal of a Feature Store is to process data from various data sources simultaneously and transform it into features that will be consumed by the pipelines.

“Feature Stores are reliable tools for managing features for research and training using an Offline Store, as well as to manage the feeding of features for a model served in production using an Online Store.”
(“Introducing MLOps,” Treveil and Dataiku Team)

Offline Store: A store composed of pre-processed batch data features, used to build a historical source of features that can be utilized in the model training pipeline.
Online Store: A store made up of data from the Offline Store combined with real-time pre-processed features from streaming data sources.
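A toy sketch of this Offline/Online split (a real Feature Store adds schemas, point-in-time joins, and low-latency serving infrastructure; the class and method names here are hypothetical):

```python
from collections import defaultdict

class MiniFeatureStore:
    """Toy sketch: the offline store keeps the full feature history for
    training; the online store keeps only the latest value per entity
    for serving a model in production."""

    def __init__(self):
        self.offline = defaultdict(list)   # entity_id -> feature history
        self.online = {}                   # entity_id -> latest features

    def ingest(self, entity_id, features):
        """New pre-processed features land in both stores."""
        self.offline[entity_id].append(features)
        self.online[entity_id] = features

    def training_history(self, entity_id):
        """Historical features, for the model training pipeline."""
        return list(self.offline[entity_id])

    def serve(self, entity_id):
        """Latest features, for the model served in production."""
        return self.online.get(entity_id)
```

The key property is that training and serving read from the same ingested features, which helps avoid training/serving skew.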

Automation

The automation of machine learning pipelines is highly correlated with the maturity of the project. There are several stages between model development and deployment, and much of this process relies on experimentation.

MLOps Level 0: Manual Process: A complete experimentation pipeline executed manually, typically with rapid application development (RAD) tools such as notebooks.
MLOps Level 1: ML Pipeline Automation: Automation of the experimentation pipeline, including data and model validation.
MLOps Level 2: CI/CD Pipeline Automation: Automatically build, test, and deploy ML models and ML training pipeline components.

    • CI (Continuous Integration): Whenever code or data is updated, the ML pipeline is re-executed. This is done in a way that everything is versioned and reproducible.
    • CD (Continuous Deployment): Continuous deployment is a method for automating the deployment of the new version to production or any environment, such as testing.
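One common way to implement the "re-execute whenever code or data changes" rule is to fingerprint both and compare against the previous run. A minimal, hypothetical sketch:

```python
import hashlib
import json

def artifact_fingerprint(code_version, data_rows):
    """Hash of the code version plus the data: the pipeline re-runs only
    when one of them actually changes, and every run stays tied to a
    reproducible, versioned fingerprint."""
    payload = json.dumps({"code": code_version, "data": data_rows},
                         sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def pipeline_should_run(previous_fingerprint, code_version, data_rows):
    """Trigger the ML pipeline only when the fingerprint has changed."""
    return artifact_fingerprint(code_version, data_rows) != previous_fingerprint
```

In practice the data side is usually a dataset version or content hash from a data-versioning tool rather than the raw rows, but the triggering logic is the same.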

Monitoring

The performance of machine learning models can vary over time. Therefore, once a model is deployed, it needs to be monitored to ensure it operates as expected. But what should be monitored?

Model Monitoring

Some of the key components and concepts to pay attention to are:

    • Performance: Based on a set of metrics.
    • Data: Inconsistencies or errors due to various transformations.
    • Explainability: Being able to explain a model’s decision is vital.
    • Bias: Monitoring for undesirable biases or trends in the model’s outputs.
    • Drift: The statistical properties of the inputs or the target may change over time, leading to data or concept drift.
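As a crude illustration of drift monitoring (production systems use richer statistical tests, and the threshold below is arbitrary), one can flag when the mean of incoming data shifts far from the training-time reference distribution:

```python
from statistics import mean, stdev

def drift_score(reference, current):
    """Standardized shift of the current mean relative to the reference
    (training-time) distribution; a crude proxy for input drift."""
    return abs(mean(current) - mean(reference)) / stdev(reference)

def has_drifted(reference, current, threshold=3.0):
    """Flag drift when the incoming data's mean sits more than
    `threshold` reference standard deviations away."""
    return drift_score(reference, current) > threshold
```

When such a check fires, it can feed the continuous-training trigger described earlier, closing the monitoring-retraining loop.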

Tools

Finally, there are numerous tools you can use to develop systems based on MLOps architectures. In Azure, for example, you can leverage Azure Machine Learning, which provides an integrated environment for managing the complete lifecycle of machine learning models, including building, training, deploying, and monitoring. Azure DevOps is also an excellent choice for CI/CD automation specific to ML projects.

In AWS, you can use Amazon SageMaker, a comprehensive solution that facilitates the building, training, and deployment of machine learning models at scale. AWS CodePipeline and AWS CodeBuild can also be configured to automate training and deployment steps for models.

In Google Cloud, Vertex AI stands out as a unified platform for developing machine learning models, encompassing everything from data preparation to model deployment. Lastly, GCP also offers Cloud Build for CI/CD automation, which integrates seamlessly with other Google Cloud services for effective ML lifecycle management.
