Continuous Training of Machine Learning Models in Production

Is continuous training (CT) a machine learning operations (MLOps) best practice? It depends on what we mean by CT. Suppose it means continuously invoking training pipelines to ensure models ‘stay fresh’ as new production data lands in the data lake. The training pipeline workflow could be executed automatically on schedule once per month, once per…

Unit Testing Data Validation Microservices for Production ML Pipelines

Unit testing is a vital element of production software engineering. After all, how do we know for sure that our code always returns the expected result regardless of input? Unit testing is especially important in production machine learning because model training and pre-processing functions do not always throw exceptions when they should. Instead, the errors…
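As a minimal sketch of the point about functions that do not throw when they should, the Python example below shows a pre-processing step that silently returns NaN for every value, along with a validation check and unit tests that catch the problem. The names `mean_center` and `validate_numeric_column` are hypothetical and chosen for illustration only; they are not taken from the post.

```python
import math
import unittest


def mean_center(values):
    """Subtract the mean from every value (a typical pre-processing step)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def validate_numeric_column(values):
    """Hypothetical validation step: reject empty, missing, or non-finite input."""
    if not values:
        raise ValueError("column is empty")
    for i, v in enumerate(values):
        if v is None or not isinstance(v, (int, float)) or not math.isfinite(v):
            raise ValueError(f"invalid value at index {i}: {v!r}")
    return values


class TestDataValidation(unittest.TestCase):
    def test_nan_passes_silently_without_validation(self):
        # No exception is raised, yet every output value is NaN.
        result = mean_center([1.0, float("nan"), 3.0])
        self.assertTrue(all(math.isnan(v) for v in result))

    def test_validation_rejects_nan(self):
        with self.assertRaises(ValueError):
            validate_numeric_column([1.0, float("nan"), 3.0])

    def test_validation_rejects_empty_column(self):
        with self.assertRaises(ValueError):
            validate_numeric_column([])

    def test_validation_accepts_clean_data(self):
        self.assertEqual(validate_numeric_column([1.0, 2.0]), [1.0, 2.0])
```

Saved to a file, the tests can be run with `python -m unittest <file>`. The first test documents the silent failure; the remaining tests pin down the behavior expected from the validation step.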

Testing ML Microservices for Production Deployments

How do we ensure machine learning pipeline components produce the exact result we expect, especially prior to production deployments? We could sanity check by inspecting a few output records by hand, but how do we know for sure that all output records are correct every time? This manual, stage 1 automation “ClickOps” approach is not scalable, consistent,…

How To Drive Revenue Growth Through Production ML Solutions

For any organization, 20% of the AI/ML use cases drive 80% of the business value. How do we identify this 20%? Always start with specific business outcomes. Forget about machine learning at the beginning and let such a solution (if any) emerge naturally out of a systematic discovery process. In 2021, my team and I…

3 Degrees of Automation for Production Machine Learning Solutions

Have you released a machine learning solution to production, only to find yourself pulling KPI metrics manually every day to keep updating stakeholders on results? Or, have you found yourself manually updating Lambda code in the AWS console to quickly fix a production bug a few hours into the release? Both of these common scenarios…

How To Deploy Serverless Containers For ML Pipelines Using ECS Fargate

“Should we use Kubernetes or go serverless first for new software solutions?” This is a common question among technology teams across the world. Based on a recent LinkedIn survey, the answer seems to be an even split between the two approaches, with most people flexible based on the project. Common arguments in favor of Kubernetes include…

5 Pillars of Architecture Design for Production ML Software Solutions

Creating a machine learning software system is like constructing a building. If the foundation is not solid, structural problems can undermine the integrity and function of the building. MLOps considerations, such as systematically building, training, deploying, and monitoring machine learning models, are only a subset of all the elements required for end-to-end production software solutions.

Lifecycle of ML Model Deployments to Production

What does it mean to deploy a machine learning model to production? As technology leaders, we invest in data science and machine learning engineering to improve the performance of the organization. Fundamentally, we are solving business problems systematically through data-driven technology solutions. This is especially true when the problem is recurring at scale and must…

Custom ML Model Evaluation For Production Deployments

My team and I built a cloud-native recommender system that matches open jobs and people who are looking for work. We trained machine learning models to power the system, following the tried-and-true process: set up an end-to-end data science workflow in a Jupyter notebook; use domain knowledge to create the feature space through feature engineering…

Microservice Architecture for Machine Learning Solutions in AWS

Why adopt a microservice strategy when building production machine learning solutions? Suppose your data science team produced an end-to-end Jupyter notebook, culminating in a trained machine learning model. This model meets performance KPIs in a development environment, and the next logical step is to deploy it in a production environment to maximize its business value.