Introduction
Whether you’re new to the data industry or an experienced professional, it’s worth knowing that ML models can suffer up to a 20% performance drop in their first year in production. Monitoring these models is crucial because of challenges such as changing data distributions, concept drift, and data quality issues. ML Monitoring helps detect dips in model performance, data quality problems, and drift early, as new data streams in; this prevents failures in the ML pipeline and alerts the team to resolve the issue. Evidently.ai, an open-source tool, simplifies ML Monitoring by providing pre-built reports and test suites to track data quality, data drift, and model performance. In this beginner’s guide to ML Monitoring with Evidently.ai, you’ll learn effective methods to monitor ML models in production, including setting up monitoring, choosing metrics, integrating Evidently.ai into ML lifecycles and workflows, and more.
Learning Objectives
- Apply statistical tests to detect data quality issues like missing values, outliers, and data drift.
- Track model performance over time by monitoring metrics like accuracy, precision, and recall using Evidently’s predefined reports and test suites.
- Create a monitoring dashboard with plots like target drift, accuracy trend, and data quality checks using Evidently’s UI and visualization library.
- Integrate Evidently at different stages of the ML pipeline – data preprocessing, model evaluation, and production monitoring – to track metrics.
- Log model evaluation and drift metrics to tools like MLflow and Prefect for a complete view of model health.
- Build custom test suites tailored to your specific data and use case by adjusting test parameters (see the sketch after this list).
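As a preview of that last objective, here is a minimal sketch of a custom test suite. The TestSuite API and test names are Evidently’s; the 'age' column, the thresholds, and the reference and current DataFrames are placeholder assumptions you would adapt to your own data.
from evidently.test_suite import TestSuite
from evidently.tests import TestNumberOfMissingValues, TestShareOfOutRangeValues

# Assumes `reference` and `current` are pandas DataFrames with the same schema
suite = TestSuite(tests=[
    TestNumberOfMissingValues(eq=0),  # fail if any value is missing
    TestShareOfOutRangeValues(column_name="age", lte=0.05),  # hypothetical column and threshold
])
suite.run(reference_data=reference, current_data=current)
suite.save_html("custom_tests.html")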
Understanding ML Monitoring and Observability in AI Systems
ML Monitoring and Observability are essential components of maintaining the health and performance of AI systems. Let’s delve into their significance and how they contribute to the overall effectiveness of AI models.
ML Monitoring
We need ML Monitoring to track the behavior of candidate models, compare them against each other (A/B tests), and monitor the performance of the production model. It covers the service health layer, the data and model health layer, and KPI metrics specific to the business.
ML Observability
ML Observability is a superset of ML Monitoring, focusing on understanding the overall system behavior and finding the root cause of issues that occur. Both monitoring and observability help in analyzing, retraining, and resolving issues in AI systems.
Key Considerations for ML Monitoring
- Create an ML Monitoring setup based on specific use cases.
- Choose model re-training strategies according to the use case.
- Utilize a reference dataset for comparison with incoming batch datasets to identify data drift (see the sketch after this list).
- Create custom user-defined metrics for monitoring tailored to the business needs.
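To make the reference-versus-batch comparison concrete, here is a minimal sketch using Evidently’s DataDriftPreset; reference and new_batch are assumed to be pandas DataFrames with the same columns.
from evidently.report import Report
from evidently.metric_preset import DataDriftPreset

# Compare a fresh production batch against the reference dataset
drift_report = Report(metrics=[DataDriftPreset()])
drift_report.run(reference_data=reference, current_data=new_batch)
drift_report.save_html("data_drift_report.html")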
ML Monitoring Architecture
ML Monitoring involves collecting data and performance metrics at different stages, including backend monitoring, batch monitoring, real-time monitoring, alerts, reports, and dashboards. Key metrics for monitoring include Model Quality, Data Quality, and Data Drift.
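These three metric groups map directly onto Evidently’s built-in presets, so a single report can cover all of them. A minimal sketch, assuming reference and current DataFrames that contain 'target' and 'prediction' columns:
from evidently.report import Report
from evidently.metric_preset import DataQualityPreset, DataDriftPreset, RegressionPreset

# One report covering model quality, data quality, and data drift
health_report = Report(metrics=[RegressionPreset(), DataQualityPreset(), DataDriftPreset()])
health_report.run(reference_data=reference, current_data=current)
health_report.save_html("model_health_report.html")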
Evaluation of ML Model Quality
To evaluate model quality effectively, use custom metrics in addition to standard ones like precision and recall. Monitoring early is crucial for adapting to a volatile environment in which the data and the target variable keep changing.
For regression, here is a minimal runnable example; it uses scikit-learn’s California housing dataset as a stand-in for your own data:
# Install Evidently first: pip install evidently

# Import necessary libraries
import numpy as np
import pandas as pd
from sklearn import datasets, ensemble
from evidently.report import Report
from evidently.metric_preset import ClassificationPreset, RegressionPreset

# Load a regression dataset; by default Evidently looks for 'target' and 'prediction' columns
housing = datasets.fetch_california_housing(as_frame=True).frame
housing = housing.rename(columns={'MedHouseVal': 'target'})
features = housing.drop(columns=['target'])

# Fit a simple model and attach its predictions
model = ensemble.RandomForestRegressor(random_state=0)
model.fit(features, housing['target'])
housing['prediction'] = model.predict(features)

# Create reference and current datasets for comparison
reference = housing.sample(n=5000, random_state=42)
current = housing.sample(n=5000, random_state=0)

# Run Evidently metrics and save the report
report = Report(metrics=[RegressionPreset()])
report.run(reference_data=reference, current_data=current)
report.save_html('regression_report.html')
For classification, the flow is identical; this sketch reuses the imports above and frames scikit-learn’s breast cancer dataset as a binary classification problem:
# Load a built-in dataset; 'target' holds the binary class labels
cancer = datasets.load_breast_cancer(as_frame=True).frame
clf = ensemble.RandomForestClassifier(random_state=0)
clf.fit(cancer.drop(columns=['target']), cancer['target'])
cancer['prediction'] = clf.predict(cancer.drop(columns=['target']))
# Split dataset into reference and current data
reference, current = cancer.iloc[:300], cancer.iloc[300:]
# Run Evidently metrics and save the report
class_report = Report(metrics=[ClassificationPreset()])
class_report.run(reference_data=reference, current_data=current)
class_report.save_html('classification_report.html')
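Both reports also render inline in a Jupyter notebook via report.show(), and can be exported with report.json() or report.as_dict() if you want to log the metrics to an experiment tracker such as MLflow.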