Machine Learning & MLOps
Tools, practices, and challenges in ML engineering and operations.
How many ML models do you currently have in production?
45% have 2–5 models in production; 25% have none, 18% have one, and 12% have 5+. Many teams were still early in their production journey in 2024–2025.
Which tools do you use for deploying ML models?
38% don't deploy models. Among those who do, Kubernetes and SageMaker (27% each) lead; Google AI Platform (22%) and Azure ML (18%) follow. TensorFlow Serving (10%) is next.
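Whatever the platform, most of these tools ultimately host a model behind an HTTP prediction endpoint. A minimal stdlib-only sketch of that pattern — the linear `predict` function and its weights are hypothetical stand-ins for a real trained model:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features: dict) -> float:
    # Hypothetical linear model; a real deployment would load trained weights.
    return 0.3 * features.get("x1", 0.0) + 0.7 * features.get("x2", 0.0)

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read a JSON feature payload, score it, return a JSON response.
        body = self.rfile.read(int(self.headers["Content-Length"]))
        score = predict(json.loads(body))
        payload = json.dumps({"score": score}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# HTTPServer(("", 8080), PredictHandler).serve_forever()  # uncomment to serve
```

Kubernetes, SageMaker, and the other platforms listed above wrap this same request/response loop in scaling, routing, and rollout machinery.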
Do you use any tools to monitor ML models in production?
58% don't monitor models. Prometheus and Grafana (21%), custom scripts (11%), and ELK (9%) are the most used. Monitoring was a clear gap.
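The 11% relying on custom scripts often start with a rolling-window check on a health signal. A minimal sketch, assuming prediction confidence as the signal; the window size and threshold are arbitrary choices:

```python
from collections import deque

class DriftMonitor:
    """Track a rolling window of prediction confidences and flag drops."""

    def __init__(self, window: int = 100, threshold: float = 0.7):
        self.scores = deque(maxlen=window)
        self.threshold = threshold

    def record(self, confidence: float) -> None:
        self.scores.append(confidence)

    def mean_confidence(self) -> float:
        return sum(self.scores) / len(self.scores)

    def is_degraded(self) -> bool:
        # Alert only once the window is full and the mean dips below threshold.
        return (len(self.scores) == self.scores.maxlen
                and self.mean_confidence() < self.threshold)

monitor = DriftMonitor(window=5, threshold=0.7)
for c in [0.9, 0.8, 0.6, 0.5, 0.55]:
    monitor.record(c)
print(monitor.is_degraded())  # mean 0.67 < 0.7 → True
```

Prometheus/Grafana setups export the same kind of rolling statistic as a metric and move the threshold into an alerting rule.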
Which tools do you use for model training and experimentation?
55% don't use dedicated tools; MLflow (34%) leads among those who do. W&B (13%) and TensorBoard (10%) follow. Many relied on notebooks or scripts.
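Under the hood, experiment trackers persist each run's parameters and metrics. A stdlib-only sketch of that core idea — the JSON-file-per-run layout is an illustrative assumption, not MLflow's actual storage format, and MLflow adds a UI, artifact storage, and search on top:

```python
import json
import tempfile
import time
import uuid
from pathlib import Path

def log_run(params: dict, metrics: dict, out_dir: str) -> Path:
    """Persist one training run's params and metrics as a JSON record."""
    run_id = uuid.uuid4().hex[:8]
    record = {
        "run_id": run_id,
        "timestamp": time.time(),
        "params": params,
        "metrics": metrics,
    }
    path = Path(out_dir)
    path.mkdir(exist_ok=True)
    run_file = path / f"{run_id}.json"
    run_file.write_text(json.dumps(record, indent=2))
    return run_file

run_file = log_run({"lr": 0.01, "epochs": 10}, {"val_acc": 0.91},
                   tempfile.mkdtemp())
print(json.loads(run_file.read_text())["metrics"]["val_acc"])  # 0.91
```

Notebook-and-script workflows lose exactly this record, which is why a tracker is usually the first dedicated tool a team adopts.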
Which tools do you use for model or data versioning?
58% don't use versioning tools. MLflow (32%) leads; W&B (11%) and DVC (11%) have smaller shares. Versioning was under-adopted.
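Versioning tools ultimately identify artifacts by their content. A sketch of the content-hashing idea that DVC and model registries build on — the 12-character truncation is an arbitrary choice for readability:

```python
import hashlib
import json

def version_of(obj) -> str:
    """Deterministic content hash for a model config or dataset snapshot."""
    # sort_keys makes the serialization, and hence the hash, order-independent.
    payload = json.dumps(obj, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()[:12]

v1 = version_of({"weights": [0.1, 0.2], "features": ["age", "income"]})
v2 = version_of({"weights": [0.1, 0.3], "features": ["age", "income"]})
print(v1 != v2)  # any change in content yields a new version id → True
```

Real tools add what the hash alone cannot: storage of the artifact itself, lineage back to the producing run, and stage transitions such as staging → production.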
Which workflow orchestration tools do you use for ML pipelines?
54% don't use orchestration. Airflow (34%) dominates; Step Functions (8%), Kubeflow (7%), and Prefect (6%) follow. Orchestration was not yet widespread.
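At their core, orchestrators execute pipeline steps in dependency order. A sketch of that behavior using the standard library's `graphlib` — the step names are illustrative, and tools like Airflow add scheduling, retries, and backfills on top:

```python
from graphlib import TopologicalSorter

# Each step maps to its upstream dependencies, as in an Airflow DAG.
pipeline = {
    "validate": {"extract"},
    "train": {"validate"},
    "evaluate": {"train"},
    "deploy": {"evaluate"},
}

# static_order yields a valid execution order and rejects cycles.
order = list(TopologicalSorter(pipeline).static_order())
print(order)  # ['extract', 'validate', 'train', 'evaluate', 'deploy']

def run(step: str) -> str:
    return f"ran {step}"

results = [run(step) for step in order]
```

The value of an orchestrator is everything around this loop: what happens when `train` fails at 3 a.m., and who gets paged.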
Which CI/CD tools do you use for ML workflows?
50% don't use CI/CD for ML. GitLab CI/CD (27%) and Jenkins (15%) lead. Traditional DevOps tools dominated over ML-native pipelines.
Do you use any feature stores?
75% don't use feature stores. SageMaker (12%), Databricks (11%), and Vertex AI (8%) lead among adopters. Feature stores were not mainstream.
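At its simplest, an online feature store maps entity IDs to the latest feature values. A toy in-memory sketch of that read/write contract — hosted stores (SageMaker, Databricks, Vertex AI) add offline/online sync, TTLs, and point-in-time-correct training joins:

```python
from datetime import datetime, timezone

class FeatureStore:
    """Toy online store: latest feature values per entity, with timestamps."""

    def __init__(self):
        self._rows: dict[str, dict] = {}

    def write(self, entity_id: str, features: dict) -> None:
        # Overwrite with the freshest values; record when they arrived.
        self._rows[entity_id] = {
            "features": features,
            "updated_at": datetime.now(timezone.utc),
        }

    def read(self, entity_id: str, names: list[str]) -> dict:
        # Serve only the requested features; missing ones come back as None.
        row = self._rows[entity_id]["features"]
        return {n: row.get(n) for n in names}

store = FeatureStore()
store.write("user_42", {"avg_order_value": 57.2, "orders_30d": 3})
print(store.read("user_42", ["orders_30d"]))  # {'orders_30d': 3}
```

The hard part a dict cannot solve — keeping training-time and serving-time feature values consistent — is what the 25% of adopters are paying for.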
How often do you retrain your models in production?
44% don't retrain; 29% retrain when needed and 23% on a schedule. Only 3% do continuous learning. Retraining was mostly reactive.
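The schedule-based and when-needed policies above can be combined into a single trigger. A sketch assuming hypothetical freshness and accuracy-drop thresholds:

```python
from datetime import date, timedelta

def should_retrain(last_trained: date, today: date,
                   live_accuracy: float, baseline_accuracy: float,
                   max_age_days: int = 30, max_drop: float = 0.05) -> bool:
    """Retrain on a schedule (model too old) or when needed (accuracy fell)."""
    stale = (today - last_trained) > timedelta(days=max_age_days)
    degraded = (baseline_accuracy - live_accuracy) > max_drop
    return stale or degraded

# Fresh model, small accuracy drop: no retrain.
print(should_retrain(date(2024, 6, 1), date(2024, 6, 10), 0.90, 0.92))  # False
# Fresh model, large accuracy drop: retrain.
print(should_retrain(date(2024, 6, 1), date(2024, 6, 10), 0.80, 0.92))  # True
```

The "when needed" policy presupposes the live-accuracy signal — which 58% of respondents, per the monitoring question above, do not collect.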
Where do you run your ML workloads?
AWS (40%), Azure (29%), and GCP (21%) lead; 21% use on-premise and 11% hybrid. Cloud dominated but on-prem and hybrid were still common.
How many people are in your ML team(s)?
45% have 1–5 people and 35% have 6–10; 10% have no dedicated ML team at all. Small teams were the norm; few reported 11+ members, and teams of 51+ were rare.
Do you have a centralized MLOps team?
81% don't have a dedicated MLOps team; ML operations were mostly distributed. Only 19% had a centralized team.
How would you describe your MLOps maturity?
35% have some production models but mostly manual; 30% are experiments-only. 28% have standardized deployment and monitoring; 7% advanced. Maturity was spread out.
For the ML/MLOps tools you use, how would you describe their role?
39% say experimental only; 31% use them regularly but not critically, and 30% say mission-critical. Adoption was split almost evenly across the spectrum, from exploration to mission-critical use.
Which ML or MLOps tools do you plan to adopt or expand in the next 12 months?
MLflow (23%) and Airflow (18%) lead adoption plans; Kubernetes/Docker (12%), feature stores (10%), and monitoring (9%) follow. Plans were diverse.
What are your biggest challenges in ML engineering and MLOps?
Deployment complexity (60%) and lack of skills (50%) are the top two. Monitoring (42%), data quality (35%), and scaling (33%) follow. These remain the familiar pain points of ML engineering today.