Data Science Project Management
How DataTalks.Club guests manage data science, analytics, and ML projects through problem framing, scope, stakeholders, baselines, metrics, evaluation, adoption, and production handoff.
Data science project management is the operating discipline that turns an ambiguous business, analytics, or machine learning request into useful shipped work. A data science project manager or data lead turns the decision into a measurable target. They keep the smallest useful version explicit, plan the shipping path, and name the handoff owner.
The practice draws from Data Science, Business Skills for Data Professionals, Data Product Management, and Machine Learning System Design. It also depends on Leadership and Data Science for Managers, because the work combines technical uncertainty with team coordination.
DataTalks.Club guests treat data science project management as technical work and organizational work. In CRISP-DM, teams start by understanding the business and preparing data. They model and evaluate next. They deploy when the result is ready to leave analysis.
In From Project Manager to Data Scientist, Ksenia Legostay treats planning, stakeholder communication, and KPI work as skills that stay useful after the work moves into analytics and machine learning.
Project Lifecycle
Across the podcast episodes, guests mostly agree that project management for data science starts before modeling and ends after the first analysis or model result. The manager or lead asks what business objective the work serves and whether the problem is measurable. They also ask what data exists, which baseline is good enough, how the result will be used, and what operational owner receives the handoff. In CRISP-DM, the 13:25 section frames the problem as important, measurable, and connected to a way to measure success. The 17:05 and 18:23 sections keep baselines, evaluation, and business objectives together rather than treating modeling as an isolated phase.
That definition also appears in data product work. In Building Data Products at Scale, Ioannis Mesionis describes an operating model with intake, prioritization, and Definition of Done. KPIs and feasibility checks come before pilots. Later work includes A/B tests and rollout. Monitoring, demos, and stakeholder feedback also stay in the lifecycle (14:00-41:33).
That project structure links data science management to Data Products, Data Product Adoption, Evaluation, and Metrics.
Risk Emphasis
The guests agree that data science projects need structure. They differ in which risks they emphasize.
Ksenia Legostay focuses on transferable project-management craft. Around 22:32 in From Project Manager to Data Scientist, she treats planning, stakeholder communication, and business KPIs as strengths that transfer into data work. Around 30:20, she names CRISP-DM as a useful project framework. Around 41:07, she adds Git, testing, Docker, deployment, and clean code, because a project that affects other people can’t remain only a notebook.
Ioannis Mesionis focuses on lifecycle control. His lead data scientist role is embedded with marketing stakeholders. The work still goes through a single front door, Definition of Done, and feasibility checks. It then moves through sprint or Kanban delivery, pilots, A/B testing, and production rollout (7:23-27:25).
That view is close to Data Product Management. The project isn’t complete until the product can be used, measured, and operated.
Shir Meir Lador focuses on uncertainty management in Data Science Management and Agile Machine Learning. Her management discussion covers roadmaps, debrief culture, and business impact. It also covers cross-functional partnerships, exploration sprints, design stories, and incremental movement from POC to production (9:18-17:23 and 41:06-54:59). That focus belongs with Data Teams and Data Team Lead Role. The project manager protects learning speed and delivery discipline at the same time.
Framing and Scope
The first management task is to turn a request into a decision. A request for a forecast or dashboard usually hides more than one question. So does a request for a model or segmentation. The manager has to identify who will act, what will change, what cost matters, and what answer would be good enough.
In CRISP-DM, the 7:55, 10:58, and 13:25 sections move from an online classified-site example to measurable problem size and success criteria. Project planning starts there, before anyone chooses a model.
Project managers should include non-goals and a smallest useful path. Valeriy Babushkin makes that explicit for ML System Design Documents in ML System Design Playbook. At 7:06 and 14:36, design documents help teams fail early and align stakeholders. At 19:01, the document remains alive as the system changes.
For project managers, this means scope isn’t a fixed wish list. It’s a written agreement about the decision, assumptions, risks, and next review point.
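One way to keep that agreement reviewable is to store it as a small structured record. The sketch below is illustrative only: the `ScopeAgreement` class, its field names, and the example values are assumptions, not a template taken from any of the episodes.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ScopeAgreement:
    """Hypothetical record of what a project has agreed to decide and deliver."""
    decision: str                                   # the decision the work informs
    decision_owner: str                             # who acts on the result
    non_goals: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)
    risks: List[str] = field(default_factory=list)
    next_review: Optional[date] = None

    def gaps(self) -> List[str]:
        """Return the names of sections that are still empty."""
        sections = ("decision", "decision_owner", "non_goals",
                    "assumptions", "risks", "next_review")
        return [name for name in sections if not getattr(self, name)]


# Example values are made up for illustration.
scope = ScopeAgreement(
    decision="Should support prioritize accounts by churn risk?",
    decision_owner="Head of Customer Support",
    non_goals=["Real-time scoring"],
    assumptions=["Twelve months of usage history are available"],
)
print(scope.gaps())  # ['risks', 'next_review']
```

An empty `gaps()` result does not mean the scope is right. It only shows that every section has been discussed at least once before the next review point.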
The team also decides whether the answer should be analysis, analytics engineering, a model, or a productized ML system. Anna Hannemann gives that product-owner judgment in Product Owners in Data Science. The right next step may be manual cleanup, an MVP, or staged investment rather than a model (44:48-53:09). Use Data Product Owner vs Data Product Manager when the scope question is about who owns the delivery and product decision. For a role-focused learning path, the Machine Learning Engineer Roadmap shows how this scope work connects to production ML responsibilities.
Stakeholders and Decision Rights
Data science projects fail when stakeholders agree to a title but not to a decision path. Loris Marini starts Business Skills for Data Professionals in SaaS with shared meaning for words such as customer, usage, and churn. Around 25:53 and 35:20, he ties trust to active listening, stakeholder mapping, and recording roles and context. That’s project infrastructure, not presentation polish.
Mesionis gives the delivery version. His marketing data science team uses weekly embedded meetings and stakeholder observation before formal intake (7:23-15:23). Later, the team invites stakeholders to demos rather than daily stand-ups. They also simplify technical results for non-technical audiences (35:38-41:33). The demos keep stakeholders close to direction and feedback while the delivery team keeps space for exploration and technical work.
For managers, decision rights are part of team design. Barbara Sobkowiak separates data science managers from experts in Data Science Manager vs Expert. Around 8:22 and 15:49, the manager needs enough technical literacy and strategy to redirect work when good enough is enough. Around 31:56 and 34:04, she warns that companies sometimes hire deep experts when the real gap is coordination, translation, and team development. That distinction links project management to Data Scientist Role and Leadership.
Baselines, Metrics, and Definition of Done
Baselines make progress visible before the final model exists. In CRISP-DM, the 17:05 section treats a sufficient baseline as a reason to move to evaluation. In Machine Learning System Design Interviews, Valeriy connects baselines and metrics to system design around 24:28. Around 46:02, he adds A/B testing, monitoring, and fallbacks. Project managers should therefore ask for a baseline early, not after a complex model has consumed the budget.
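An early baseline check can be very small. The sketch below uses made-up labels and hypothetical function names to compare a candidate model against a majority-class baseline on the same held-out labels. The `MIN_LIFT` threshold is an assumption that a team would agree with the decision owner, not a value from the episodes.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most common training label for each of n examples."""
    most_common_label = Counter(y_train).most_common(1)[0][0]
    return [most_common_label] * n


# Illustrative data only.
y_train = [0, 0, 0, 1, 0, 1, 0, 0]
y_test = [0, 1, 0, 0, 1, 0, 0, 1, 0, 0]
model_pred = [0, 1, 0, 0, 0, 0, 0, 1, 0, 1]

baseline_acc = accuracy(y_test, majority_baseline(y_train, len(y_test)))
model_acc = accuracy(y_test, model_pred)
MIN_LIFT = 0.05  # assumed minimum improvement over the baseline

print(f"baseline={baseline_acc:.2f} model={model_acc:.2f}")
print("worth continuing" if model_acc - baseline_acc >= MIN_LIFT else "revisit scope")
```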
Metrics need a decision owner and a unit of action. Adam Sroka frames KPI design as top-down alignment with executive decisions in KPI Design & Metrics Strategy around 22:41. He then warns about vanity metrics and KPI gaming around 26:07 and 28:04. For managed projects, Metrics aren’t only dashboard numbers. They’re acceptance criteria, guardrails, and review triggers.
Mesionis makes that concrete with Definition of Done. His team defines KPIs and success criteria before deep delivery work. It also defines fail-fast checks (17:37-25:17). The same project can need an offline metric, an A/B test, and stakeholder feedback. It may also need monitoring and a production support plan.
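A Definition of Done of this kind can be written down as explicit checks. The following is a minimal sketch under stated assumptions: the `Criterion` class, the metric names, and the thresholds are hypothetical, chosen only to show acceptance criteria and guardrails evaluated side by side.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Criterion:
    """One hypothetical Definition of Done entry."""
    name: str
    kind: str                        # "acceptance" or "guardrail"
    passes: Callable[[float], bool]  # rule applied to the observed value


def evaluate(criteria: List[Criterion], observed: Dict[str, float]) -> Dict[str, str]:
    """Label each criterion as 'pass', 'fail', or 'not measured'."""
    report = {}
    for criterion in criteria:
        if criterion.name not in observed:
            report[criterion.name] = "not measured"
        elif criterion.passes(observed[criterion.name]):
            report[criterion.name] = "pass"
        else:
            report[criterion.name] = "fail"
    return report


# Thresholds and observed values are made up for illustration.
criteria = [
    Criterion("offline_recall", "acceptance", lambda v: v >= 0.80),
    Criterion("p95_latency_ms", "guardrail", lambda v: v <= 200),
    Criterion("stakeholder_demo_signoff", "acceptance", lambda v: v == 1),
]
observed = {"offline_recall": 0.83, "p95_latency_ms": 240}

report = evaluate(criteria, observed)
done = all(status == "pass" for status in report.values())
print(report)
print("done" if done else "not done")
```

Treating an unmeasured criterion as not done is a deliberate choice in this sketch: a metric nobody has measured yet cannot count as met.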
The Production ML Project Checklist is the more relevant checklist when the project changes a live system.
Delivery Under Uncertainty
Data science work is hard to estimate because data access, labels, model behavior, and stakeholder needs can change the plan. Shir Meir Lador’s agile ML discussion names that uncertainty directly. Around 41:06, she discusses data risks and unknowns. Around 44:18 and 45:36, she uses exploration tasks and design stories to manage ML work. Grooming practices and iterative milestones also keep the work from pretending it behaves like ordinary feature delivery (Data Science Management and Agile Machine Learning).
Mesionis plans and estimates work through sprint or Kanban delivery, while demos keep stakeholder feedback in the lifecycle. His team also keeps feasibility assessment, MVPs, and fail-fast checks (20:54-40:49).
That matches the Machine Learning System Design habit of writing goals, non-goals, assumptions, and data paths before the work becomes expensive. Serving constraints and monitoring belong in the same design.
A project manager keeps the delivery unit small enough to learn from.
A useful increment might be a small validation or delivery milestone:
- a validated dataset
- a baseline notebook
- a dashboard with agreed metric definitions
- a design document
- a pilot
- a shadow-mode model
- a monitored batch job
For product-facing experiments, A/B Testing and Product Analytics help separate a real rollout decision from a promising internal score.
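To illustrate what a rollout decision based on an A/B test can look like, the sketch below runs a standard two-proportion z-test on made-up conversion counts. The guests do not prescribe this particular test; the function name, the counts, and the `ALPHA` level are assumptions for illustration.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Illustrative counts only: control (A) versus the model-driven variant (B).
z, p_value = two_proportion_z_test(conv_a=200, n_a=4000, conv_b=250, n_b=4000)
ALPHA = 0.05  # assumed significance level, agreed before the test starts

print(f"z={z:.2f} p={p_value:.4f}")
print("evidence of a difference" if p_value < ALPHA else "no clear difference")
```

A significant result answers only the statistical question. Whether the effect is large enough to justify rollout still depends on the business metric and guardrails agreed for the project.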
Evaluation, Adoption, and Handoff
Evaluation is where project management checks whether the work should continue, change, ship, or stop. In From Analytics to Production ML, Rishabh Bhargava describes production ML as experimental. Around 28:42, offline experiments, shadow mode, and A/B tests bridge model work to product impact. Around 31:19, segment analysis and root-cause work explain live results. That’s why Evaluation belongs in the project plan, not only in the modeling phase.
Adoption is also part of completion. Caitlin Moorman argues in Last-Mile Data Delivery that data products can fail when users don’t know they exist. They can also fail when users don’t understand or trust them, or don’t see how the product fits the decision (24:13-40:53).
For project management, adoption work includes discoverability, interpretability, and workflow placement. Documentation and feedback loops belong there too.
Production handoff should name the owner of data quality and model behavior. It should also name owners for alerts, rollback, and stakeholder communication. Lina Weichbrodt connects project intake, KPIs, stakeholder fears, and service levels in Human-Centered MLOps and Model Monitoring. She also connects post-mortems, drift, and user feedback to the same operating model (4:50-29:23 and 46:28-50:30).
That handoff links MLOps, Model Monitoring, and Production. A project is unfinished if nobody knows what happens when the metric moves, the input data changes, or the model stops helping the user.
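The ownership question can be checked mechanically before a handoff is accepted. This is a hedged sketch: the responsibility names mirror the list in this section, while the `unowned` function and the team names are hypothetical.

```python
# Responsibilities named in this section; the identifiers themselves are assumptions.
REQUIRED_RESPONSIBILITIES = [
    "data_quality",
    "model_behavior",
    "alerts",
    "rollback",
    "stakeholder_communication",
]


def unowned(handoff):
    """Return the responsibilities that have no named owner."""
    return [
        responsibility
        for responsibility in REQUIRED_RESPONSIBILITIES
        if not (handoff.get(responsibility) or "").strip()
    ]


# Example owners are made up for illustration.
handoff = {
    "data_quality": "analytics engineering",
    "model_behavior": "ML team",
    "alerts": "platform on-call",
    "rollback": "",
}

missing = unowned(handoff)
print(missing)  # ['rollback', 'stakeholder_communication']
print("ready for handoff" if not missing else "handoff incomplete")
```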