One project that didn’t go as planned was a demand-forecasting model for a fast-moving consumer goods business. The deep learning model performed extremely well offline, with impressive accuracy on historical data. But once deployed, the business impact was negligible.
What went wrong wasn’t the architecture or training process. It was the assumption that historical patterns were stable enough to learn from. In reality, promotions, supply constraints, and last-minute manual overrides dominated decision-making. The model was predicting a world that no longer existed.
We identified the issue when planners consistently ignored the model’s outputs despite strong validation metrics. Instead of pushing harder on “model adoption,” we paused and studied the workflow. The real problem was that we had bolted the model onto an existing process instead of redesigning the process around the model.
What changed:
- We simplified the model and focused on fewer, decision-critical signals.
- We redesigned the handoff between humans and the system.
- We measured success in terms of decisions influenced, not model accuracy (a rough sketch of such a metric follows below).
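To make the last point concrete, here is a minimal sketch of what a "decisions influenced" metric could look like. The record schema, field names, and the notion of "moved toward the model" are hypothetical assumptions for illustration, not details from the original project; a real version would use whatever the planning system already logs.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DecisionRecord:
    """One planning decision, logged alongside the model's recommendation.

    All field names here are hypothetical placeholders.
    """
    baseline_order: float   # what the planner would have ordered before seeing the model
    model_forecast: float   # quantity the model recommended
    final_order: float      # what the planner actually committed


def decision_influence_rate(records: Iterable[DecisionRecord]) -> float:
    """Fraction of decisions where the final order moved toward the model.

    A decision counts as "influenced" when the final order ends up closer
    to the model's recommendation than the planner's baseline was. This
    says nothing about forecast accuracy, only about whether the model
    actually changed the decision.
    """
    records = list(records)
    if not records:
        return 0.0
    influenced = sum(
        1
        for r in records
        if abs(r.final_order - r.model_forecast) < abs(r.baseline_order - r.model_forecast)
    )
    return influenced / len(records)


# Example: two of three decisions moved toward the model's recommendation.
log = [
    DecisionRecord(baseline_order=100, model_forecast=80, final_order=85),
    DecisionRecord(baseline_order=100, model_forecast=80, final_order=100),
    DecisionRecord(baseline_order=50, model_forecast=70, final_order=65),
]
print(f"Decision influence rate: {decision_influence_rate(log):.0%}")  # 67%
```

Tracking something like this alongside accuracy made the trade-off explicit: a model can be highly accurate and still influence nothing.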
The key lesson was that deep learning systems fail more often because of organizational and workflow mismatches than because of technical limitations. Today, I start every project by asking: What decision will this model actually change, and under what conditions will people trust it? If that answer isn’t clear, the model isn’t ready, no matter how good the metrics look.