
The AI Project That Delivered Nothing: What We Learned

A $145K investment in AI capabilities that never delivered business value: misaligned expectations, poor problem selection, inadequate data, and hard lessons about AI readiness vs. AI hype.

January 24, 2025
13 min read
By Thalamus AI

Here's the AI failure nobody talks about: you spend $145,000 and 9 months building "AI-powered" capabilities, launch with fanfare, and six months later quietly shut it down because it delivered zero business value.

We worked with a 60-person e-commerce company—call them ShopCo—that got caught in the AI hype cycle. Their competitors were touting "AI-powered personalization" and "machine learning recommendations." The board asked: "Why don't we have AI?"

So they built AI.

  • The project: AI-powered product recommendations and demand forecasting
  • The investment: $145,000 over 9 months
  • The result: Recommendations were worse than the existing rule-based system; forecasts were less accurate than simple moving averages
  • The outcome: AI features turned off, simpler approaches restored, $145K written off as an expensive lesson

This is the complete story of an AI project that failed not because the technology didn't work, but because they answered the wrong questions, chose the wrong problems, and didn't have the foundation needed for AI to succeed.

Spoiler: Two years later, they successfully implemented AI—spending $60K on a much simpler problem with much better data. The difference? They learned what AI can actually do vs. what the hype promises.

The Setup: Why They Jumped on AI

First, understand why they thought they needed AI.

The Trigger

Board meeting, Q2 2023:

  • Competitor announced "AI-powered personalization engine"
  • Another competitor: "Machine learning inventory optimization"
  • Board member: "Are we falling behind on AI?"
  • CEO: "We need to invest in AI capabilities"

Classic mistake: Decided they needed AI before identifying what problem AI would solve.

The Initial Vision

What they imagined:

  • Amazon-level product recommendations
  • Netflix-quality personalization
  • Accurate demand forecasting
  • Inventory optimization
  • Competitive advantage through "AI"

The pitch deck (yes, they made a deck):

  • "AI will increase conversion 15-20%"
  • "Forecasting accuracy improvement of 30%"
  • "Reduced inventory carrying costs"
  • "Industry-leading customer experience"
  • Impressive charts and buzzwords

Where these numbers came from: Vendor white papers and competitor press releases.

Phase 1: The AI Recommendation System (Months 1-5)

The Problem (as defined)

Current state:

  • Product recommendations based on simple rules
  • "Customers who bought X also bought Y" (basic co-occurrence)
  • Category-based suggestions
  • Worked okay (2.3% clickthrough, 8% conversion on recommendations)
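
To make that baseline concrete, here's a minimal sketch of a co-occurrence recommender along the lines of ShopCo's rule-based system. The file name and column names are illustrative assumptions, not their actual schema.

```python
import pandas as pd

# One row per order line item -- hypothetical schema for illustration
orders = pd.read_csv("order_lines.csv")  # assumed columns: order_id, product_id

# Pair up products that appear in the same order ("bought together")
pairs = orders.merge(orders, on="order_id")
pairs = pairs[pairs["product_id_x"] != pairs["product_id_y"]]

# Count how often each pair co-occurs
co_counts = (
    pairs.groupby(["product_id_x", "product_id_y"])
    .size()
    .reset_index(name="count")
)

def frequently_bought_together(product_id, n=5):
    """Top-n products most often purchased in the same order as product_id."""
    rows = co_counts[co_counts["product_id_x"] == product_id]
    return rows.nlargest(n, "count")["product_id_y"].tolist()
```

A handful of lines, no model training, and it already delivered 2.3% clickthrough. That's the bar the $85K system had to clear.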

Desired state:

  • "AI-powered personalization"
  • Consider customer behavior, preferences, context
  • Real-time personalization
  • "Amazon-level recommendations"

The Solution (as built)

Hired AI consulting firm ($85K for this phase):

  • Promised "cutting-edge machine learning"
  • Collaborative filtering algorithms
  • Deep learning neural networks
  • Real-time recommendation engine

Tech stack:

  • Python + TensorFlow
  • AWS SageMaker
  • Real-time inference endpoint
  • "State of the art" architecture

The build process:

Month 1-2: Data preparation

  • Collected 2 years of purchase history
  • User behavior data (clicks, views, cart adds)
  • Product catalog data
  • Customer demographics

Month 3-4: Model development

  • Tried 5 different algorithms
  • Collaborative filtering, both user-user and item-item (item-item is sketched after this list)
  • Content-based filtering
  • Hybrid approach with neural networks
  • Lots of hyperparameter tuning
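
To give a sense of what "collaborative filtering" means in practice, here is a minimal item-item sketch built on cosine similarity over a user-item interaction matrix. The schema, weighting, and function names are assumptions for illustration; the consultants' actual models were more elaborate.

```python
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity

# Implicit-feedback events (views, cart adds, purchases) -- assumed schema
interactions = pd.read_csv("interactions.csv")  # assumed columns: user_id, product_id, weight

# User x item matrix of interaction weights
matrix = interactions.pivot_table(
    index="user_id", columns="product_id", values="weight",
    aggfunc="sum", fill_value=0,
)

# Item-item similarity: cosine similarity between item columns
item_sim = pd.DataFrame(
    cosine_similarity(matrix.T),
    index=matrix.columns, columns=matrix.columns,
)

def recommend_for_user(user_id, n=5):
    """Score candidate items by similarity to items the user already interacted with."""
    user_vector = matrix.loc[user_id]
    scores = item_sim.mul(user_vector, axis=0).sum(axis=0)
    scores = scores[user_vector == 0]  # exclude items the user already has
    return scores.nlargest(n).index.tolist()
```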

Month 5: Deployment

  • Real-time API endpoint
  • Integration with website
  • A/B test setup
  • Monitoring dashboard

Launched with excitement.

The Results (Disappointing)

A/B test results (30 days, 50/50 traffic split):

| Metric | Old System | AI System | Change |
|---|---|---|---|
| Recommendation clickthrough | 2.3% | 1.8% | -22% |
| Conversion on recommendations | 8.1% | 6.4% | -21% |
| Average order value | $87 | $83 | -5% |
| Customer satisfaction | 4.2/5 | 4.0/5 | -5% |

AI recommendations were worse across every metric.
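
One practical aside: before acting on a split test like this, it's worth checking whether the gap is statistically real. A minimal sketch of a two-proportion z-test on clickthrough; the visitor and click counts below are placeholders chosen only to match the reported rates, since the actual sample sizes aren't part of this write-up.

```python
from math import sqrt
from scipy.stats import norm

def two_proportion_ztest(clicks_a, visitors_a, clicks_b, visitors_b):
    """Two-sided z-test: is the difference between two clickthrough rates more than noise?"""
    p_a, p_b = clicks_a / visitors_a, clicks_b / visitors_b
    p_pool = (clicks_a + clicks_b) / (visitors_a + visitors_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    z = (p_a - p_b) / se
    return z, 2 * norm.sf(abs(z))  # z statistic and two-sided p-value

# Placeholder counts that reproduce the reported 2.3% vs. 1.8% rates
z, p_value = two_proportion_ztest(clicks_a=1150, visitors_a=50_000, clicks_b=900, visitors_b=50_000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```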

What Went Wrong

Post-mortem revealed:

1. Data quality was poor:

  • Purchase history had noise (gifts, one-time purchases, returns not marked)
  • Behavior data was incomplete (30% of users blocked cookies)
  • Many products had sparse data (only sold a few times)
  • Seasonality wasn't handled well

2. Cold start problem:

  • New products: no data, couldn't recommend
  • New users: no history, couldn't personalize
  • Fell back to popularity, which the simple system already did (see the fallback sketch after this list)
  • 65% of recommendations were fallback (not AI)
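
A sketch of what that fallback chain typically looks like, with in-memory stand-ins for the data layer and a generic model interface; everything here is illustrative rather than ShopCo's code. Instrumenting which branch actually serves each request is how you learn that two-thirds of your "AI" recommendations are really just bestseller lists.

```python
# In-memory stand-ins for the real data layer -- purely illustrative
USER_HISTORY = {"u1": ["p1", "p2", "p3", "p4"], "u2": []}
TOP_SELLERS = ["p9", "p8", "p7", "p6", "p5"]

def recommend(user_id, model, min_history=3, n=5):
    """Serve personalized recommendations only when there's enough signal; otherwise fall back."""
    history = USER_HISTORY.get(user_id, [])
    if len(history) >= min_history:
        recs = model.recommend(user_id, n=n)   # hypothetical model interface
        if recs:
            return recs, "personalized"
    # Cold start (new user, new product, sparse data): fall back to plain popularity
    return TOP_SELLERS[:n], "popularity_fallback"
```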

3. Model assumptions didn't match reality:

  • Assumed taste profiles are stable (they change with seasons, trends)
  • Assumed similar users like similar products (many exceptions)
  • Didn't handle gift purchases well (bought for others, not self)
  • Context-blind (same recommendations for browsing vs. buying mode)

4. Technical problems:

  • Model inference slow (350ms average, target was 100ms)
  • Recommendation cache got stale
  • Cost: $1,200/month AWS SageMaker fees
  • Crashed twice during peak traffic

5. Simpler was actually better:

  • "Frequently bought together" was 3x more effective
  • Category suggestions based on current browse were 2x more effective
  • Simple rules were fast, reliable, and worked

Decision: Turned off AI recommendations, went back to rule-based system.

Cost: $85,000 + 5 months

Phase 2: The AI Demand Forecasting (Months 6-9)

"Okay, recommendations didn't work, but forecasting will be perfect for AI!"

The Problem (as defined)

Current state:

  • Demand forecasting using a 3-month moving average (sketched after this list)
  • Manual adjustments for seasonality
  • Purchase orders based on forecast + safety stock
  • Inventory turnover: 4.2x annually
  • Stockout rate: 6.8%
  • Overstock rate: 12.3%
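
For context, that baseline is only a few lines of pandas. A minimal sketch of a 3-month moving-average forecast with a safety-stock buffer; the file name, column names, and safety factor are illustrative assumptions, not ShopCo's purchasing logic.

```python
import pandas as pd

# One row per (sku, month) of units sold -- assumed schema
monthly_sales = pd.read_csv("monthly_sales.csv", parse_dates=["month"])

def forecast_next_month(sku, safety_factor=1.2):
    """Forecast = mean of the last 3 months of demand, padded with a safety-stock buffer."""
    history = (
        monthly_sales[monthly_sales["sku"] == sku]
        .sort_values("month")["units"]
    )
    base_forecast = history.tail(3).mean()                # 3-month moving average
    return base_forecast, base_forecast * safety_factor   # forecast, order-up-to quantity
```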

Desired state:

  • AI predicts demand with "30% better accuracy"
  • Optimal inventory levels
  • Reduce stockouts and overstock
  • Better cash flow

The Solution (as built)

Continued with AI consulting firm ($45K this phase):

  • "Machine learning is perfect for time-series forecasting"
  • LSTM neural networks
  • Ensemble models
  • "Industry-leading accuracy"

The build:

Month 6: Data collection

  • 3 years of sales history
  • 2,400 SKUs
  • Seasonal patterns
  • Promotion history
  • External factors (weather, trends, etc.)

Month 7: Model development

  • LSTM for time-series
  • Prophet (Facebook's forecasting tool; see the sketch after this list)
  • Traditional ARIMA for comparison
  • Ensemble combining all three
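
For readers curious what the Prophet piece of that ensemble might look like, here's a minimal single-SKU sketch. Prophet expects a dataframe with ds (date) and y (value) columns; the weekly frequency, 12-week horizon, and file name are assumptions for illustration.

```python
import pandas as pd
from prophet import Prophet  # pip install prophet

# Weekly unit sales for one SKU; Prophet requires columns named 'ds' and 'y'
df = pd.read_csv("sku_weekly_sales.csv", parse_dates=["ds"])

model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
model.fit(df)

# Forecast the next 12 weeks with uncertainty intervals
future = model.make_future_dataframe(periods=12, freq="W")
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(12))
```

The tooling is not the hard part; as the results below show, modeling sophistication doesn't compensate for thin, noisy history.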

Month 8: Validation and tuning

  • Back-testing on historical data
  • Hyperparameter optimization
  • Cross-validation
  • Looked promising in testing!

Month 9: Production deployment

  • Automated forecasting pipeline
  • Weekly forecast generation
  • Integration with purchasing system
  • Monitoring and alerts

Launched with cautious optimism.

The Results (Also Disappointing)

90-day evaluation vs. simple moving average:

| Metric | Moving Avg | AI Forecast | Change |
|---|---|---|---|
| Forecast accuracy (MAPE, lower is better) | 23.4% | 26.1% | 12% worse |
| Stockout rate | 6.8% | 8.2% | 21% worse |
| Overstock rate | 12.3% | 14.7% | 20% worse |
| Inventory turns | 4.2x | 3.8x | 10% worse |
| Cost | $0 | $800/month | infinitely worse |

AI forecasting was worse than the simple 3-month moving average.
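
For reference, MAPE (mean absolute percentage error), the accuracy metric in the table, is simple to compute. A minimal sketch, assuming aligned arrays of actual and forecast demand.

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error; lower is better. Skips periods with zero actual demand."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    nonzero = actual != 0
    return np.mean(np.abs((actual[nonzero] - forecast[nonzero]) / actual[nonzero])) * 100
```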

What Went Wrong (Again)

Another post-mortem:

1. Not enough data:

  • Many SKUs only had 1-2 years sales history
  • New products: no historical data
  • Seasonal products: only 3-4 data points per year
  • AI models need lots of data; they didn't have it

2. External factors were unpredictable:

  • Model tried to learn from weather, social trends, competitor actions
  • These relationships were noisy or non-existent
  • Added complexity without value

3. Promotion effects misunderstood:

  • AI couldn't distinguish organic demand from promotion-driven spikes
  • Forecasts assumed promotions would repeat
  • Real purchasing decisions were more strategic

4. Simple worked better:

  • Moving average was stable, understandable
  • Easy to adjust manually for known factors
  • Buyers had domain knowledge AI didn't capture
  • "Black box" AI couldn't explain its predictions

5. Overfitting:

  • Models fit historical data well
  • Failed on future data (what matters)
  • Classic overfitting problem
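
The usual guard against this is walk-forward (out-of-time) validation: train only on the past, score only on the unseen future, and repeat as the cutoff slides forward. A minimal sketch, where fit_and_forecast is any caller-supplied model function and the two-year initial window of weekly data is an assumption.

```python
import numpy as np

def walk_forward_mape(series, fit_and_forecast, initial=104, horizon=4):
    """Train on everything before each cutoff, forecast the next `horizon` points, average the error."""
    series = np.asarray(series, dtype=float)
    errors = []
    for cutoff in range(initial, len(series) - horizon + 1, horizon):
        train = series[:cutoff]
        actual = series[cutoff:cutoff + horizon]
        forecast = np.asarray(fit_and_forecast(train, horizon), dtype=float)  # caller-supplied model
        nonzero = actual != 0
        errors.append(np.mean(np.abs((actual[nonzero] - forecast[nonzero]) / actual[nonzero])) * 100)
    return float(np.mean(errors))
```

A model that looks great on in-sample back-tests but degrades here is overfitting, which is the pattern ShopCo hit in production.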

Decision: Abandoned AI forecasting, went back to moving averages + buyer judgment.

Cost: $45,000 + 4 months

The Total Failure

Financial Cost

| Phase | Investment | Result | Value Delivered |
|---|---|---|---|
| AI Recommendations | $85,000 | Turned off after 30 days | $0 |
| AI Forecasting | $45,000 | Abandoned after 90 days | $0 |
| Opportunity cost | $15,000 | Delayed actual improvements | Negative |
| Total | $145,000 | Nothing deployed | $0 |

Organizational Cost

Beyond money:

  • 9 months focused on AI instead of real improvements
  • Team morale hit ("we spent all that for nothing?")
  • Board skepticism ("AI doesn't work")
  • Delayed simpler initiatives that would have helped
  • Reputational cost (quietly removed "AI-powered" from marketing)

The Real Opportunity Cost

What they could have done with $145K and 9 months:

  • Improved product photography ($30K) - proven ROI
  • Better product descriptions ($15K) - proven ROI
  • Faster website (performance optimization: $25K) - proven ROI
  • Better search functionality ($35K) - proven ROI
  • Email marketing automation ($20K) - proven ROI
  • Customer service tooling ($20K) - proven ROI

Estimated combined impact: 15-25% revenue increase

Instead: Chased AI and got nothing.

The Post-Mortem: What We Got Wrong

1. Started with Solution, Not Problem

Mistake: "We need AI" came before "what problem are we solving?"

Should have: Identified business problems, evaluated if AI was right solution

Lesson: AI is a tool, not a goal. Start with problems, not technology.

2. Unrealistic Expectations

Mistake: Expected Amazon/Netflix-level results with fraction of their data and investment

Should have: Understood their data limitations and what's actually achievable

Lesson: Amazon has billions of data points and spent millions building their recommendation engine. You have thousands of data points and $85K. Results will differ.

3. Data Readiness

Mistake: Didn't validate data quality before building models

Should have: Data audit first, model second

Lesson: "Garbage in, garbage out" isn't just a saying. Most AI projects fail on data, not algorithms.

4. Ignored the Baseline

Mistake: Didn't respect that simple solutions were working okay

Should have: Quantified baseline performance, set realistic improvement targets

Lesson: When simple rules achieve 2.3% CTR and you need AI to beat that, you're fighting uphill. AI's advantage is when simple rules don't exist.

5. Wrong Problems for AI

Mistake: Chose problems where AI wasn't actually better solution

Should have: Evaluated if AI was right tool for these specific problems

Lesson: Not every problem is an AI problem. Sometimes simple rules, statistics, or human judgment are better.

6. Consultant Incentives

Mistake: Hired firm that sold AI, expected objective advice

Should have: Separated problem assessment from solution delivery

Lesson: If you ask an AI consulting firm if you need AI, answer is always "yes."

Two Years Later: Successful AI Implementation

The story doesn't end with failure. ShopCo learned and tried again.

The Different Approach (2025)

Problem identification first:

  • Customer support was receiving the same questions repeatedly
  • 40% of support tickets were simple FAQs
  • The support team spent 15 hours/week answering those questions
  • Cost: $35,000 annually in support time

Solution evaluation:

  • Could an AI chatbot handle the FAQs?
  • The problem was well-suited: narrow domain, lots of examples, clear success metric
  • Alternative: a better FAQ page (cheaper but less effective)
  • Decision: try AI, but with realistic expectations

Implementation:

  • Used the OpenAI API, with no custom model training (see the sketch after this list)
  • $60K for implementation (chatbot + knowledge base integration)
  • 6-week timeline
  • Clear success metric: Reduce FAQ support tickets by 50%
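
A minimal sketch of the kind of FAQ assistant described above, built on the OpenAI chat completions API. The model name, the keyword-overlap retrieval, and the handoff convention are illustrative assumptions, not ShopCo's actual implementation (a real build would typically use embedding-based retrieval over the knowledge base).

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_faq(question: str, knowledge_base: list[str]) -> dict:
    """Answer from the FAQ knowledge base, or signal a handoff to a human agent."""
    # Naive keyword retrieval keeps the sketch short; production systems use embeddings
    words = question.lower().split()
    relevant = [doc for doc in knowledge_base if any(w in doc.lower() for w in words)]
    context = "\n\n".join(relevant[:3]) or "NO MATCHING FAQ FOUND"

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model choice
        messages=[
            {"role": "system", "content": (
                "Answer the customer's question using ONLY the FAQ context below. "
                "If the context does not answer it, reply with exactly HANDOFF.\n\n" + context
            )},
            {"role": "user", "content": question},
        ],
    )
    answer = response.choices[0].message.content.strip()

    if answer == "HANDOFF" or context == "NO MATCHING FAQ FOUND":
        return {"handled_by_ai": False, "answer": None}  # route to the human support queue
    return {"handled_by_ai": True, "answer": answer}
```

The clear success metric is easy to instrument here: count conversations that end with handled_by_ai=True versus those routed to a human.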

Results:

  • FAQ tickets reduced 67% (better than target)
  • Support team time freed: 22 hours/week
  • Customer satisfaction maintained (4.1/5)
  • Cost: $60K one-time + $400/month API costs
  • ROI: Paid for itself in 10 months

Why This Worked

Right problem:

  • Narrow, well-defined
  • Lots of training data (support ticket history)
  • Clear success metric
  • AI actually better than alternatives

Realistic expectations:

  • Target: 50% reduction (not 90%)
  • Knew it wouldn't be perfect
  • Human handoff for complex questions
  • Measured actual performance, not projected

Appropriate technology:

  • Used existing AI (OpenAI) instead of building custom
  • Simpler, faster, cheaper
  • Focused on integration and UX, not AI research

Data quality:

  • Support tickets were well-structured
  • Questions were labeled and categorized
  • High-quality training data

Lessons for AI Projects

Before Starting Any AI Project

Questions to answer honestly:

  1. What specific business problem are we solving?

    • Not "we need AI"
    • Specific, measurable problem
  2. How does the current solution work?

    • What's the baseline performance?
    • Why isn't it good enough?
  3. Why is AI the right solution?

    • What can AI do that simple rules/statistics can't?
    • What are the alternatives?
  4. Do we have the data?

    • Quality: Is it accurate and clean?
    • Quantity: Do we have enough examples?
    • Labeling: Is it properly categorized?
  5. What does success look like?

    • Specific metrics
    • Realistic targets (not vendor promises)
    • Timeline for evaluation
  6. What's the fallback plan?

    • If AI doesn't work, then what?
    • Can we fail fast and cheap?

Red Flags for AI Projects

  • "We need AI" before identifying specific problem
  • Expectations based on Google/Amazon results (you're not Google/Amazon)
  • No baseline performance measurement
  • Data assessment comes after project starts
  • Success metrics are vague ("better personalization")
  • Vendor/consultant selling AI, not solving problems
  • Board/executive pressure to "do AI"
  • No plan B if AI doesn't work

Green Flags for AI Projects

  • Specific business problem with measurable impact
  • Current solution has clear limitations
  • AI demonstrably better than alternatives
  • High-quality, abundant data available
  • Realistic success metrics set
  • Pilot/MVP approach with fast feedback
  • Team understands AI capabilities and limitations
  • Clear ROI path

When AI Makes Sense (and When It Doesn't)

Based on failure + eventual success:

Good AI Problems

  • Large amounts of data (10,000+ examples to learn from)
  • Narrow, well-defined domain
  • Clear success metrics
  • Pattern recognition rather than open-ended prediction
  • Humans struggle with the scale of the task, not its complexity
  • An acceptable error rate

Bad AI Problems

  • Small datasets (hundreds of examples)
  • Broad, ill-defined domain
  • Vague success criteria
  • Forecasting complex systems with many unknowns
  • Simple rules work well enough
  • Zero error tolerance required
  • Trying to replace human judgment on complex decisions

The Bottom Line

ShopCo spent $145,000 and 9 months on AI projects that delivered zero value.

Then they spent $60,000 and 6 weeks on an AI project that solved a real problem and paid for itself.

The difference wasn't the technology. It was the approach.

First attempt:

  • "We need AI" → find problems for AI to solve
  • Unrealistic expectations from vendor promises
  • Poor data, wrong problems, no baseline

Second attempt:

  • Real problem → evaluate if AI is right solution
  • Realistic expectations based on actual data
  • Right problem, good data, clear metrics

AI isn't magic. It's a tool. Like any tool, it works great for some jobs and terribly for others.

The expensive lesson: Figure out if you have an AI problem before building an AI solution.

We're Thalamus. Enterprise capability without enterprise gatekeeping.

If you're considering AI projects, we should talk. Not to sell you AI (we'll tell you if you don't need it), but to help you identify if AI is the right tool for your actual problems.

Sometimes the most valuable consulting is preventing you from wasting $145K on AI you don't need.

And sometimes the best AI strategy is knowing when not to use AI.
