ML-Powered Recommendations: 32% AOV Increase
The Problem
ShopScale was stuck in the middle. Too big for basic recommendations, too small for Amazon’s proprietary ML. Their existing system used simple collaborative filtering:
# Old approach: item-to-item similarity
def get_recommendations(user_id):
    purchases = get_user_purchases(user_id)
    similar_items = find_similar(purchases)
    return similar_items[:10]
Results? Underwhelming:
- AOV stuck at $72 (industry avg: $85)
- Cart abandonment: 68%
- Conversion rate: 2.1% (industry avg: 3.2%)
CTO Lisa Park: “We were leaving money on the table. Every unpersonalized impression was lost revenue.”
The Solution
Kurai built a three-tier recommendation system in 10 weeks:
Tier 1: Real-time Personalization
from sklearn.metrics.pairwise import cosine_similarity
import redis
import numpy as np

# Hybrid scoring function
def score_item(user, item, context):
    # Collaborative filtering (40%)
    cf_score = collaborative_filter_score(user, item)
    # Content-based (30%)
    cb_score = content_similarity(user.preferences, item.features)
    # Contextual (20%)
    context_score = context_relevance(item, context.time, context.device)
    # Popularity boost (10%)
    popularity_score = item.global_popularity
    return (
        0.40 * cf_score +
        0.30 * cb_score +
        0.20 * context_score +
        0.10 * popularity_score
    )
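To see how the weighted blend behaves, here is a minimal, self-contained sketch. It takes the four component scores directly as numbers assumed to lie in [0, 1], because the case study does not show how `collaborative_filter_score`, `content_similarity`, and `context_relevance` are implemented. The `ComponentScores` class, the `hybrid_score` and `top_n` helpers, and the item names and values are all hypothetical.

```python
from dataclasses import dataclass

# Weights taken from the scoring function above; they sum to 1.0.
WEIGHTS = {"cf": 0.40, "cb": 0.30, "context": 0.20, "popularity": 0.10}


@dataclass
class ComponentScores:
    """Hypothetical container; each component is assumed normalized to [0, 1]."""
    cf: float
    cb: float
    context: float
    popularity: float


def hybrid_score(s: ComponentScores) -> float:
    """Weighted sum of the four component scores."""
    return (
        WEIGHTS["cf"] * s.cf
        + WEIGHTS["cb"] * s.cb
        + WEIGHTS["context"] * s.context
        + WEIGHTS["popularity"] * s.popularity
    )


def top_n(candidates: dict[str, ComponentScores], n: int = 10) -> list[str]:
    """Return the n candidate names with the highest hybrid score."""
    return sorted(
        candidates, key=lambda name: hybrid_score(candidates[name]), reverse=True
    )[:n]


# Made-up candidates for illustration only.
candidates = {
    "running_shoes": ComponentScores(cf=0.9, cb=0.7, context=0.5, popularity=0.3),
    "rain_jacket": ComponentScores(cf=0.4, cb=0.8, context=0.9, popularity=0.6),
    "phone_case": ComponentScores(cf=0.2, cb=0.1, context=0.3, popularity=1.0),
}
ranking = top_n(candidates, n=2)
```

Because popularity carries only 10% of the weight, a globally popular but poorly matched item (`phone_case` here) still ranks below items with strong collaborative or contextual signals.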
Tier 2: Batch Training Pipeline
- Nightly model retraining (2M user-item interactions)
- Feature engineering: 150+ features per user/item
- Matrix factorization: SVD for dimensionality reduction
- A/B testing: 5% traffic to new models before rollout
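As an illustration of the matrix factorization step, the sketch below factorizes a tiny user-item interaction matrix with a truncated SVD in NumPy. The matrix values, the rank `k = 2`, and the `recommend` helper are made up for the example; the case study does not state which SVD implementation or rank the production pipeline uses, and a matrix built from 2M interactions would more plausibly need a sparse solver than dense `np.linalg.svd`.

```python
import numpy as np

# Hypothetical interaction matrix: rows are users, columns are items,
# entries are interaction strengths (0 = no interaction observed).
interactions = np.array([
    [5.0, 4.0, 0.0, 0.0],
    [4.0, 5.0, 0.0, 1.0],
    [0.0, 0.0, 5.0, 4.0],
    [0.0, 1.0, 4.0, 5.0],
])

k = 2  # number of latent factors kept (assumed value)

# Thin SVD, then keep only the k largest singular values.
U, s, Vt = np.linalg.svd(interactions, full_matrices=False)
user_factors = U[:, :k] * s[:k]   # shape (n_users, k)
item_factors = Vt[:k, :].T        # shape (n_items, k)

# Rank-k reconstruction: a predicted affinity for every user-item pair.
predicted = user_factors @ item_factors.T


def recommend(user_idx: int, n: int = 1) -> list[int]:
    """Return indices of the n highest-scoring items the user has not interacted with."""
    scores = predicted[user_idx].copy()
    scores[interactions[user_idx] > 0] = -np.inf  # hide items already seen
    return [int(i) for i in np.argsort(scores)[::-1][:n]]
```

The low-rank factors are what gets stored and served: scoring a user against an item is then a dot product of two length-`k` vectors.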
Tier 3: Multi-Armed Bandit
- Exploration vs. exploitation balance
- Thompson sampling for cold-start
- Contextual bandits for seasonal items
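The following is a minimal sketch of Thompson sampling with a Beta-Bernoulli model, one common way to handle the exploration/exploitation trade-off listed above. It is not the production implementation: the `BetaBernoulliArm` class, the `choose_arm` helper, the item names, and the simulated click rates are all hypothetical.

```python
import random


class BetaBernoulliArm:
    """One candidate item with a Beta posterior over its click rate."""

    def __init__(self) -> None:
        self.successes = 0
        self.failures = 0

    def sample(self, rng: random.Random) -> float:
        # Beta(1, 1) prior is uniform, so a brand-new item can draw a high
        # value and get shown, which is what helps with cold start.
        return rng.betavariate(self.successes + 1, self.failures + 1)

    def update(self, clicked: bool) -> None:
        if clicked:
            self.successes += 1
        else:
            self.failures += 1


def choose_arm(arms: dict[str, BetaBernoulliArm], rng: random.Random) -> str:
    """Draw one sample per arm and pick the arm with the largest draw."""
    samples = {name: arm.sample(rng) for name, arm in arms.items()}
    return max(samples, key=samples.get)


# Simulation with made-up click-through rates.
rng = random.Random(0)
true_ctr = {"item_a": 0.03, "item_b": 0.09, "item_c": 0.05}
arms = {name: BetaBernoulliArm() for name in true_ctr}
pulls = {name: 0 for name in true_ctr}

for _ in range(5000):
    name = choose_arm(arms, rng)
    pulls[name] += 1
    arms[name].update(rng.random() < true_ctr[name])
```

As evidence accumulates, each arm's posterior narrows, so items with a low observed click rate are sampled less often without ever being excluded outright.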
Infrastructure:
- Model serving: FastAPI + Uvicorn (async)
- Caching: Redis Cluster (6 nodes, 50K QPS)
- Feature store: PostgreSQL + materialized views
- Monitoring: Evidently AI for model drift detection
- Cloud: AWS (m5.2xlarge instances, auto-scaling)
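A caching layer in front of model serving is often wired up as a cache-aside lookup. The sketch below shows that pattern with an in-memory dictionary standing in for Redis so that it runs without a server; the `TTLCache` class, the `recs:{user_id}` key format, the 300-second TTL, and the `slow_model` function are assumptions for illustration, not details from the ShopScale deployment.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """In-memory stand-in for Redis: get/set with per-key expiry."""

    def __init__(self) -> None:
        self._store: dict[str, tuple[float, list[str]]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Optional[list[str]]:
        entry = self._store.get(key)
        if entry is not None and entry[0] > time.monotonic():
            self.hits += 1
            return entry[1]
        self.misses += 1
        return None

    def set(self, key: str, value: list[str], ttl_seconds: float) -> None:
        self._store[key] = (time.monotonic() + ttl_seconds, value)


def get_recommendations_cached(
    user_id: str,
    cache: TTLCache,
    compute: Callable[[str], list[str]],
    ttl_seconds: float = 300.0,
) -> list[str]:
    """Cache-aside: return the cached list if present, else compute and store it."""
    key = f"recs:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    recs = compute(user_id)
    cache.set(key, recs, ttl_seconds)
    return recs


# Usage: the second request for the same user never reaches the model.
calls: list[str] = []


def slow_model(user_id: str) -> list[str]:
    calls.append(user_id)
    return [f"item_{i}" for i in range(3)]


cache = TTLCache()
first = get_recommendations_cached("u42", cache, slow_model)
second = get_recommendations_cached("u42", cache, slow_model)
```

With a hit rate of 92%, as reported under Performance, only 8% of requests would reach the model in this pattern, which is 1 / 0.08 = 12.5 times fewer model calls than with no cache.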
The Results
Immediate Impact (Month 1):
- AOV: $72 → $88 (+22%)
- Cart abandonment: 68% → 52%
- Conversion rate: 2.1% → 2.8%
After Optimization (Month 3):
- AOV: $72 → $95 (+32%)
- Cart abandonment: 68% → 49%
- Conversion rate: 2.1% → 3.1%
- Recommendation CTR: 3.2% → 8.7%
Revenue Impact:
- Previous annual revenue: $180M
- New annual revenue: $192.5M
- Increase: $12.5M/year
Performance:
- Latency: P50 15ms, P95 40ms, P99 85ms
- Throughput: 50K recommendations/second
- Uptime: 99.97% (SLA guaranteed)
- Cache hit rate: 92%
Recommendation Types
- Homepage Personalization (38% of impressions)
  - “Recommended for You” based on browsing history
  - 12.3% CTR (vs. 4.1% baseline)
- Product Page (45% of impressions)
  - “Customers Also Bought” + “Complete the Look”
  - 9.7% add-to-cart rate (vs. 2.9%)
- Cart Abandonment (12% of impressions)
  - Email recommendations based on cart contents
  - 18.4% recovery rate (vs. 8.2%)
- Post-Purchase (5% of impressions)
  - Cross-sell recommendations
  - 22% repeat purchase rate
Customer Feedback
“Our AOV jumped $20 in three months. The recommendations feel scarily good—it’s like the system knows what I want before I do.”
— James Wilson, ShopScale Merchant
What’s Next
Phase 2 includes:
- Visual search: “Shop the look” from uploaded photos
- Voice recommendations: Integration with Alexa/Google Home
- Social proof: Show friend purchases in recommendations
- Dynamic pricing: ML-optimized discounts per user
Technology Stack
- ML Framework: scikit-learn + LightFM
- Serving: FastAPI + Uvicorn (async)
- Caching: Redis Cluster with Sentinel
- Database: PostgreSQL 15 with pgvector
- Queue: RabbitMQ for batch jobs
- Monitoring: Prometheus + Grafana
- Experiment tracking: MLflow
- Infrastructure: AWS ECS + DynamoDB
Key Metrics
| Metric | Before | After | Improvement |
|---|---|---|---|
| AOV | $72 | $95 | +32% |
| Cart abandonment | 68% | 49% | -28% |
| Conversion rate | 2.1% | 3.1% | +48% |
| Rec CTR | 3.2% | 8.7% | +172% |
| Add-to-cart rate | 2.9% | 9.7% | +234% |
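The Improvement column reports relative change, not percentage-point difference: cart abandonment fell 19 points, which is 28% of the starting 68%. The short sketch below recomputes each row from the Before and After values in the table; the `relative_change` helper name is hypothetical.

```python
def relative_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs copied from the table above.
metrics = {
    "AOV": (72.0, 95.0),
    "Cart abandonment": (68.0, 49.0),
    "Conversion rate": (2.1, 3.1),
    "Rec CTR": (3.2, 8.7),
    "Add-to-cart rate": (2.9, 9.7),
}

improvements = {
    name: round(relative_change(before, after))
    for name, (before, after) in metrics.items()
}
# improvements == {"AOV": 32, "Cart abandonment": -28, "Conversion rate": 48,
#                  "Rec CTR": 172, "Add-to-cart rate": 234}
```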
ROI Breakdown
- Investment: $450K (development + infrastructure)
- Annual revenue increase: $12.5M
- Annual infra cost: $120K
- Net ROI: 2,578% in Year 1
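The case study does not say which formula produced the Net ROI figure. As a rough cross-check, the sketch below applies two common definitions to the numbers listed above; both are assumptions, and the variable and function names are hypothetical. Neither definition reproduces 2,578% exactly, so the original calculation presumably used a cost basis that is not itemized here.

```python
investment = 450_000                  # development + infrastructure, from the list above
annual_revenue_increase = 12_500_000
annual_infra_cost = 120_000

net_gain = annual_revenue_increase - investment - annual_infra_cost


def roi_percent(gain: float, cost_basis: float) -> int:
    """ROI as a whole-number percentage of the chosen cost basis."""
    return round(gain / cost_basis * 100)


# Assumption A: infra cost is an expense, ROI is measured against the investment only.
roi_vs_investment = roi_percent(net_gain, investment)

# Assumption B: ROI is measured against all first-year spending.
roi_vs_total_cost = roi_percent(net_gain, investment + annual_infra_cost)
# roi_vs_investment == 2651, roi_vs_total_cost == 2093
```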
Timeline
- Weeks 1-3: Data pipeline and feature engineering
- Weeks 4-6: Model development and offline testing
- Weeks 7-8: Real-time serving infrastructure
- Week 9: A/B testing (10% traffic)
- Week 10: Full rollout
Lessons Learned
- Cold start is hard: New users got poor recommendations; solved with popularity-based fallback
- Cache everything: 92% cache hit rate reduced load by 10x
- Measure business metrics: ML accuracy matters, but revenue matters more
- A/B test everything: One “better” model actually hurt revenue by 4%
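One way to decide whether a variant really differs from the control in an A/B test is a two-proportion z-test on conversion counts. The sketch below is a generic illustration using only the standard library; the sample sizes (10,000 sessions per arm) and conversion counts are made up, the `two_proportion_z_test` name is hypothetical, and the case study does not say which statistical test ShopScale used. Revenue per session, the metric behind the 4% regression mentioned above, would need a test for continuous data instead.

```python
import math


def two_proportion_z_test(
    conv_a: int, n_a: int, conv_b: int, n_b: int
) -> tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both arms convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical experiment: control converts at 2.1%, variant at 2.8%.
z, p_value = two_proportion_z_test(conv_a=210, n_a=10_000, conv_b=280, n_b=10_000)
significant = p_value < 0.05
```

A negative `z` with a small p-value would flag a variant that performs worse than control, which is the case a rollout gate most needs to catch.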
ShopScale’s recommendation engine now drives $1M+ in revenue monthly. They’re expanding to fashion-specific features like “shop the look” and size recommendations.