AI / ML Engineer · Canada

I build AI systems
that work in production.

Six years taking hard problems from whiteboard to deployment. I think in trade-offs, own the full stack, and tie every model decision to business outcomes.

6+
Years in ML
12+
Models Shipped
$2M+
Business Impact
ML Engineering
End-to-end: architecture, training pipelines, deployment, monitoring.
NLP & LLM Systems
RAG, fine-tuning, prompt engineering — grounded in eval data.
MLOps & Infrastructure
Feature stores, CI/CD for models, cloud-native at scale.
Applied Research
I know when to use SOTA and when to ship something simpler.
What sets me apart
01 I think in trade-offs
Every model decision has a cost. I document them, communicate them clearly, and own the consequences — latency vs. accuracy, cost vs. performance, complexity vs. maintainability.
02 Production-first mindset
A model that can't scale, be monitored, or fail gracefully is a prototype. I engineer for the real world from the first design decision — not as an afterthought.
03 Business-driven metrics
I tie model performance to business outcomes. The question is never just "what's the accuracy?" — it's "does moving this metric 5% actually change what matters to the company?"
04 Clear across every room
I can explain a transformer to a CTO and a deployment pipeline to a DevOps engineer — both precisely, neither condescendingly. Communication is part of the job.
Experience
2022–present
Senior ML Engineer
TechCorp Inc.
Led cross-functional ML team. Shipped 4 production models. Reduced inference latency 40% via quantization and batching.
2020
ML Engineer
DataAI Solutions
NLP pipelines for document classification and entity extraction. Full lifecycle ownership from ingestion to A/B testing.
2018
Data Scientist
StartupXYZ
Churn prediction and demand forecasting. Worked directly with product to operationalize insights.
Technical Stack
Languages
Python · SQL · Bash · Go
ML / Frameworks
PyTorch · Hugging Face · scikit-learn · XGBoost · LangChain · ONNX
MLOps / Cloud
AWS SageMaker · MLflow · Docker · Kubernetes · Airflow · Terraform
Data
Spark · PostgreSQL · Snowflake · Kafka
What I'm looking for

Seeking a role where technical depth is valued, not just output speed. I thrive when there are real, difficult problems — not just dashboards to maintain or models to retrain on a schedule.

Ideally: a product-focused team in Canada, working on applied AI with meaningful scale. I want ownership and colleagues who hold each other to high standards.

Open to full-time or contract, remote or hybrid. Not the right fit for pure analyst roles or organizations where ML is still in the "exploration" phase with no deployment path.

01
Real-Time Fraud Detection
Fintech · 2023 · Production
Live · XGBoost · Kafka · AWS
Problem
Mid-size fintech losing ~$1.4M/year to fraud. Rule-based system had a 38% false positive rate, blocking real customers. Fraud team drowning in manual review with no scalable path.
Solution
Two-layer detection: a lightweight XGBoost model for real-time scoring in under 30 ms, plus daily retraining to adapt to new fraud patterns. Kafka for event streaming; rolling-window features for behavioral signals.
Stack
Python · XGBoost · Kafka · AWS Lambda · SageMaker · MLflow · PostgreSQL
Results
$1.1M
Fraud losses prevented / yr
−68%
False positive rate
28ms
p95 inference latency
99.4%
Uptime over 12 months
Fraud Detection Dashboard
[ Screenshot or GIF — to be added ]
Engineering Thinking — decisions & trade-offs
Why XGBoost?
Evaluated against LightGBM and a shallow neural net. XGBoost was the best latency–accuracy fit under the 50 ms SLA; the neural net was 2% more accurate but 4× slower — not justified.
Trade-off
Daily retraining over online learning — more predictable, auditable, and easy to roll back. Online learning adds risk of concept drift poisoning at production scale.
Limitations
Struggles with coordinated fraud rings below individual thresholds. A graph-based layer would add ~15% recall — on the v2 roadmap.
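The rolling-window behavioral features behind the real-time scoring layer can be sketched in a few lines. This is a minimal illustration, not the production feature schema: the one-hour horizon, per-card windowing, and feature names here are assumptions for the example; in the real pipeline these features would be passed to the trained XGBoost model.

```python
from collections import deque
from dataclasses import dataclass, field

# Minimal sketch of per-card rolling-window features feeding a real-time
# fraud model. Window size and feature names are illustrative only.

@dataclass
class RollingWindow:
    """Keeps recent transaction amounts within a sliding time horizon."""
    horizon_s: float
    events: deque = field(default_factory=deque)  # (timestamp, amount)

    def add(self, ts: float, amount: float) -> None:
        self.events.append((ts, amount))
        self._evict(ts)

    def _evict(self, now: float) -> None:
        # Drop events older than the horizon from the left of the deque.
        while self.events and now - self.events[0][0] > self.horizon_s:
            self.events.popleft()

    def features(self, now: float) -> dict:
        self._evict(now)
        amounts = [a for _, a in self.events]
        return {
            "txn_count": len(amounts),
            "txn_sum": sum(amounts),
            "txn_max": max(amounts, default=0.0),
        }

# Usage: one window per card; in the real pipeline the feature dict would
# go to the trained model (e.g. model.predict_proba) for scoring.
window = RollingWindow(horizon_s=3600)
window.add(ts=0.0, amount=20.0)
window.add(ts=10.0, amount=500.0)
feats = window.features(now=15.0)
```

The deque-based eviction keeps per-event cost amortized O(1), which is what makes sub-50 ms scoring plausible at streaming volume.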
[ System Architecture Diagram — to be added ]
02
Internal Knowledge Assistant (RAG)
Enterprise SaaS · 2024 · Production
Live · LLM/RAG · Pinecone
Problem
400-person SaaS company with 7+ years of scattered internal docs. New employees took 3+ months to onboard; senior engineers spent 6–8 hrs/week answering repetitive questions.
Solution
RAG-based Q&A with source attribution and confidence scoring. Indexes all internal docs with access control, logs queries for continuous evaluation. Slack integration for zero-friction adoption.
Stack
PythonLangChainGPT-4PineconeFastAPISlack APIAWS S3
Results
−72%
Repetitive internal queries
3.1 wks
Avg. onboarding (was 13 wks)
91%
User satisfaction (n=280)
RAG Chat Interface
[ Screenshot — to be added ]
Engineering Thinking — decisions & trade-offs
Retrieval
Tested BM25, dense-only, and hybrid. Hybrid + Cohere re-ranking improved relevance 23% on our 200-query eval set. The eval set came first — then the decision.
Trade-off
GPT-4 over a self-hosted model for quality and time-to-value. Data privacy handled via access control at ingestion — no cross-team leakage by design.
Limitations
Hallucinations on ambiguous queries (~4%). Source attribution mitigates trust erosion. Adversarial eval pipeline in progress to catch edge cases pre-production.
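As a sketch of the hybrid-retrieval idea: one common way to combine a lexical (BM25-style) ranking with a dense-vector ranking is reciprocal rank fusion (RRF). RRF is an illustrative choice here, not necessarily the fusion the project used; the document IDs are invented, and re-ranking (e.g. Cohere) would run on the fused top-k afterwards.

```python
# Reciprocal rank fusion: merge several ranked lists of document IDs.
# Each list contributes 1 / (k + rank) per document; k dampens the
# influence of lower-ranked hits.

def rrf_fuse(rankings: list, k: int = 60) -> list:
    """Return document IDs sorted by their summed RRF score."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Illustrative inputs: in production these would come from the BM25
# index and the Pinecone vector query respectively.
lexical = ["doc_a", "doc_b", "doc_c"]   # lexical (BM25) order
dense   = ["doc_b", "doc_d", "doc_a"]   # vector-similarity order
fused = rrf_fuse([lexical, dense])
```

Documents that appear high in both lists (here `doc_b`) float to the top, which is the behavior a hybrid retriever is after.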
[ RAG Pipeline Diagram — to be added ]
03
Demand Forecasting at Scale
Retail / E-commerce · 2022 · Production
Shipped · Time Series · Spark
Problem
E-commerce retailer overstocking seasonal items by 22% on average — markdown losses and warehousing costs adding up. Existing forecasts used moving averages with no external signals and couldn't scale past a few hundred SKUs.
Solution
Hierarchical forecasting with a Prophet + XGBoost ensemble, using weather data, promotional calendars, and trend signals as external features. Spark for parallel training across 50K+ SKUs; predictions feed directly into inventory management via API.
Stack
Python · Prophet · XGBoost · PySpark · Airflow · Snowflake · FastAPI
Results
−31%
Overstock rate vs. baseline
$870K
Saved in year one
50K+
SKUs forecasted daily
Forecast Visualization
[ Charts — to be added ]
Engineering Thinking — decisions & trade-offs
Model Choice
Ensemble beat any single model by 8% MAPE. Prophet handles trend/seasonality cleanly; XGBoost learns residuals from external signals. Clean separation of concerns.
Trade-off
Bottom-up reconciliation over top-down: SKU-level explainability matters because buyers need to trust the number, not just use it. Interpretability was a product requirement.
Limitations
Degrades on new products with <8 weeks of history. Cold-start heuristics cover these for now; a meta-learning approach is on the v2 roadmap.
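The residual-ensembling pattern described above reduces to "base forecast plus learned correction". In the sketch below, trivial callables stand in for the fitted Prophet and XGBoost models so the pattern itself is visible; the `promo` flag and its uplift value are invented for illustration.

```python
# Residual ensembling: a base model captures trend/seasonality, a second
# model learns the residuals from external signals, and the forecast is
# their sum per period.

def ensemble_forecast(base_model, residual_model, horizon_features):
    """Forecast each period as base prediction + residual correction."""
    return [base_model(f) + residual_model(f) for f in horizon_features]

# Stand-ins for the fitted models: base predicts a flat seasonal level,
# the residual model adds an uplift when a promotion is flagged.
def base(features):
    return 100.0

def residual(features):
    return 25.0 if features["promo"] else 0.0

forecast = ensemble_forecast(
    base, residual, [{"promo": False}, {"promo": True}]
)
```

Keeping the two models separate is what preserves the "clean separation of concerns": seasonality lives in one component, external-signal corrections in the other, and either can be swapped or retrained independently.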
[ Forecasting Pipeline Diagram — to be added ]