I'm not interested in AI as a feature layer. The products worth building are the ones where AI does something that genuinely couldn't be done otherwise.
Not a chatbot on a dashboard. Not indexing a folder, but letting any team member find the right brand asset by asking a plain question, in any language, from anywhere in the world. Not rolling out a tool, but deploying frontier AI to 1,200 people across four continents in 17 days and building the playbook that later became Breville's global Claude Enterprise deployment.
On AI trust and safety
I think about trust design before capability. What does the system do when it's wrong? Where does the answer come from, and can the user verify it? Who is accountable for the outcome?
These aren't abstract questions. Some of my work here, whether built for work or as a personal project, shows how I think about trust while learning how to build it safely. VAI Santé, my personal oncology agent, was built for a setting where a hallucination has real consequences. Vivid Alpaca, my autonomous trading agent, has a hard execution guard that blocks model output from reaching the broker if risk controls aren't met. The through-line: model output is advisory. System design determines what happens next.
What I won’t ship
AI that replaces human judgment in high-stakes decisions without a visible audit trail.
Systems where confidence is presented without grounding.
Products that optimise for engagement at the cost of accuracy.
I've pushed back on all three at different points in my career, and I'd do it again.
What I’ve built
I build applied AI systems where the interesting problem isn't the model, it's what happens around it: retrieval design, tool use, evaluation, governance, trust, and the bit where a real person has to decide whether to act on what the system says.
Most of this work sits at the edge of what's productised: local-first architectures, MCP-based tooling, agentic workflows, and high-stakes retrieval. Some of it started as a side project. Some of it ended up in production.
Local first • Currently used personally
VAI Santé
A personal AI operating system for managing complex, fragmented medical information.
Built during cancer treatment to turn months of scattered documents, test results, and clinical notes into auditable timelines, consult-ready summaries, and retrieval-grounded workflows. What's technically interesting:
Local-first architecture with provenance-aware RAG, longitudinal memory, and multimodal document ingestion.
Designed for high-stakes environments where hallucination has real consequences.
MCP • Shipped to production
DAM Butler
An MCP server that connects an LLM to Breville's Vault digital asset management system, so assets can be found in natural language.
Breville's GTM teams were losing hours finding brand assets in a 235K+ file library built for librarians, not marketers. I vibe-coded an MCP server connecting a custom GPT to the Brandfolder API so anyone could ask a plain question and get the asset.
Demo'd to C-level in September 2025. Engineering adopted the architecture and shipped it. First MCP-based internal tool at Breville. Asset retrieval: hours → minutes.
MCP • Local first
Espresso Horoscope MCP
An offline MCP server that turns espresso shot metrics and a birth date into a personalised "cosmic reading".
Runs entirely on-device using a local GPT-OSS model via LM Studio. No cloud, no API calls, no data leaving the machine.
Built to explore edge-deployed agentic patterns and offline-first MCP architecture.
Applied ML • Live app
Sourdough Intelligence
A timing calculator for sourdough bakers built on multi-variable regression analysis of fermentation behaviour. Started in 2018 as a raw data and R analysis project. Rebuilt in 2025 with an LLM reasoning layer on top.
What's interesting: it's a case study in the difference between a statistical model and an LLM wrapper. The regression still does the heavy lifting. The LLM handles the explanation.
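The split between the two layers can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the app's code: the predictors (dough temperature and starter percentage), the target (hours of bulk fermentation), and the prompt wording are hypothetical. The point is that the number comes from the regression, and the LLM only receives it as a fixed input to explain.

```python
def fit_ols(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0, *r] for r in rows]  # prepend an intercept column
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    coef = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef


def predict(coef, features):
    """Apply the fitted intercept and slopes to one feature vector."""
    return coef[0] + sum(c * f for c, f in zip(coef[1:], features))


def explanation_prompt(features, hours):
    """Build the LLM prompt; the model explains the estimate but never computes it."""
    temp_c, starter_pct = features  # hypothetical predictors
    return (
        f"The regression model estimates {hours:.1f} hours of bulk fermentation "
        f"at {temp_c} C with {starter_pct}% starter. "
        "Explain this estimate to a home baker. Do not change the number."
    )
```

In use, the app would call `predict` first and pass the result into `explanation_prompt`, so a wrong explanation can never silently change the timing itself.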
Multi-agent trading framework • Live
Vivid Alpaca
A personalised AI trading research platform built on multi-agent LLM orchestration and Alpaca paper-trading infrastructure.
Agents debate conviction on a ticker. A separate execution guard layer handles risk controls, concentration caps, dynamic sizing, and kill switches before any order reaches the broker.
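A guard layer of this kind can be sketched as a pure function that sits between the agents and the broker. The Python below is a simplified illustration, not the Vivid Alpaca implementation: the `Proposal` and `RiskLimits` shapes, the default limits, and the conviction-scaled sizing rule are all assumptions, and no broker API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proposal:
    # Hypothetical shape of what the debating agents hand over.
    ticker: str
    side: str          # "buy" or "sell"
    conviction: float  # 0.0 to 1.0


@dataclass(frozen=True)
class RiskLimits:
    # Illustrative defaults only.
    max_position_pct: float = 0.10  # concentration cap as a share of equity
    min_conviction: float = 0.6
    kill_switch: bool = False


def guard(proposal, equity, position_value, price, limits=RiskLimits()):
    """Return (approved_qty, reason). A quantity of 0 means nothing reaches the broker."""
    if limits.kill_switch:
        return 0, "kill switch engaged"
    if proposal.side != "buy":
        return 0, "only buys are handled in this sketch"
    if price <= 0 or equity <= 0:
        return 0, "invalid price or equity"
    if not 0.0 <= proposal.conviction <= 1.0:
        return 0, "conviction out of range"
    if proposal.conviction < limits.min_conviction:
        return 0, "conviction below threshold"
    headroom = limits.max_position_pct * equity - position_value
    if headroom <= 0:
        return 0, "concentration cap reached"
    # Dynamic sizing: scale the remaining headroom by conviction.
    qty = int((headroom * proposal.conviction) // price)
    if qty < 1:
        return 0, "sized below one share"
    return qty, "approved"
```

Every rejection carries a reason string, which is what makes the guard auditable: the log shows why an order was blocked, not just that it was.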
The engineering value is not market prediction. It's agent orchestration, execution guardrails, UI-configurable goals, and broker-state visibility. It starts as paper trading and runs for 90 days before any deployment with real money. Ask me how it’s doing.
Enterprise AI Transformation · Breville
As Breville's first dedicated AI product leader, I've led enterprise AI adoption across global teams spanning GTM product management, engineering, NPD (New Product Development) and R&D, partnering directly with the CTO on the AI roadmap and the company-wide transformation program.
Key work:
Deployed ChatGPT Enterprise and Codex to 1,200+ employees across Australia, the US, Europe, and Asia. It took only 17 days from cold email to full deployment.
Built the enterprise AI rollout playbook later adopted for Breville's global Anthropic Claude Enterprise rollout
Prototyped DAM Butler MCP (see above). Architecture shipped to production
Built internal agentic workflows, AI literacy programs, and responsible-use frameworks across global teams
Early Claude adopter internally before Breville's wider enterprise rollout
Currently building in my spare time
An AI-native video game using Hermes Agent and multi-agent workflows combined with Paperclip, running on local models and open-source models via OpenRouter and Hugging Face.
Eval frameworks for agentic product decisions and blue-collar service workers
Learning AI/ML and fine-tuning local LLMs on niche datasets (ongoing) using Unsloth and Hugging Face.
AI TOOLS & STACK
Some tools I use on/off these days:
Models & APIs: Anthropic Claude, OpenAI GPT 5+, OpenRouter, Kimi K2.6, Qwen 3.x variants, DeepSeek, MiniMax, LM Studio (local), Ollama (local), and Google Gemini.
Orchestration: Paperclip, Hermes Agent, n8n, LangChain, MCP
Workspace Agents (OpenAI), Managed Agents (Anthropic), Claude Cowork
Coding agents: Claude Code, Codex, Cursor, OpenCode, Antigravity
Languages & data: Python, R, TS, Node, Jupyter, SQL
Infra & tooling: Vercel, Railway, Netlify
Want to chat?
If something here sparked a question, I'm easy to reach. I'm particularly interested in conversations about enterprise AI deployment, agentic systems, evaluation design, and what makes AI products trustworthy in practice.
Find me on LinkedIn, or drop me an email at vnsavitri@protonmail.ch.