How Meridian Technologies Cut AI Costs by 84% Across Three Clouds
Every enterprise AI story starts the same way. Someone on the engineering team spins up an OpenAI integration for a prototype. It works. It ships. Then another team hears about it and sets up their own AI pipeline on a different cloud because that's what they know. Before long, you've got three billing dashboards, three sets of credentials, three governance policies, and a finance team that can't answer a simple question: how much are we actually spending on AI?
That's exactly what happened at Meridian Technologies, a mid-size fintech serving over two million customers across North America. With 500 employees and roughly 50 AI developers spread across multiple teams, Meridian had embraced artificial intelligence faster than most companies their size. The fraud detection team built their models on AWS Bedrock. Customer experience went with Azure OpenAI. The data science group preferred GCP Vertex AI. Within 18 months, what started as healthy experimentation had turned into full-blown multi-cloud AI sprawl.
Three Clouds, Zero Visibility
The pain wasn't theoretical. Meridian's finance team estimated their total AI spend was "somewhere between $15K and $40K per month," which is a remarkable range for a company that prides itself on financial precision. The truth is nobody knew the real number, because each team managed their own provider relationship, their own billing, and their own cost tracking. Getting a unified picture meant pulling three separate invoices, normalizing the data manually, and hoping nobody had spun up an expensive model without telling anyone.
But cost blindness was only the beginning. As a fintech handling sensitive financial data, Meridian needed compliance across SOC 2, HIPAA, GDPR, and ISO 27001. With three separate AI environments, that meant three separate audit trails, three sets of access controls, and three compliance reviews. Their compliance team was essentially doing the same work three times over, and gaps were inevitable. There was no single place to answer the question regulators would eventually ask: who accessed what model, with what data, and when?
Then there was the knowledge problem. Meridian had years of internal documentation covering everything from fraud detection procedures to customer refund policies to API specifications. But this knowledge was trapped in wikis and shared drives that no AI model could access. The customer support bot couldn't reference fraud team documentation. The compliance automation tool didn't know about the latest product specifications. Every AI system operated in its own information silo, which meant every AI system was essentially guessing when it should have been referencing authoritative sources.
The Tipping Point
The breaking point came when Meridian's engineering leadership realized their developers were spending over 20% of their time managing AI infrastructure instead of building AI features. Each team had to maintain its own SDK integrations, handle its own failover procedures, and manage its own model deployments. When Azure had an outage, the customer experience team scrambled to manually reroute traffic. When AWS changed their Bedrock pricing, the fraud team had to recalculate their entire budget. And nobody had built cross-cloud failover because the engineering effort required to bridge three completely different APIs felt insurmountable.
The leadership team evaluated several options: standardizing on a single cloud provider (politically impossible given existing team investments), building a custom abstraction layer (estimated at 6 months of engineering time), or finding a platform that could unify everything without requiring a rewrite.
Enter Bonito
Bonito gave Meridian something none of the other options could: a single AI control plane that connected all three cloud providers through one unified API, without requiring any team to abandon their existing setup. The architecture is straightforward. Bonito sits between Meridian's applications and their cloud AI providers, presenting a single OpenAI-compatible API endpoint that works identically regardless of whether a request is routed to AWS Bedrock, Azure OpenAI, or GCP Vertex AI.
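In practice, "OpenAI-compatible" means the client code never changes per provider. A minimal sketch of the idea, with illustrative model identifiers rather than actual Bonito values:

```python
# Hypothetical sketch: a unified endpoint accepts the standard OpenAI
# chat payload no matter which cloud ultimately serves the request.
# Model IDs below are illustrative, not actual Bonito identifiers.
def chat_request(model: str, user_msg: str) -> dict:
    """Build an OpenAI-style chat payload for the unified endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
    }

# The same payload shape targets an Azure-hosted or an AWS-hosted model;
# only the model identifier changes, so no per-provider SDK is needed.
azure_style = chat_request("gpt-4o", "Is this transaction suspicious?")
aws_style = chat_request("nova-pro", "Is this transaction suspicious?")
```

Because every team already spoke the OpenAI request format, pointing existing integrations at the new endpoint required no rewrites.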
The initial deployment connected all three providers in a single afternoon. Bonito automatically cataloged 381 models across the three clouds, with 241 active and ready for routing. Twelve deployments went live across all providers, from GPT-4o and GPT-4o-mini on Azure to Nova Lite and Nova Pro on AWS to Gemini 2.5 Pro and Gemini 2.5 Flash on GCP. Every single deployment checked in healthy. For the first time, Meridian had a single dashboard showing the status of their entire AI infrastructure.
Smart Routing Changes the Economics
The real transformation came from Bonito's routing policies. Before Bonito, every team defaulted to their most capable (and most expensive) model for every request. The customer support team was sending simple FAQ lookups through GPT-4o at $0.005 per request. The fraud team was running basic classification tasks through expensive models because switching to a cheaper alternative would have meant rewriting their integration code.
Bonito's cost-optimized routing policy changed that calculus entirely. Simple queries that make up roughly 60% of Meridian's AI traffic now route automatically to lightweight models like Amazon Nova Lite and Gemini Flash Lite at near-zero cost. Medium-complexity tasks, representing about 25% of traffic, go to models like Gemini 2.5 Flash and Nova Pro at a fraction of premium pricing. Only the truly complex reasoning tasks, around 15% of total volume, still route to premium models like GPT-4o and Gemini 2.5 Pro.
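The tiering logic described above can be sketched as a simple classifier plus a lookup table. The heuristic, tier thresholds, and per-request prices here are illustrative stand-ins, not Bonito's actual routing rules:

```python
# Hypothetical sketch of cost-tiered routing. Real routers score intent,
# task type, and context length; this toy version keys off prompt shape.
TIERS = {
    "simple": ("amazon.nova-lite", 0.00003),   # illustrative price/request
    "medium": ("gemini-2.5-flash", 0.0003),
    "complex": ("gpt-4o", 0.005),
}

def classify(prompt: str) -> str:
    """Toy complexity heuristic: short questions are 'simple'."""
    if len(prompt) < 200 and "?" in prompt:
        return "simple"
    if len(prompt) < 1000:
        return "medium"
    return "complex"

def route(prompt: str):
    """Pick the cheapest tier that can plausibly handle the request."""
    model, est_cost = TIERS[classify(prompt)]
    return model, est_cost
```

The economics follow directly from the traffic mix: if 60% of requests land in the cheapest tier, the blended per-request cost collapses even though premium models still serve the hard 15%.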
The production testing validated this approach with hard numbers. Across 187 test requests mirroring Meridian's real workload distribution, the average cost per request dropped to $0.000214. At Meridian's scale of 50,000 requests per day, that translates to an API cost reduction from $1.825 million per year down to roughly $182,500, a 90% savings on the single largest line item in their AI budget.
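The projection arithmetic is easy to reproduce. Note that the $0.10 blended baseline rate below is back-derived from the article's $1.825 million annual figure and 50,000 requests per day; it is an implied average, not published provider pricing:

```python
# Reconstructing the annual cost projection from the stated figures.
# The $0.10 blended baseline per-request rate is an implied average,
# back-derived from $1.825M/year at 50,000 requests/day.
requests_per_year = 50_000 * 365            # 18,250,000 requests
baseline_annual = requests_per_year * 0.10  # ~ $1,825,000 before routing
optimized_annual = baseline_annual * 0.10   # ~ $182,500 at 90% savings
savings = baseline_annual - optimized_annual
print(round(baseline_annual), round(optimized_annual), round(savings))
```

The resulting savings figure of roughly $1.64 million is the API line item that anchors the totals later in this piece.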
But Bonito also added cross-cloud failover, something Meridian never had. A single routing policy now designates GCP Gemini as the primary model with automatic fallback to AWS Nova Pro and then GCP Gemini 2.0 Flash. If any provider goes down at 2 AM, traffic reroutes automatically. No pagers. No manual intervention. No customer impact.
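The fallback chain behaves like an ordered retry list. A minimal sketch, with the article's provider ordering and a stand-in call function simulating an outage on the primary:

```python
# Hypothetical sketch of the cross-cloud fallback chain described above.
# Deployment names mirror the article; `call` is a stand-in for the
# actual provider invocation.
FALLBACK_CHAIN = ["gcp:gemini-2.5-pro", "aws:nova-pro", "gcp:gemini-2.0-flash"]

class ProviderDown(Exception):
    pass

def complete(prompt, call, chain=FALLBACK_CHAIN):
    """Try each deployment in order; return the first success."""
    last_err = None
    for deployment in chain:
        try:
            return deployment, call(deployment, prompt)
        except ProviderDown as err:
            last_err = err  # provider outage: fall through to the next one
    raise last_err

# Simulated outage on the GCP primary: traffic lands on AWS automatically.
def fake_call(deployment, prompt):
    if deployment.startswith("gcp:gemini-2.5"):
        raise ProviderDown(deployment)
    return f"ok from {deployment}"
```

The point is that failover is a property of the routing policy, not of each application's code, which is why no pager has to go off at 2 AM.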
AI Context: The Knowledge Breakthrough
Perhaps the most transformative capability was Bonito's AI Context feature, the built-in RAG engine that creates a shared knowledge layer across all models and all clouds. Meridian uploaded five core documents covering their product documentation, compliance procedures, and operational policies. Bonito chunked those into 49 searchable segments totaling about 24,000 tokens, and suddenly every model across every cloud had access to the same authoritative company knowledge.
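The chunking step can be sketched as a sliding token window. The 500-token window and 50-token overlap below are illustrative defaults, not Bonito's actual parameters (the corpus above worked out to 49 segments over roughly 24,000 tokens, i.e. close to 490 tokens per chunk):

```python
# Minimal sketch of overlapping-window chunking for a RAG index.
# Whitespace tokens stand in for real tokenizer output; window and
# overlap sizes are illustrative, not Bonito's actual parameters.
def chunk(text: str, window: int = 500, overlap: int = 50):
    """Split text into overlapping windows of roughly `window` tokens."""
    tokens = text.split()
    chunks, start = [], 0
    while start < len(tokens):
        chunks.append(" ".join(tokens[start:start + window]))
        start += window - overlap  # overlap preserves context at boundaries
    return chunks
```

Each chunk is then embedded and stored, so a query can retrieve only the handful of segments relevant to it rather than the whole document set.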
The difference in output quality was stark. Without RAG, asking Amazon Nova Lite a question covered by the uploaded documentation produced a generic textbook answer with no company-specific information whatsoever. With RAG enabled, the same model on the same cloud returned grounded, accurate responses referencing actual product features and documentation. The same pattern held across every model tested on all three clouds.
Search performance was fast. Eight out of ten knowledge queries returned results in under 500 milliseconds, with an average search time of 484ms across the full test suite. Relevance was strong, with the best queries achieving a 0.8335 similarity score. For Meridian, this means every AI tool across every department can now reference the same source of truth in near-real-time, whether it's the fraud detection system checking the latest compliance policy or the customer support bot looking up refund procedures.
The Numbers Tell the Story
When you add it all up, the projected annual savings are substantial. API cost savings of $1.64 million from smart routing. Operations savings of $300,000 from consolidating three platform management teams into one. Compliance savings of $100,000 from unified auditing instead of three separate reviews. Infrastructure savings of $270,000 from replacing three separate RAG pipelines and monitoring stacks with Bonito's built-in capabilities.
Against a Bonito platform cost of $60,000 per year on the Enterprise tier, Meridian's net projected annual savings come to $2.25 million, representing an 84% total cost reduction and a 37.5:1 return on investment. The payback period is under 10 days.
And perhaps more importantly, Meridian's developers got their time back. Adding a new AI model went from a 2-3 week project to a 5-minute configuration change. Setting up a RAG pipeline dropped from 4-6 weeks per cloud to 30 minutes total. Creating a new routing policy went from weeks of custom code to a 2-minute setup in the dashboard. Compliance audit preparation compressed from a three-month, three-environment ordeal to a single click generating a unified report.
What This Means for Your Team
Meridian's story isn't unique. It's the story of every enterprise that adopted AI organically and is now dealing with the consequences of fragmentation. If you're running AI workloads across two or more cloud providers, if your finance team can't give you a straight answer on total AI spend, if your compliance team is auditing the same thing three different ways, the operational overhead is eating into the value AI is supposed to create.
Bonito was built for exactly this moment. A single control plane that connects your existing providers, routes intelligently across all of them, shares knowledge universally, and gives you the visibility and governance you need to operate AI at scale. You don't have to rip anything out. You don't have to pick a winner among your cloud providers. You just connect them all and let the platform do what platforms do best.
If Meridian's story resonates, start with a free account and connect your first provider. It takes about five minutes, and you'll immediately see what unified AI operations look like.