Multi-Cloud · Cost Optimization · AI Agents · Case Study · AI Orchestration · API Gateway
Feb 22, 2026 · 9 min read · Shabari, Founder

How 7 AI Agents and Multi-Cloud Routing Cut Ad-Tech AI Costs by 30%

Axiom Media Group manages $40 million in annual ad spend for 85+ DTC brands across Meta, Google, TikTok, and CTV. Their programmatic advertising platform automates creative generation, audience segmentation, bid optimization, and campaign reporting. Ad-tech AI isn't a nice-to-have for them — it's the core of the product.

And their AI infrastructure was a mess.

Three Clouds, One Giant Bill Nobody Could Explain

Multi-cloud AI sprawl in ad-tech

Like most ad-tech companies that adopted AI organically, Axiom ended up with three separate cloud AI stacks. AWS Bedrock for creative asset generation. GCP Vertex AI for audience clustering and conversion prediction. Azure OpenAI for ad copy and sentiment analysis. Each stack had its own SDK integration, its own billing dashboard, its own set of API keys managed by different engineers.

Total monthly AI spend: $8,200. But nobody could tell you how much went to creative generation versus bid optimization versus sentiment monitoring. When a client asked "how much AI went into my campaign?", the answer was a shrug.

The cost problem went deeper than visibility. Axiom was using GPT-4o for everything. Generating 50 ad headline variations? GPT-4o. Running basic sentiment classification on social mentions? GPT-4o. Writing product descriptions from spec sheets? GPT-4o. About 70% of their AI requests were routine text generation that didn't need a frontier model, but there was no routing layer to differentiate. Every request hit the most expensive model because that's what was configured.

The Wrong-Model Tax

We see this pattern constantly. Teams start with one model, it works well, and it becomes the default for everything. The problem isn't quality — GPT-4o produces excellent ad copy. The problem is economics at scale.

A quick breakdown of what Axiom was paying versus what they could be paying:

  • Bulk ad copy generation (50 headlines, product descriptions, social captions): Running on GPT-4o at $2.80 per 1K requests. Could run on Gemini Flash at $0.10 per 1K requests — identical quality for template-based creative.
  • Sentiment classification (social mentions, review analysis): Running on GPT-4o. Could run on Nova Lite at $0.07 per 1K requests — this is simple classification, not reasoning.
  • Strategic campaign analysis (bid optimization, competitive intelligence): Legitimately needs a capable model. GPT-4o or Gemini Flash for the reasoning depth required.

The math is straightforward: 70% of requests could move to models that cost 28-40x less at the per-request prices above, without any quality degradation. The remaining 30% stays on premium models where the quality difference matters. On Axiom's bill, routing alone brought monthly AI spend from $8,200 to $5,800, a reduction of roughly 30%.
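The arithmetic behind these figures can be checked in a few lines. This is a minimal sketch: the prices and monthly spend figures are the ones quoted in this post, while the dictionary keys and function name are illustrative choices, not Bonito identifiers.

```python
# Per-1K-request prices quoted in the breakdown above (USD).
# The keys are illustrative labels, not official model identifiers.
PRICE_PER_1K = {
    "gpt-4o": 2.80,
    "gemini-flash": 0.10,
    "nova-lite": 0.07,
}


def price_ratio(expensive: str, cheap: str) -> float:
    """How many times more `expensive` costs than `cheap` per 1K requests."""
    return PRICE_PER_1K[expensive] / PRICE_PER_1K[cheap]


# Monthly AI spend before and after routing, as reported in this post (USD).
SPEND_BEFORE = 8_200
SPEND_AFTER = 5_800

monthly_savings = SPEND_BEFORE - SPEND_AFTER
annual_savings = monthly_savings * 12
reduction_pct = monthly_savings / SPEND_BEFORE * 100

print(f"GPT-4o vs Gemini Flash: {price_ratio('gpt-4o', 'gemini-flash'):.0f}x")
print(f"GPT-4o vs Nova Lite:    {price_ratio('gpt-4o', 'nova-lite'):.0f}x")
print(f"Savings: ${monthly_savings:,}/month, ${annual_savings:,}/year "
      f"({reduction_pct:.1f}% of the original bill)")
```

The per-request price gap (28x and 40x) is much larger than the reduction on the bill (about 29%), which is why the headline number is stated against total spend rather than per-request prices.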

  • 70% of requests didn't need a frontier model
  • 30% cost reduction from smart routing alone
  • $28,800/yr saved just by matching models to tasks

What Bonito Changed

Axiom connected all three providers — AWS, GCP, and Azure — through Bonito's gateway in a single afternoon. The platform cataloged 379 models across all three clouds. For the first time, anyone on the team could see every available model, its pricing, and its capabilities from one dashboard.

The cloud AI routing configuration took about ten minutes. A cost-optimized model routing policy sends creative generation and classification to the cheapest capable model (Nova Lite for high-volume classification, Gemini Flash for creative drafts). A quality-first policy keeps strategic analysis on GPT-4o Mini or Gemini Flash for reasoning depth. Failover chains ensure that if any provider goes down, traffic automatically reroutes. It's the kind of AI cost optimization that pays for itself immediately.
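To make the idea concrete, here is a rough sketch of a task-based routing table with failover. It is not Bonito's configuration format: the `Route` class, the `POLICIES` table, the task-type names, and the order of the fallback entries are all assumptions made for illustration. Only the primary model per task type comes from the description above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    provider: str
    model: str


# Hypothetical policy table: task type -> ordered failover chain.
# The first entry is the primary route; later entries are assumed fallbacks.
POLICIES = {
    "classification": [Route("aws", "nova-lite"), Route("gcp", "gemini-flash")],
    "creative_draft": [Route("gcp", "gemini-flash"), Route("aws", "nova-lite")],
    "strategic_analysis": [Route("azure", "gpt-4o-mini"), Route("gcp", "gemini-flash")],
}


def pick_route(task_type: str, healthy_providers: set[str]) -> Route:
    """Return the first route in the chain whose provider is currently healthy."""
    for route in POLICIES[task_type]:
        if route.provider in healthy_providers:
            return route
    raise RuntimeError(f"no healthy provider for task type {task_type!r}")


# Normal operation: classification goes to the cheapest capable model.
print(pick_route("classification", {"aws", "gcp", "azure"}))
# AWS outage: the same request falls through to the next route in the chain.
print(pick_route("classification", {"gcp", "azure"}))
```

Keeping the chain as an ordered list means cost preference and failover are expressed in one place: the cheapest acceptable model goes first, and each later entry is only used when everything before it is unavailable.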

But cost savings from routing were only half the story. The real transformation came from Bonobot agents.

What the Creative Director Agent Actually Does

Bonobot Creative Director and Bid Optimizer agent workflows

Axiom deployed 7 Bonobot agents across two projects. Let me walk through what these agents actually handle, because "campaign automation" is vague and vague doesn't close deals.

The Creative Director agent generates platform-specific ad creative on demand. A typical request looks like this: *"Generate 5 Meta ad headlines for a DTC skincare brand launching a $38 Vitamin C serum. Target: women 25-40, urban, health-conscious. Key differentiator: dermatologist-developed, clean ingredients. Include emotional hooks."*

The agent comes back with three tiers — safe, bold, and experimental — for each variation. It knows Meta's character limits, Google's quality score factors, and TikTok's unique format requirements. When someone needs a TikTok script, the agent understands that you need a hook in the first two seconds or the viewer is gone: *"Write TikTok ad scripts (15-sec and 30-sec) for a protein bar brand targeting gym-goers 18-30. Tone: energetic, slightly irreverent. Must include hook in first 2 seconds. Product: 20g protein, no artificial sweeteners, $2.49/bar."*

It also evaluates existing creative. Feed it three Google Search headlines for a running shoes campaign with a $35 target CPA, and it scores each on clarity, urgency, relevance, and click-worthiness — with reasoning for every score. This is work that a mid-level copywriter spends 2-3 hours on per client per week. The agent does it in 30 seconds.

How the Bid Optimizer Catches Problems Before They Cost Money

The Bid Optimizer is where the real operational leverage comes from. It analyzes live campaign metrics and produces specific, actionable recommendations with dollar amounts attached.

Here's a real example from testing: *"Analyze this Meta campaign: Spend $4,200/week, Impressions 890K, Clicks 12,400, CTR 1.39%, CPC $0.34, Conversions 445, CPA $9.44, ROAS 4.2x. Frequency 3.8 (up from 2.1 last week). Audience: Lookalike 1% US women 25-44."*

The agent spots what a junior analyst might miss: that frequency spike from 2.1x to 3.8x means the audience is getting saturated. ROAS is still at 4.2x — it looks fine right now — but historical patterns show that once frequency exceeds 4x, ROAS typically degrades within 5-7 days. The agent recommends refreshing the audience or broadening the lookalike percentage *before* performance drops, not after.
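The early-warning logic can be approximated with a simple rule. The sketch below is not how the Bid Optimizer is implemented (the agent is a language model working from a prompt). It only illustrates the heuristic described above, assuming a 4x frequency threshold taken from the text and a naive projection that next week's change equals this week's.

```python
def saturation_risk(freq_now: float, freq_prev: float, threshold: float = 4.0) -> str:
    """Classify audience-saturation risk from week-over-week ad frequency.

    Returns "saturated" if frequency already exceeds the threshold,
    "at_risk" if it would exceed it after one more week at the current
    pace (a simplifying assumption), and "ok" otherwise.
    """
    if freq_now > threshold:
        return "saturated"
    weekly_change = freq_now - freq_prev
    if weekly_change > 0 and freq_now + weekly_change > threshold:
        return "at_risk"
    return "ok"


# The Meta campaign from the example: frequency 3.8, up from 2.1 last week.
print(saturation_risk(freq_now=3.8, freq_prev=2.1))
```

For the example campaign this returns "at_risk": frequency is still under 4x, but one more week at the same pace would push it well past the threshold.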

It also handles budget allocation across ad groups: *"Compare 3 Google ad groups: Brand (CPC $1.20, Conv rate 8.2%, CPA $14.63), Competitor (CPC $3.40, Conv rate 3.1%, CPA $109.68), Generic (CPC $0.89, Conv rate 1.8%, CPA $49.44). Total daily budget $500."* The agent recommends shifting $200 from the Competitor group (CPA 7.5x the Brand group) to Brand, and testing a reduced Generic budget with tighter keyword matching.
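The CPA figures in that prompt follow from CPC divided by conversion rate, which is easy to verify. A short sketch, using the numbers from the example (the variable names are illustrative):

```python
# Ad group metrics from the example prompt: cost per click and conversion rate.
AD_GROUPS = {
    "Brand": {"cpc": 1.20, "conv_rate": 0.082},
    "Competitor": {"cpc": 3.40, "conv_rate": 0.031},
    "Generic": {"cpc": 0.89, "conv_rate": 0.018},
}


def cpa(cpc: float, conv_rate: float) -> float:
    """Cost per acquisition = cost per click / conversion rate."""
    return cpc / conv_rate


cpas = {name: cpa(**metrics) for name, metrics in AD_GROUPS.items()}

# Rank ad groups from cheapest to most expensive acquisition.
ranked = sorted(cpas, key=cpas.get)
for name in ranked:
    multiple = cpas[name] / cpas["Brand"]
    print(f"{name}: CPA ${cpas[name]:.2f} ({multiple:.1f}x Brand)")
```

The ranking makes the recommendation easy to follow: Competitor acquisitions cost about 7.5 times as much as Brand acquisitions, so that is where budget comes from first.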

When CPMs spike 84% in five days ($6.20 to $11.40 on Meta), the agent differentiates between seasonal pressure, competitive bidding, and audience fatigue — and recommends different responses for each scenario.

Audience Analysis and Client Intelligence

Bonobot agent architecture for campaign operations and client intelligence

The Audience Analyst works with real cohort data. Feed it a 12,400-customer dataset broken down by age, AOV, repeat rate, and acquisition channel, and it identifies that the 35-44 segment ($61 AOV, 35% repeat rate) is the highest-LTV target worth concentrating spend on — even though the 25-34 segment is larger by volume.

For a luxury candle brand launching on Meta, it designs a specific lookalike testing sequence: seed with 890 repeat buyers (3+ orders) at 1% lookalike first (highest intent), then broaden to 3% and 5% based on performance. When a retargeting audience hits 8.2x frequency and ROAS drops from 6.1x to 3.4x, the agent diagnoses audience exhaustion and recommends restructuring into a 3-tier engagement funnel with different messaging at each stage.

The Performance Reporter generates client-facing weekly reports with real numbers and honest commentary. For GlowUp Skincare: Meta ROAS 4.79x ($3,800 spend, 612 purchases, CPA $6.21), Google 3.52x, TikTok 2.33x — overall ROAS dipped from 4.1x to 3.99x WoW. When a client escalates because ROAS dropped from 5.2x to 2.8x over three weeks, the agent doesn't sugarcoat it — it identifies that new creative launched in Week 2 coincides with the decline and drafts a recovery plan.

The Market Research agent produces competitive intelligence with specific numbers. For the DTC beauty vertical: Meta CPMs up 22% YoY, TikTok Shop launched a beauty category with a 40% commission reduction that's pulling spend, average CAC now $28 (up from $21 in 2024). For a coffee subscription client spending $8K/month at 3.1x ROAS, it maps competitive positioning against VC-backed Trade Coffee ($50M raised) and mass-market Starbucks Reserve with channel-specific recommendations.

The Contract Analyzer reviews media plans and flags risks that account managers might miss. A $120K Q2 plan for UrbanGear Apparel: the agent calculates expected revenue from the 15% management fee ($18K), flags the 15-day cancellation clause as a risk against the 3-month commitment, and models whether the 10% performance bonus (if ROAS exceeds 5x) is realistically achievable based on the category's historical data.

The Sentiment Monitor classifies social mentions and flags brand safety risks before they crater ad performance. For a reusable water bottle brand, five social posts come in — the agent identifies a product defect cluster (two separate mentions of lid leaking, one metallic taste complaint) and a customer service failure (ghosted for 2 weeks on a warranty claim). It classifies the defect cluster as the top risk and recommends pausing ad spend until the lid issue is publicly addressed. One unresolved product defect thread going viral can tank ad engagement overnight.

The Production Numbers

We validated the entire architecture with end-to-end testing on Bonito's production infrastructure. Here's what we measured:

Gateway performance: 74 requests across all three clouds.

  • AWS Nova Lite: 36 requests (34 successful, 2.6s average latency)
  • GCP Gemini Flash: 21 requests (100% success, 5.2s average latency)
  • Azure GPT-4o Mini: 17 requests queued pending deployment provisioning

Total tokens processed: 36,952. Total cost: $0.032.
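A few derived numbers help put the test run in context. This sketch uses only the figures reported above. The dictionary keys are illustrative labels, and the cost-per-token figure is a blended average across whatever requests produced the reported tokens, not a per-model price.

```python
# Request counts per provider/model, as reported for the test run.
REQUESTS = {
    "aws/nova-lite": 36,
    "gcp/gemini-flash": 21,
    "azure/gpt-4o-mini": 17,  # queued pending deployment provisioning
}
NOVA_LITE_SUCCESSFUL = 34
TOTAL_TOKENS = 36_952
TOTAL_COST_USD = 0.032

total_requests = sum(REQUESTS.values())
nova_success_rate = NOVA_LITE_SUCCESSFUL / REQUESTS["aws/nova-lite"]
cost_per_million_tokens = TOTAL_COST_USD / TOTAL_TOKENS * 1_000_000

print(f"Total requests: {total_requests}")
print(f"Nova Lite success rate: {nova_success_rate:.1%}")
print(f"Blended cost: ${cost_per_million_tokens:.2f} per million tokens")
```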

Agent execution: All 7 agents created and tested with domain-specific prompts — the exact examples described above. Creative Director generated multi-platform ad variations with proper format compliance. Bid Optimizer analyzed real campaign metrics and produced actionable recommendations. Audience Analyst designed lookalike strategies with specific seed audiences and testing sequences. Every agent session was logged with full audit trails.

Concurrency: Axiom's tests ran simultaneously with three other organizations on the same Bonito backend. Zero cross-org data leakage. Zero 500 errors. Gateway correctly tagged and isolated each organization's requests.

  • 74 gateway requests across AWS + GCP + Azure
  • 36,952 tokens processed in production testing
  • $0.032 total inference cost for full test suite

The ROI Math

Ad-tech ROI breakdown: 3.5:1 return on investment

Here's the full cost breakdown, annualized:

Before Bonito: $8,200/month in AI spend ($98,400/year). Three separate billing dashboards. No cost attribution. No automation. Manual campaign analysis consuming ~3 FTE hours per day.

With Bonito: $5,800/month in AI spend (30% reduction via smart routing) plus $2,942/month platform cost ($499 Pro subscription + 7 agents at $349 each). Total: $8,742/month ($104,904/year).
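The monthly and annual totals can be reproduced directly from the prices quoted here. A minimal sketch, where the constant names are illustrative:

```python
# Figures quoted in this post (USD per month).
AI_SPEND_BEFORE = 8_200
AI_SPEND_AFTER = 5_800
PRO_SUBSCRIPTION = 499
AGENT_PRICE = 349
AGENT_COUNT = 7

platform_monthly = PRO_SUBSCRIPTION + AGENT_COUNT * AGENT_PRICE
total_monthly = AI_SPEND_AFTER + platform_monthly

before_annual = AI_SPEND_BEFORE * 12
after_annual = total_monthly * 12

print(f"Platform cost: ${platform_monthly:,}/month")
print(f"Total with Bonito: ${total_monthly:,}/month (${after_annual:,}/year)")
print(f"Before Bonito: ${before_annual:,}/year")
print(f"Difference: ${after_annual - before_annual:,}/year")
```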

The total annual cost is slightly higher than before. But here's what you get for that difference: 7 autonomous agents that automate 3 FTE hours of daily analytical work — the specific workflows described above. At $75K/year per analyst, that's $94,000 in labor savings. Plus $28,800 in AI cost reduction from routing optimization.

Total annual savings: $122,800. Platform cost: $35,304. ROI: 3.5:1.
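For readers who want to rerun the ROI with their own inputs, here is a small sketch. The labor savings figure is taken as stated in this post rather than derived, and the payback calculation assumes savings accrue evenly across the year.

```python
# Annual figures as stated in this post (USD).
LABOR_SAVINGS = 94_000
ROUTING_SAVINGS = 28_800
PLATFORM_COST = 35_304

total_savings = LABOR_SAVINGS + ROUTING_SAVINGS
roi = total_savings / PLATFORM_COST

# Months of savings needed to cover a full year of platform cost,
# assuming savings are spread evenly over twelve months.
payback_months = PLATFORM_COST / (total_savings / 12)

print(f"Total annual savings: ${total_savings:,}")
print(f"ROI: {roi:.1f}:1")
print(f"Payback: {payback_months:.1f} months")
```

Swapping in a different labor figure changes the result quickly, so the ratio is most sensitive to how much analyst time the agents actually replace.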

  • $122,800 total annual savings (labor + AI cost reduction)
  • $35,304 annual platform cost (Pro + 7 agents)
  • 3.5:1 return on investment, payback in under 4 months

The payback period is under four months. And the ROI improves as Axiom's volume grows, because routing savings scale linearly with request volume while the platform cost stays fixed.

What Ad-Tech Teams Should Take Away

Axiom's story illustrates three things that apply to any team running AI at scale across multiple clouds:

Model routing is the lowest-hanging fruit for AI cost optimization. If you're using a premium model for every request, you're almost certainly overpaying by 30-50%. Most ad-tech AI workloads are a mix of simple tasks (classification, templated generation) and complex tasks (reasoning, analysis). Match the model to the task.

Agents compound the value of cloud AI routing. Routing saves money on the requests you're already making. Agents generate new value by automating work that humans are currently doing — bid analysis, audience segmentation, client reporting, competitive intelligence, contract review, sentiment monitoring. The combination — cheaper requests plus fewer manual hours — is where the real ROI comes from.

Visibility is a prerequisite for optimization. You can't optimize what you can't measure. Before Bonito, Axiom couldn't even tell you their total AI spend with confidence. After Bonito, they can break it down by provider, by model, by agent, by project. That visibility is what makes every subsequent optimization possible.
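Breaking spend down by dimension is, at its core, a group-by over tagged usage records. The sketch below shows the idea with made-up data: the record fields, the agent and project slugs, and every cost value are hypothetical placeholders, not Axiom's numbers or Bonito's schema.

```python
from collections import defaultdict

# Hypothetical usage records. Every gateway request is assumed to be tagged
# with provider, model, agent, and project. Costs are invented for illustration.
USAGE = [
    {"provider": "aws", "model": "nova-lite", "agent": "sentiment-monitor",
     "project": "client-intel", "cost_usd": 0.40},
    {"provider": "gcp", "model": "gemini-flash", "agent": "creative-director",
     "project": "campaign-ops", "cost_usd": 1.00},
    {"provider": "aws", "model": "nova-lite", "agent": "audience-analyst",
     "project": "campaign-ops", "cost_usd": 0.60},
    {"provider": "azure", "model": "gpt-4o-mini", "agent": "bid-optimizer",
     "project": "campaign-ops", "cost_usd": 1.20},
]


def spend_by(records: list, dimension: str) -> dict:
    """Sum cost_usd across records, grouped by the given tag."""
    totals = defaultdict(float)
    for record in records:
        totals[record[dimension]] += record["cost_usd"]
    return dict(totals)


for dimension in ("provider", "model", "agent", "project"):
    print(dimension, spend_by(USAGE, dimension))
```

The same records answer the client question from the start of this post: add a client or campaign tag to each request, and "how much AI went into my campaign?" becomes one more `spend_by` call.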

If your programmatic advertising platform is running AI across multiple clouds without unified routing and governance, you're paying the wrong-model tax on every request. Start with a free Bonito account and connect your first provider. The routing savings alone will make the case for everything else.
