Weekly Dose #3 - AI’s New Bottleneck Is Control

May 22, 2026

📰 The Weekly Dose

Welcome back to the Weekly Dose: your 5-minute breakdown of the AI/ML news that changed how we builders should think this week.

This third edition covers 15 May to 21 May 2026. No stale benchmark victory laps. No recycled stories from the last two issues. Just the five stories that affect how you build, deploy, secure, fund, or buy AI systems.

This week: Google turned multimodal generation, coding, and commerce into AI feedback loops; prompt injection got a realistic benchmark; Nvidia showed inference demand is still eating the economy; the EU turned high-risk AI classification into practical architecture work; and OpenAI made API tokens look like venture capital.

1. Google I/O made Omni the multimodal layer and Antigravity the feedback loop

Google used I/O 2026 to launch Gemini 3.5 Flash as the new default model in Gemini and AI Mode in Search, with Gemini 3.5 Pro following later.

Google Gemini@GeminiApp

Gemini 3.5 Flash is here and it's our best model yet for getting things done quickly and efficiently. Whether you need help with everyday tasks or multi-step creative projects, Gemini 3.5 Flash navigates real-world complexity to help you take action. #GoogleIO

5:24 PM · May 19, 2026 · 658K Views

The bigger product signal was Gemini Omni, a new model family that can generate video clips from prompts containing text, photos, video, and audio. The first release is Omni Flash, which is rolling out in the Gemini app, Google Flow, and YouTube Shorts. Google says Omni is moving toward “create anything from any input.”

Google DeepMind@GoogleDeepMind

We’re dropping Gemini Omni: our first step towards a model that can create anything from anything - starting with video. It combines Gemini’s intelligence with our generative media systems - representing a leap forward in world understanding, multimodality, and editing 🧵

5:17 PM · May 19, 2026 · 1.02M Views

That matters because native multimodality changes the architecture. Instead of stitching together transcription, image understanding, video generation, editing tools, and post-processing, the direction of travel is one model family that can reason across input types and produce media directly. Google is starting with video, but the strategic direction is broader: multimodal inputs, multimodal outputs, fewer brittle pipeline handoffs.

The second builder signal is Antigravity. Business Insider reported that Gemini 3.5 Flash is now the main model powering Google’s Antigravity AI coding service, giving Google real-world developer feedback before the delayed Pro model ships. Coding creates unusually clean model-improvement signals: tests pass, builds fail, tasks get abandoned, patches get accepted. That is better training fuel than “the chatbot felt helpful.”

Google Antigravity@antigravity

Introducing Antigravity 2.0, a new standalone desktop application that delivers fully on that original glimpse of a truly agent-optimized experience. Rebuilt from the ground up with multi-agent teams, scheduled tasks, native voice and one-click integration with other Google

5:52 PM · May 19, 2026 · 2.26M Views

Google also pushed Universal Cart, a cross-platform shopping layer across Search, Gemini, YouTube, and Gmail, with rollout planned in the U.S. in summer 2026 and launch partners including Nike, Sephora, Target, Ulta Beauty, Walmart, Wayfair, and some Shopify merchants. Gemini Spark is expected to connect into that shopping layer so agents can compare prices, check stock, create alerts, and eventually handle more of the buying journey.

Google@Google

We’re introducing Universal Cart — a new hub for shopping on Google. 🛍️ It will work across merchants and across services. 🛒 You’ll be able to add things to your cart whether you're shopping on Search, the @GeminiApp, @YouTube or @gmail. 🛠️ The moment you add a product to

6:05 PM · May 19, 2026 · 254K Views

🫵 Why it matters to you:
If you build with media, the old “text model plus wrappers” approach is starting to look temporary. If you build commerce, product metadata and inventory freshness are becoming agent-facing infrastructure. If you build developer tools, coding workflows are now feedback loops for model improvement.

🤫 The subtext nobody says out loud:
Google is not just launching features. It is trying to own the surfaces where high-quality multimodal and workflow feedback is created: video, code, shopping, search, and everyday productivity.

2. LivePI proved prompt injection is an engineering bottleneck

On 18 May, researchers released LivePI, a benchmark for indirect prompt injection in production-like agent environments. It tests agents across seven input surfaces, twelve attack/rendering families, and five malicious goals: protected-information exfiltration, unauthorized security-control changes, unsafe code execution, inbox-summary exfiltration, and cryptocurrency transfer.

Across GPT-5.3-Codex, Claude Opus 4.6, Gemini 3.1 Pro, Kimi K2.5, and GLM-5, attack success rates ranged from 10.7% to 29.6%. The strongest mitigation in the benchmark was not “better prompting.” It was a two-layer runtime defense: prompt-level filtering plus deterministic pre-execution authorization before tool calls.

🫵 Why it matters to you:
Agents that read email, repos, group chats, files, tickets, webpages, and wallets are constantly ingesting hostile text. Prompt injection is not a jailbreak gimmick. It is input validation for autonomous software.

🤫 The subtext nobody says out loud:
“Let the agent decide if this tool call is safe” is not a security model. That is wishful thinking with API keys attached.

🛠️ Practical takeaways:

Add approval gates before any agent can write files, change configs, send messages, access secrets, run code, move money, deploy changes, or transfer data.
Treat external text from email, Slack, GitHub, webpages, PDFs, and tickets as untrusted input.
Build workflow-level attack tests, not just prompt-level jailbreak tests.
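
To make "deterministic pre-execution authorization" concrete, here is a minimal Python sketch of a gate that sits between the model and its tools. It is not LivePI's defense or any vendor's API. The tool names, the three decision values, and the `tainted` flag (set when untrusted external text is in the context that produced the call) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical tool names and risk tiers, for illustration only.
READ_ONLY_TOOLS = {"read_file", "search_docs"}
HIGH_RISK_TOOLS = {
    "write_file", "change_config", "send_message", "read_secret",
    "run_code", "transfer_funds", "deploy", "export_data",
}

ALLOW, NEEDS_APPROVAL, DENY = "allow", "needs_approval", "deny"


@dataclass
class ToolCall:
    tool: str
    args: dict = field(default_factory=dict)
    # Assumption: the runtime sets this when untrusted external text
    # (email, web page, ticket) is in the context that produced the call.
    tainted: bool = False


def authorize(call: ToolCall) -> str:
    """Deterministic pre-execution check. No model is consulted here."""
    if call.tool in READ_ONLY_TOOLS:
        return ALLOW
    if call.tool in HIGH_RISK_TOOLS:
        # High-risk action driven by untrusted input: refuse outright.
        return DENY if call.tainted else NEEDS_APPROVAL
    return DENY  # default-deny anything not explicitly listed


def execute(call: ToolCall, tools: dict, approver=None):
    """Run the tool only if the gate (plus an optional human approver) allows it."""
    decision = authorize(call)
    if decision == NEEDS_APPROVAL and approver is not None and approver(call):
        decision = ALLOW
    if decision != ALLOW:
        raise PermissionError(f"{call.tool}: {decision}")
    return tools[call.tool](**call.args)
```

The point of the design is that the decision depends only on the tool name and the taint flag, so the same call always gets the same answer regardless of what the prompt says. A real policy would likely be stricter than this one, for example by also limiting read-only tools when the context is tainted.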

3. Nvidia showed the AI factory boom still has room to run

Nvidia reported $81.62 billion in quarterly revenue, up 85% year over year, with net income rising to $58.32 billion. The company forecast about $91 billion in revenue for the current quarter, announced an $80 billion stock buyback, and raised its quarterly dividend. Jensen Huang described the buildout of AI infrastructure as the “largest infrastructure expansion in human history.”

This is not just another “chips are hot” headline. It says the market is still treating inference, training, networking, systems, and data-center buildout as production infrastructure. If your AI product depends on model availability, rate limits, latency, batch jobs, or agentic workloads, you are downstream of this buildout.

🫵 Why it matters to you:
Your AI vendor’s infrastructure position affects your latency, uptime, rate limits, batch capacity, and pricing. Model quality only matters if there is enough compute behind it to serve real workloads.

🤫 The subtext nobody says out loud:
AI subscriptions are compute allocation plans dressed up as productivity software. The real product is not just the model; it is guaranteed access to inference capacity when your users need it.

4. The EU turned high-risk AI classification into product architecture

On 19 May, the European Commission published draft guidelines for classifying high-risk AI systems under Article 6 of the AI Act. The guidelines are aimed at providers, deployers, and market surveillance authorities, and include practical examples of systems that should or should not be classified as high-risk.

The Commission splits high-risk classification into two paths: AI that is a safety component of a product, or is itself a product, covered by the legislation listed in Annex I, and AI systems that fall into Annex III use cases. That means teams building in areas like hiring, education, healthcare, credit, biometrics, critical infrastructure, migration, or safety components need to treat classification as a design input, not a legal cleanup task.

🫵 Why it matters to you:
If your AI system falls into a high-risk category, the architecture needs to support evidence: logs, oversight, monitoring, documentation, risk controls, and traceability. You cannot bolt that on cleanly after launch.

🤫 The subtext nobody says out loud:
Compliance is becoming data architecture. The teams that generate audit evidence by default will move faster than teams trying to reconstruct it from Slack threads and dashboard screenshots.
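
One way to picture "audit evidence by default" is an append-only decision log where each record carries a hash of the one before it, so later edits are detectable. The Python sketch below is an illustration under stated assumptions, not a statement of what the AI Act or the draft guidelines require. The record fields (`ts`, `prev`, `event`, `hash`) and the example event contents are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys) so the same body always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_record(log: list, event: dict) -> dict:
    """Append an event (must be JSON-serializable) linked to the previous record."""
    body = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "prev": log[-1]["hash"] if log else GENESIS,
        "event": event,
    }
    record = {**body, "hash": _digest(body)}
    log.append(record)
    return record


def verify_chain(log: list) -> bool:
    """Return True only if every record is unmodified and correctly linked."""
    prev = GENESIS
    for rec in log:
        body = {"ts": rec["ts"], "prev": rec["prev"], "event": rec["event"]}
        if rec["prev"] != prev or rec["hash"] != _digest(body):
            return False
        prev = rec["hash"]
    return True


log = []
append_record(log, {"system": "cv-screener", "model": "v12", "decision": "escalate_to_human"})
append_record(log, {"system": "cv-screener", "reviewer": "analyst-7", "decision": "approved"})
print(verify_chain(log))  # True while the log is untouched
```

In production the log would live in durable storage rather than a Python list, but the habit is the same: write the evidence at decision time instead of reconstructing it later.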

5. OpenAI turned API tokens into startup capital

On 20 May, Business Insider reported that OpenAI is offering $2 million in API tokens to startups in Y Combinator’s current batch in exchange for equity. The pilot applies to YC’s spring and summer 2026 batches and uses an uncapped SAFE without a most-favored-nation clause.

Tyler Bosmeny@bosmeny

A mic drop moment @ycombinator tonight @sama just offered $2M in OpenAI tokens to EVERY YC startup in the current batch in exchange for equity Just like Yuri Milner offering to invest in every startup back when Sam was a YC partner I can't wait to see what's unlocked when you

1:46 AM · May 20, 2026 · 2.12M Views

This is a very AI-native twist on venture financing. Tokens are not just “cloud credits” when your startup’s product depends on long-context agents, coding loops, retrieval, multimodal generation, or inference-heavy user sessions. They are inventory.

🫵 Why it matters to you:
If your product burns tokens to serve users, token access shapes runway as much as cash. Your architecture, model routing, context strategy, and vendor dependency now directly affect financing strategy.

🤫 The subtext nobody says out loud:
The platform wants to be your investor, your cloud provider, and your infrastructure layer before you hit product-market fit. That can be useful. It can also make switching costs existential.
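
To see why tokens behave like inventory, it helps to run the arithmetic. The Python sketch below estimates how many months a token grant covers when monthly burn grows at a fixed rate. Only the $2 million figure comes from the reported deal. The $50,000 starting burn and the growth rates are made-up inputs, and the function name is hypothetical.

```python
def funded_months(credit_usd: float, first_month_burn_usd: float,
                  monthly_growth: float, max_months: int = 600) -> int:
    """Count the months fully covered by a token grant.

    Burn starts at first_month_burn_usd and compounds by monthly_growth
    (0.10 means 10% more spend each month). max_months caps the loop for
    shrinking-burn scenarios that would otherwise never run out.
    """
    if first_month_burn_usd <= 0:
        raise ValueError("first_month_burn_usd must be positive")
    remaining = credit_usd
    burn = first_month_burn_usd
    months = 0
    while remaining >= burn and months < max_months:
        remaining -= burn
        burn *= 1 + monthly_growth
        months += 1
    return months


# Illustrative inputs: a $2M grant against $50k of monthly inference spend.
print(funded_months(2_000_000, 50_000, 0.0))   # flat usage: 40 months
print(funded_months(2_000_000, 50_000, 0.10))  # 10% monthly growth: 16 months
```

Under these assumed numbers, modest usage growth cuts the grant's life by more than half, which is why context strategy and model routing end up in the financing conversation.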

💡 Our take

This week’s theme is simple: AI’s new bottleneck is control.

Google wants control of the multimodal, coding, shopping, and search surfaces that generate model feedback. LivePI shows why agents need control boundaries before they touch tools. Nvidia’s earnings show infrastructure control is still the economic center of AI. The EU is turning regulatory control into system design. OpenAI’s YC token deal shows financial control is starting to run through API access.

The model still matters. But the model is no longer the whole product.

The key signals from this week:

Native multimodality is becoming the default architecture. Gemini Omni points toward fewer stitched-together media pipelines and more end-to-end multimodal systems.
Workflow ownership is becoming model advantage. Coding, shopping, search, and productivity tools create feedback loops that benchmarks cannot.
Agent security is moving from prompt advice to runtime controls. Tool-call authorization, permissions, and logging are now core product features.
Compute access is still strategic infrastructure. Latency, limits, uptime, and capacity are part of the user experience.
Compliance needs to be designed into the system. High-risk classification affects architecture, telemetry, oversight, and evidence.
Token burn is becoming startup finance. For AI-native companies, inference cost is a capital-planning problem, not a miscellaneous line item.

The better question is no longer “which model is best?”

It is: who controls the workflow, the modality layer, the tool boundary, the compute, the evidence, and the unit economics?

📌 Your to-do list

Audit multimodal pipelines. Identify where separate transcription, image analysis, video generation, editing, and post-processing steps could be replaced by native multimodal models.
Test Gemini Omni-style workflows. Run small experiments on media tasks that currently require brittle chains: video edits, product demos, creative variants, training content, support explainers, and marketing assets.
Audit agent tool permissions. List every action your agents can take: file writes, code execution, message sending, payment, deployment, data export, config changes, and credential access.
Add deterministic authorization gates. Do not let the model be the final authority on whether a risky tool call is safe.
Benchmark vendors on operating limits. Track latency, rate limits, uptime, batch support, logging, regional availability, and cost per completed workflow.
Review product metadata for agentic commerce. If your product needs to be discovered, compared, or purchased by agents, clean inventory data, product attributes, compatibility fields, reviews, and availability feeds.
Classify EU-facing AI systems. Map your products against Annex I and Annex III risk categories before the compliance work becomes urgent.
Model token burn like COGS. Track cost per user action, cost per completed agent task, context-window waste, retry cost, and vendor concentration.
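
For the token-burn item above, a minimal Python sketch of "cost per completed agent task" that charges failed attempts and retries to the tasks that eventually succeed. The `Attempt` fields and the per-million-token prices are assumptions for illustration, not any vendor's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    task_id: str
    input_tokens: int
    output_tokens: int
    succeeded: bool


def cost_per_completed_task(attempts, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Total spend across all attempts divided by the number of distinct completed tasks."""
    total_usd = sum(
        a.input_tokens * usd_per_m_input + a.output_tokens * usd_per_m_output
        for a in attempts
    ) / 1_000_000
    completed = {a.task_id for a in attempts if a.succeeded}
    if not completed:
        raise ValueError("no completed tasks to divide by")
    return total_usd / len(completed)


# Hypothetical prices: $1 per million input tokens, $4 per million output tokens.
attempts = [
    Attempt("t1", 100_000, 10_000, succeeded=False),  # failed first try
    Attempt("t1", 100_000, 10_000, succeeded=True),   # retry that worked
    Attempt("t2", 50_000, 5_000, succeeded=True),
]
print(cost_per_completed_task(attempts, 1.0, 4.0))
```

Here total spend is $0.35 across two completed tasks, so about $0.175 each. A quarter of that spend is the retry on t1 ($0.14 of $0.56 would be spent if both tasks had needed two tries), which is the kind of waste a per-request average hides.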

See you next week.
