📰 The Weekly Dose
Welcome back to the Weekly Dose: your 5-minute breakdown of the AI/ML news that changed how us builders should think this week.
This third edition covers 15 May to 21 May 2026. No stale benchmark victory laps. No recycled stories from the last two issues. Just the five stories that affect how you build, deploy, secure, fund, or buy AI systems.
This week: Google turned multimodal generation, coding, and commerce into AI feedback loops; prompt injection got a realistic benchmark; Nvidia showed inference demand is still eating the economy; the EU turned high-risk AI classification into practical architecture work; and OpenAI made API tokens look like venture capital.
1. Google I/O made Omni the multimodal layer and Antigravity the feedback loop
Google used I/O 2026 to launch Gemini 3.5 Flash as the new default model in Gemini and AI Mode in Search, with Gemini 3.5 Pro following later.
The bigger product signal was that Google also introduced Gemini Omni, starting with Omni Flash, a new model family that can generate video clips from prompts containing text, photos, video, and audio. Google says Omni is moving toward “create anything from any input,” with Omni Flash rolling out in the Gemini app, Google Flow, and YouTube Shorts.
That matters because native multimodality changes the architecture. Instead of stitching together transcription, image understanding, video generation, editing tools, and post-processing, the direction of travel is one model family that can reason across input types and produce media directly. Google is starting with video, but the strategic direction is broader: multimodal inputs, multimodal outputs, fewer brittle pipeline handoffs.
The second builder signal is Antigravity. Business Insider reported that Gemini 3.5 Flash is now the main model powering Google’s Antigravity AI coding service, giving Google real-world developer feedback before the delayed Pro model ships. Coding creates unusually clean model-improvement signals: tests pass, builds fail, tasks get abandoned, patches get accepted. That is better training fuel than “the chatbot felt helpful.”
Google also pushed Universal Cart, a cross-platform shopping layer across Search, Gemini, YouTube, and Gmail, with rollout planned in the U.S. in summer 2026 and launch partners including Nike, Sephora, Target, Ulta Beauty, Walmart, Wayfair, and some Shopify merchants. Gemini Spark is expected to connect into that shopping layer so agents can compare prices, check stock, create alerts, and eventually handle more of the buying journey.
🫵 Why it matters to you:
If you build with media, the old “text model plus wrappers” approach is starting to look temporary. If you build commerce, product metadata and inventory freshness are becoming agent-facing infrastructure. If you build developer tools, coding workflows are now feedback loops for model improvement.
🤫 The subtext nobody says out loud:
Google is not just launching features. It is trying to own the surfaces where high-quality multimodal and workflow feedback is created: video, code, shopping, search, and everyday productivity.
2. LivePI proved prompt injection is an engineering bottleneck
On 18 May, researchers released LivePI, a benchmark for indirect prompt injection in production-like agent environments. It tests agents across seven input surfaces, twelve attack/rendering families, and five malicious goals, including protected-information exfiltration, unauthorized security-control changes, unsafe code execution, inbox-summary exfiltration, and cryptocurrency transfer.
Across GPT-5.3-Codex, Claude Opus 4.6, Gemini 3.1 Pro, Kimi K2.5, and GLM-5, attack success rates ranged from 10.7% to 29.6%. The strongest mitigation in the benchmark was not “better prompting.” It was a two-layer runtime defense: prompt-level filtering plus deterministic pre-execution authorization before tool calls.
🫵 Why it matters to you:
Agents that read email, repos, group chats, files, tickets, webpages, and wallets are constantly ingesting hostile text. Prompt injection is not a jailbreak gimmick. It is input validation for autonomous software.
🤫 The subtext nobody says out loud:
“Let the agent decide if this tool call is safe” is not a security model. That is wishful thinking with API keys attached.
🛠️ Practical takeaways:
Add approval gates before any agent can write files, change configs, send messages, access secrets, run code, move money, deploy changes, or transfer data.
Treat external text from email, Slack, GitHub, webpages, PDFs, and tickets as untrusted input.
Build workflow-level attack tests, not just prompt-level jailbreak tests.
3. Nvidia showed the AI factory boom still has room to run
Nvidia reported $81.62 billion in quarterly revenue, up 85% year over year, with net income rising to $58.32 billion. The company forecast about $91 billion in revenue for the current quarter, announced an $80 billion stock buyback, and raised its quarterly dividend. Jensen Huang described the buildout of AI infrastructure as the “largest infrastructure expansion in human history.”
This is not just another “chips are hot” headline. It says the market is still treating inference, training, networking, systems, and data-center buildout as production infrastructure. If your AI product depends on model availability, rate limits, latency, batch jobs, or agentic workloads, you are downstream of this buildout.
🫵 Why it matters to you:
Your AI vendor’s infrastructure position affects your latency, uptime, rate limits, batch capacity, and pricing. Model quality only matters if there is enough compute behind it to serve real workloads.
🤫 The subtext nobody says out loud:
AI subscriptions are compute allocation plans dressed up as productivity software. The real product is not just the model; it is guaranteed access to inference capacity when your users need it.
4. The EU turned high-risk AI classification into product architecture
On 19 May, the European Commission published draft guidelines for classifying high-risk AI systems under Article 6 of the AI Act. The guidelines are aimed at providers, deployers, and market surveillance authorities, and include practical examples of systems that should or should not be classified as high-risk.
The Commission splits high-risk classification into two paths: AI used as a safety component or regulated product under Annex I, and AI systems that fall into Annex III use cases. That means teams building in areas like hiring, education, healthcare, credit, biometrics, critical infrastructure, migration, or safety components need to treat classification as a design input, not a legal cleanup task.
🫵 Why it matters to you:
If your AI system falls into a high-risk category, the architecture needs to support evidence: logs, oversight, monitoring, documentation, risk controls, and traceability. You cannot bolt that on cleanly after launch.
🤫 The subtext nobody says out loud:
Compliance is becoming data architecture. The teams that generate audit evidence by default will move faster than teams trying to reconstruct it from Slack threads and dashboard screenshots.
5. OpenAI turned API tokens into startup capital
On 20 May, Business Insider reported that OpenAI is offering $2 million in API tokens to startups in Y Combinator’s current batch in exchange for equity. The pilot applies to YC’s spring and summer 2026 batches and uses an uncapped SAFE without a most-favored-nation clause.
This is a very AI-native twist on venture financing. Tokens are not just “cloud credits” when your startup’s product depends on long-context agents, coding loops, retrieval, multimodal generation, or inference-heavy user sessions. They are inventory.
🫵 Why it matters to you:
If your product burns tokens to serve users, token access shapes runway as much as cash. Your architecture, model routing, context strategy, and vendor dependency now directly affect financing strategy.
🤫 The subtext nobody says out loud:
The platform wants to be your investor, your cloud provider, and your infrastructure layer before you hit product-market fit. That can be useful. It can also make switching costs existential.
💡 Our take
This week’s theme is simple: AI’s new bottleneck is control.
Google wants control of the multimodal, coding, shopping, and search surfaces that generate model feedback. LivePI shows why agents need control boundaries before they touch tools. Nvidia’s earnings show infrastructure control is still the economic center of AI. The EU is turning regulatory control into system design. OpenAI’s YC token deal shows financial control is starting to run through API access.
The model still matters. But the model is no longer the whole product.
The key signals from this week:
Native multimodality is becoming the default architecture. Gemini Omni points toward fewer stitched-together media pipelines and more end-to-end multimodal systems.
Workflow ownership is becoming model advantage. Coding, shopping, search, and productivity tools create feedback loops that benchmarks cannot.
Agent security is moving from prompt advice to runtime controls. Tool-call authorization, permissions, and logging are now core product features.
Compute access is still strategic infrastructure. Latency, limits, uptime, and capacity are part of the user experience.
Compliance needs to be designed into the system. High-risk classification affects architecture, telemetry, oversight, and evidence.
Token burn is becoming startup finance. For AI-native companies, inference cost is a capital-planning problem, not a miscellaneous line item.
The better question is no longer “which model is best?”
It is: who controls the workflow, the modality layer, the tool boundary, the compute, the evidence, and the unit economics?
📌 Your to-do list
Audit multimodal pipelines. Identify where separate transcription, image analysis, video generation, editing, and post-processing steps could be replaced by native multimodal models.
Test Gemini Omni-style workflows. Run small experiments on media tasks that currently require brittle chains: video edits, product demos, creative variants, training content, support explainers, and marketing assets.
Audit agent tool permissions. List every action your agents can take: file writes, code execution, message sending, payment, deployment, data export, config changes, and credential access.
Add deterministic authorization gates. Do not let the model be the final authority on whether a risky tool call is safe.
Benchmark vendors on operating limits. Track latency, rate limits, uptime, batch support, logging, regional availability, and cost per completed workflow.
Review product metadata for agentic commerce. If your product needs to be discovered, compared, or purchased by agents, clean inventory data, product attributes, compatibility fields, reviews, and availability feeds.
Classify EU-facing AI systems. Map your products against Annex I and Annex III risk categories before the compliance work becomes urgent.
Model token burn like COGS. Track cost per user action, cost per completed agent task, context-window waste, retry cost, and vendor concentration.
See you next week.










