<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Machine Learning Pills: Weekly Dose]]></title><description><![CDATA[A curated injection of the week’s top AI and ML developments, filtered for relevance and distilled into actionable insights you can actually apply to your workflow.]]></description><link>https://mlpills.substack.com/s/weekly-dose</link><image><url>https://substackcdn.com/image/fetch/$s_!yCAU!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1b8efe1d-e165-4098-9fcc-b465f7286f50_1063x1063.png</url><title>Machine Learning Pills: Weekly Dose</title><link>https://mlpills.substack.com/s/weekly-dose</link></image><generator>Substack</generator><lastBuildDate>Sun, 17 May 2026 18:40:57 GMT</lastBuildDate><atom:link href="https://mlpills.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[MLPills]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[mlpills@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[mlpills@substack.com]]></itunes:email><itunes:name><![CDATA[David Andrés]]></itunes:name></itunes:owner><itunes:author><![CDATA[David Andrés]]></itunes:author><googleplay:owner><![CDATA[mlpills@substack.com]]></googleplay:owner><googleplay:email><![CDATA[mlpills@substack.com]]></googleplay:email><googleplay:author><![CDATA[David Andrés]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Weekly Dose #2 - The AI Race Moved From Models to Deployment]]></title><description><![CDATA[OpenAI turned enterprise deployment into a $4B+ services machine, Anthropic proved last week&#8217;s finance-agent push was 
the start of a vertical packaging machine, Codex and Claude Code turned pricing pages into competitive weapons, Google reframed Android as an agentic execution layer, and OpenAI Daybreak pushed AI security from &#8220;model capability&#8221; into a workflow product.]]></description><link>https://mlpills.substack.com/p/weekly-dose-2-the-ai-race-moved-from</link><guid isPermaLink="false">https://mlpills.substack.com/p/weekly-dose-2-the-ai-race-moved-from</guid><dc:creator><![CDATA[David Andrés]]></dc:creator><pubDate>Fri, 15 May 2026 06:31:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/a23812f0-93b5-4631-a362-333a72b59d54_1871x841.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>&#128240; The Weekly Dose</h1><p><em>Welcome back to the Weekly Dose: your 5-minute breakdown of the AI/ML news that changed how we builders should think this week.</em></p><p><em>This second edition covers <strong>7 May to 14 May 2026</strong>. No stale benchmark victory laps. No &#8220;agents will change everything&#8221; filler. Just the five stories that affect how you build, deploy, secure, or buy AI systems.</em></p><p><em><strong>This week</strong>: OpenAI turned enterprise deployment into a $4B+ services machine, Anthropic proved last week&#8217;s finance-agent push was the start of a vertical packaging machine, Codex and Claude Code turned pricing pages into competitive weapons, Google reframed Android as an agentic execution layer, and OpenAI Daybreak pushed AI security from &#8220;model capability&#8221; into a workflow product.</em></p><div class="callout-block" data-callout="true"><p style="text-align: center;">Interested in <strong>sponsoring</strong> this section (or any others)? 
<strong>Contact me</strong> here:</p><div class="directMessage button" data-attrs="{&quot;userId&quot;:38707812,&quot;userName&quot;:&quot;David Andr&#233;s&quot;,&quot;canDm&quot;:null,&quot;dmUpgradeOptions&quot;:null,&quot;isEditorNode&quot;:true}" data-component-name="DirectMessageToDOM"></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K9im!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K9im!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 424w, https://substackcdn.com/image/fetch/$s_!K9im!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 848w, https://substackcdn.com/image/fetch/$s_!K9im!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 1272w, https://substackcdn.com/image/fetch/$s_!K9im!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K9im!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png" width="1456" height="728" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:728,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1485378,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlpills.substack.com/i/197693852?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K9im!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 424w, https://substackcdn.com/image/fetch/$s_!K9im!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 848w, https://substackcdn.com/image/fetch/$s_!K9im!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 1272w, https://substackcdn.com/image/fetch/$s_!K9im!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6a891c0-f465-4385-8a11-b8b94063b38b_1774x887.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h2>1. OpenAI is absorbing the systems integrator layer</h2><p>OpenAI launched the <strong>OpenAI Deployment Company</strong> on <strong>May 11</strong>, a new majority-controlled enterprise deployment arm designed to help companies put OpenAI systems into real operations. The company starts with an initial <strong>$4B investment</strong>, backed by 19 financial and consulting partners including BBVA, TPG, Advent, Bain Capital, Brookfield, Goldman Sachs, McKinsey, and Capgemini. 
OpenAI is also acquiring <strong>Tomoro</strong>, an AI engineering and consulting firm with around <strong>150 specialists</strong>, to give the new company implementation capacity from day one.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/OpenAI/status/2053824997777457651?s=20&quot;,&quot;full_text&quot;:&quot;Today we&#8217;re launching the OpenAI Deployment Company to help businesses build and deploy AI.\n\nIt's majority-owned and controlled by OpenAI. It brings together 19 leading investment firms, consultancies, and system integrators to help organizations deploy frontier AI to production&quot;,&quot;username&quot;:&quot;OpenAI&quot;,&quot;name&quot;:&quot;OpenAI&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1885410181409820672/ztsaR0JW_normal.jpg&quot;,&quot;date&quot;:&quot;2026-05-11T13:10:12.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:663,&quot;retweet_count&quot;:1526,&quot;like_count&quot;:11359,&quot;impression_count&quot;:7766419,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>This is not another &#8220;enterprise AI platform&#8221; announcement. It is OpenAI moving directly into the deployment work that usually sits with systems integrators, consultancies, and internal transformation teams: workflow mapping, data access, security review, implementation, governance, and actually getting the thing used.</p><div class="callout-block" data-callout="true"><p>&#129781; <strong>Why it matters to you:</strong> If your AI roadmap depends on months of custom glue code, internal enablement, stakeholder wrangling, and integration work, your vendor landscape just changed. OpenAI is not only selling the model anymore. 
It is selling the implementation path.</p><p>&#129323; <strong>The subtext nobody says out loud:</strong> The frontier labs have discovered the least glamorous truth in enterprise software: distribution without deployment is just shelfware. The next fight is not only who has the smartest model. It is who can walk into a bank, insurer, retailer, or telco and make the model survive procurement, security, compliance, and daily workflow reality.</p></div><div><hr></div><h2>2. Anthropic turned last week&#8217;s finance-agent playbook into a vertical packaging machine</h2><p>Last week, Anthropic packaged finance workflows. This week, it proved that wasn&#8217;t a one-off.</p><p>On <strong>May 12</strong>, Anthropic expanded Claude for legal work, adding Claude Cowork tools and integrations for legal research, contracts, documents, case law, and practice-specific workflows. Reported integrations include tools such as CourtListener, Thomson Reuters Westlaw, Box, Harvey, and others, plus pre-built legal skills for areas such as employment, privacy, product law, legal clinics, and legal education.</p><p>Then on <strong>May 13</strong>, Anthropic launched <strong>Claude for Small Business</strong>, connecting Claude to tools like QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, and Microsoft 365. 
The package runs through Claude Cowork and includes built-in workflows across finance, sales, HR, marketing, operations, and customer service.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/evansantoslaw/status/2054631479628419129?s=20&quot;,&quot;full_text&quot;:&quot;<span class=\&quot;tweet-fake-link\&quot;>@AnthropicAI</span> has launched Claude for Small Business, with 15 ready-to-run workflows spanning QuickBooks, PayPal, HubSpot, Canva, DocuSign, Google Workspace, and Microsoft 365.\n\nThat could help level the playing field for smaller teams, especially with free AI fluency training&quot;,&quot;username&quot;:&quot;evansantoslaw&quot;,&quot;name&quot;:&quot;Evan Santos&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2042331008268120064/IZyyZu3w_normal.jpg&quot;,&quot;date&quot;:&quot;2026-05-13T18:34:52.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:4,&quot;retweet_count&quot;:11,&quot;like_count&quot;:70,&quot;impression_count&quot;:44723,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>The move is clear: Anthropic is turning Claude from a general-purpose assistant into a set of pre-wired business surfaces. Finance last week. Legal and small business this week. More verticals will follow.</p><div class="callout-block" data-callout="true"><p>&#129781; <strong>Why it matters to you:</strong> If you&#8217;re building internal AI tools for legal, accounting, HR, sales ops, compliance, or SMB workflows, &#8220;we wrapped a model around our docs&#8221; is no longer enough. The new baseline is model + connectors + permissions + approval gates + workflow templates.</p><p>&#129323; <strong>The subtext nobody says out loud:</strong> Anthropic is not just selling Claude. It is selling migration pain in reverse. 
Once Claude is wired into a team&#8217;s documents, legal research, CRM, invoicing, email, and approval flows, switching vendors becomes an integration problem, not a model-quality debate.</p></div><div><hr></div><h2>3. Codex and Claude Code made pricing pages the new battleground</h2><p>On <strong>May 13</strong>, OpenAI and Anthropic moved within hours of each other on AI coding tools. OpenAI offered eligible enterprise customers <strong>two months of free Codex usage</strong> for new users if they switch within 30 days.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/OpenAIDevs/status/2054586214112780518?s=20&quot;,&quot;full_text&quot;:&quot;Want to (officially) use Codex at work?\n\nSend this post to your CTO to bring your team to Codex. Eligible enterprise customers who switch in the next 30 days get 2 free months of Codex usage for new users. &quot;,&quot;username&quot;:&quot;OpenAIDevs&quot;,&quot;name&quot;:&quot;OpenAI Developers&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2022002720971096064/l3Kyt4qt_normal.jpg&quot;,&quot;date&quot;:&quot;2026-05-13T15:35:00.000Z&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/HINbRDoa4AAwH7N.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/38e8y7MAmg&quot;},{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/HINbRYSa0AAob-c.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/38e8y7MAmg&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:220,&quot;retweet_count&quot;:269,&quot;like_count&quot;:3813,&quot;impression_count&quot;:1026660,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>A few hours later, Anthropic increased <strong>Claude Code weekly usage limits by 50%</strong> for Pro, Max, Team, and seat-based Enterprise users through <strong>July 13</strong>.</p><div class="twitter-embed" 
data-attrs="{&quot;url&quot;:&quot;https://x.com/ClaudeDevs/status/2054639777685934564?s=20&quot;,&quot;full_text&quot;:&quot;Claude Code weekly limits are increasing 50%, now through July 13.\n\nLive now for all Pro, Max, Team, and seat-based Enterprise users. &quot;,&quot;username&quot;:&quot;ClaudeDevs&quot;,&quot;name&quot;:&quot;ClaudeDevs&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2044472418815893504/xf14RxM8_normal.png&quot;,&quot;date&quot;:&quot;2026-05-13T19:07:51.000Z&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://pbs.substack.com/media/HIOLKb7bsAAXtoD.jpg&quot;,&quot;link_url&quot;:&quot;https://t.co/5nU0XX4RZY&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:1130,&quot;retweet_count&quot;:1841,&quot;like_count&quot;:19729,&quot;impression_count&quot;:2124816,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>Axios also reported that Anthropic is putting some outside agent-tool usage behind a separate credit meter, highlighting a broader shift away from simple &#8220;all-you-can-eat&#8221; AI subscriptions as agents consume far more compute than normal chat usage.</p><p>This is not a feature war anymore. It is a retention war. Free usage windows, temporary limit boosts, separate credits, agent meters, and plan-specific caps are becoming the actual product surface developers feel every day.</p><div class="callout-block" data-callout="true"><p>&#129781; <strong>Why it matters to you:</strong> If you use Codex, Claude Code, Cursor, Devin, or similar tools for real engineering work, you should re-benchmark under the current limits. 
Measure completion rate, retry rate, wall-clock time, approval interruptions, and cost per finished task, not just &#8220;which one felt smarter in a demo.&#8221;</p><p>&#129323; <strong>The subtext nobody says out loud:</strong> The coding-agent fight is moving from capability demos to usage economics. Developers do not abandon tools only because the model makes mistakes. They abandon them when the agent runs out of runway halfway through a migration, refactor, or bug hunt.</p></div><div><hr></div><h2>4. Google turned Android into an agentic execution layer</h2><p>Google used its <strong>Android Show: I/O Edition</strong> on <strong>May 12</strong> to preview a major Android + Gemini Intelligence push. The upgrades include Gemini Intelligence across Android phones, Chrome auto-browse, smarter Autofill, AI-generated widgets, Gboard&#8217;s <strong>Rambler</strong> dictation cleanup, and deeper Android Auto features.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/Android/status/2054252834523115988?s=20&quot;,&quot;full_text&quot;:&quot;This was one of the biggest years for Android yet, and <span class=\&quot;tweet-fake-link\&quot;>#TheAndroidShow</span> was packed with new updates to make your everyday Android experience even better.\n\nHere are just a few&#8230; &#128071;&#129525;&quot;,&quot;username&quot;:&quot;Android&quot;,&quot;name&quot;:&quot;Android&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/2052036388401664000/e1gu5tkc_normal.jpg&quot;,&quot;date&quot;:&quot;2026-05-12T17:30:16.000Z&quot;,&quot;photos&quot;:[],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:59,&quot;retweet_count&quot;:77,&quot;like_count&quot;:1721,&quot;impression_count&quot;:190653,&quot;expanded_url&quot;:null,&quot;video_url&quot;:null,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>The important shift is that Gemini is being positioned less as a chatbot and more as an action layer. 
Google described examples like interacting directly with apps, turning lists into shopping baskets, ordering food, filling in complex forms, and queuing actions for final user confirmation. Chrome auto-browse is expected to bring similar automation to websites starting in late June.</p><p>Android Auto is also getting a broader AI and interface upgrade, including deeper Gemini integration, richer widgets, full-screen support for unusual car displays, and more immersive Google Maps navigation.</p><div class="callout-block" data-callout="true"><p>&#129781; <strong>Why it matters to you:</strong> If you build consumer apps, Android products, mobile commerce flows, or services that rely on users manually tapping through screens, the interface contract is changing. Your app needs clear intents, clean state, sensible permissions, and confirmation flows that an assistant can operate safely.</p><p>&#129323; <strong>The subtext nobody says out loud:</strong> Google does not need Gemini to win every benchmark if Gemini is already where the user acts. The model embedded in the phone, browser, keyboard, car, and laptop has a distribution advantage that a smarter model in a separate tab has to fight uphill to overcome.</p></div><div><hr></div><h2>5. OpenAI Daybreak made AI security a workflow product</h2><p>OpenAI launched <strong>Daybreak</strong> on <strong>May 11</strong> (US time) as a cybersecurity initiative focused on finding and fixing software vulnerabilities before attackers exploit them. Daybreak uses OpenAI models, Codex, Codex Security, and security partners to create threat models from code, identify likely attack paths, validate vulnerabilities, and automate detection of higher-risk issues.</p><p>This is OpenAI&#8217;s clearest answer to Anthropic&#8217;s Claude Mythos / Project Glasswing push. 
The Verge reports that Daybreak involves specialized cyber models including GPT-5.5 with Trusted Access for Cyber and GPT-5.5-Cyber, which began rolling out to vetted cyber defenders.</p><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/OpenAI/status/2053939702110269822?s=20&quot;,&quot;full_text&quot;:&quot;Introducing Daybreak: frontier AI for cyber defenders.\n\nDaybreak brings together the most capable OpenAI models, Codex, and our security partners to accelerate cyber defense and continuously secure software.\n\nA step toward a future where security teams can move at the speed &quot;,&quot;username&quot;:&quot;OpenAI&quot;,&quot;name&quot;:&quot;OpenAI&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1885410181409820672/ztsaR0JW_normal.jpg&quot;,&quot;date&quot;:&quot;2026-05-11T20:45:59.000Z&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://substackcdn.com/image/upload/w_1028,c_limit,q_auto:best/l_twitter_play_button_rvaygk,w_88/c8ql8yfntj3p6e2tazdr&quot;,&quot;link_url&quot;:&quot;https://t.co/AGfXhmJb5E&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:624,&quot;retweet_count&quot;:1142,&quot;like_count&quot;:11372,&quot;impression_count&quot;:5451838,&quot;expanded_url&quot;:null,&quot;video_url&quot;:&quot;https://video.twimg.com/amplify_video/2053933700275167232/vid/avc1/1108x720/1kPpYAbbx07qSSVd.mp4&quot;,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><p>The key difference: Daybreak is not just &#8220;a cyber model.&#8221; It is a workflow wrapper around the model: repo context, threat modeling, vulnerability validation, detection, and eventually patching. 
That makes it much more interesting for AppSec teams than another leaderboard screenshot.</p><div class="callout-block" data-callout="true"><p>&#129781; <strong>Why it matters to you:</strong> If you work in AppSec, detection engineering, platform engineering, or AI governance, you should start planning for AI systems that can inspect codebases, reason across attack paths, and propose mitigations. The blocker will not only be model quality. It will be access control, logging, sandboxing, patch review, and trust.</p><p>&#129323; <strong>The subtext nobody says out loud:</strong> Security vendors are about to be judged on whether they can close the loop. Finding issues is table stakes. The new question is: can the system understand exploitability, prioritize the real risk, generate a fix, validate it, and leave a clean audit trail?</p><p>&#128736;&#65039; <strong>Practical takeaways:</strong></p><ul><li><p>Define which repos an AI security agent should be allowed to inspect.</p></li><li><p>Decide what the agent can do automatically versus what requires human approval.</p></li><li><p>Require logs for code access, generated findings, attempted reproduction, and patch suggestions.</p></li><li><p>Keep patch generation separate from patch merge until your review process is mature.</p></li><li><p>Start building evals for vulnerability triage quality, not just code-generation quality.</p></li></ul></div><div><hr></div><p><em>So, what does all of this mean in practice? Here&#8217;s <strong>our take</strong>, and <strong>your to-do list</strong> for the week ahead.</em></p><h1>&#128161; Our take</h1><p>This week&#8217;s theme is simple: <strong>AI is moving from capability into distribution, packaging, and deployment.</strong></p><p>OpenAI is building the deployment muscle. Anthropic is packaging vertical workflows. Google is embedding Gemini where users already act. Codex and Claude Code are fighting over limits, credits, and retention. 
Daybreak turns cyber capability into an operational workflow.</p><p><em>The model is still important. But the model is no longer the product by itself.</em></p><p>The key signals from this week:</p><ul><li><p><strong>Deployment is becoming part of the AI product.</strong> The OpenAI Deployment Company is a bet that enterprise AI needs implementation capacity, not just APIs.</p></li><li><p><strong>Vertical workflow packaging is accelerating.</strong> Anthropic moved from finance to legal and small business in back-to-back weeks.</p></li><li><p><strong>Coding-agent competition is now commercial infrastructure.</strong> Limits, credits, promos, and metering shape whether developers can actually use the tools.</p></li><li><p><strong>Distribution is becoming agentic.</strong> Google is turning Android into a surface where Gemini can act, not just answer.</p></li><li><p><strong>AI security is becoming a managed workflow.</strong> Daybreak points to security agents that reason across repos, attack paths, detection, and fixes.</p></li></ul><p>The better question is no longer &#8220;which model is best?&#8221;</p><p>It is: <strong>which AI system owns the path from intent to completed work, with the integrations, controls, and deployment muscle to make it real?</strong></p><div><hr></div><h1>&#128204; Your to-do list</h1><ul><li><p><strong>Map where your AI projects still depend on manual deployment work.</strong> If the hard part is integration, governance, training, or adoption, compare your roadmap against vendor-led deployment options.</p></li><li><p><strong>Review your vertical workflows against packaged AI offerings.</strong> Legal, finance, SMB ops, HR, sales, and accounting are becoming vendor-shaped categories fast.</p></li><li><p><strong>Re-benchmark your coding-agent setup.</strong> Track task completion, retries, approvals, wall-clock time, and effective cost under current Codex and Claude Code limits.</p></li><li><p><strong>Audit pricing exposure for agentic 
workloads.</strong> Separate human chat usage from autonomous agent usage in your budget model. They do not scale the same way.</p></li><li><p><strong>Make your Android surfaces assistant-readable.</strong> Review intents, permissions, state handling, checkout flows, and confirmation steps before Gemini-style automation becomes the default UX.</p></li><li><p><strong>Write an AI security-agent access policy.</strong> Define repo access, sandboxing, logging, patch authority, human review, and incident response before tools like Daybreak become procurement conversations.</p></li><li><p><strong>Build security evals for triage, not just code generation.</strong> Measure whether AI security tools correctly assess exploitability and blast radius, keep false positives low, and produce sound patches.</p></li></ul><div><hr></div><p><em>See you next week.</em></p>]]></content:encoded></item><item><title><![CDATA[Weekly Dose #1 - AI’s Next Battlefield Isn’t Models. It’s Systems]]></title><description><![CDATA[This week: finance agents became enterprise products, ML supply-chain attacks escalated, OpenAI upgraded ChatGPT&#8217;s default model, Anthropic bought massive new compute capacity, and DeepSeek proved &#8220;cheap and capable&#8221; is becoming strategically dangerous.]]></description><link>https://mlpills.substack.com/p/weekly-dose-1-ais-next-battlefield</link><guid isPermaLink="false">https://mlpills.substack.com/p/weekly-dose-1-ais-next-battlefield</guid><dc:creator><![CDATA[David Andrés]]></dc:creator><pubDate>Fri, 08 May 2026 06:02:18 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/e4d3e695-234e-4369-89da-0aa789351382_1873x840.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>&#128240; The Weekly Dose</h1><p><em>Welcome to the Weekly Dose: your 5-minute breakdown of the AI/ML news that changed how we builders should think this week. 
</em></p><p><em>This <strong>first edition</strong> covers <strong>30 April to 7 May 2026</strong>. No stale benchmark victory laps. No &#8220;this might be big someday&#8221; filler. Just the five stories that affect how you build, deploy, secure, or buy AI systems.</em></p><p><em>This week: finance agents became enterprise products, ML supply-chain attacks escalated, OpenAI upgraded ChatGPT&#8217;s default model, Anthropic bought massive new compute capacity, and DeepSeek proved &#8220;cheap and capable&#8221; is becoming strategically dangerous.</em></p><div class="callout-block" data-callout="true"><h4 style="text-align: center;">Issue sponsored by <strong><a href="https://tracebloc.io?utm_source=newsletter&amp;utm_medium=email&amp;utm_campaign=mlpills">tracebloc</a></strong></h4><p>Most of us have a <strong>dataset we can&#8217;t share</strong>, a complex problem, and someone outside the team who could probably help. There&#8217;s no good way to bridge the two. Access takes months, if it happens at all.</p><p><strong><a href="https://tracebloc.io?utm_source=newsletter&amp;utm_medium=email&amp;utm_campaign=mlpills">tracebloc</a> </strong>is the tool for that. For <strong>making confidential data accessible</strong> to collaborators, universities, freelancers, startups, or consultants. 
For sharing complex problems, building and innovating together.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://tracebloc.io?utm_source=newsletter&amp;utm_medium=email&amp;utm_campaign=mlpills" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qXrZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 424w, https://substackcdn.com/image/fetch/$s_!qXrZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 848w, https://substackcdn.com/image/fetch/$s_!qXrZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 1272w, https://substackcdn.com/image/fetch/$s_!qXrZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qXrZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png" width="1456" height="624" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2399836,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://tracebloc.io?utm_source=newsletter&amp;utm_medium=email&amp;utm_campaign=mlpills&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlpills.substack.com/i/196752504?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qXrZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 424w, https://substackcdn.com/image/fetch/$s_!qXrZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 848w, https://substackcdn.com/image/fetch/$s_!qXrZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 1272w, https://substackcdn.com/image/fetch/$s_!qXrZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c0b2faa-b244-4387-960c-7ca1f9db8b1c_1915x821.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>You set up your own ML workspace on your infra with one line of code. Invite people by email, they train and fine-tune models on your data inside containers. <strong>Data stays in your infra.</strong> You see a leaderboard with how each contributor performs on your problem.</p><p><strong>It&#8217;s free. 
</strong>Live in minutes.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://tracebloc.io?utm_source=newsletter&amp;utm_medium=email&amp;utm_campaign=mlpills&quot;,&quot;text&quot;:&quot;&#128279; See how it works&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://tracebloc.io?utm_source=newsletter&amp;utm_medium=email&amp;utm_campaign=mlpills"><span>&#128279; See how it works</span></a></p></div><div><hr></div><h2>1. OpenAI just made voice agents a serious engineering surface</h2><p>On <strong>7 May</strong>, OpenAI announced three new realtime audio models through its API. GPT&#8209;Realtime&#8209;2 is the first voice model with GPT&#8209;5&#8209;class reasoning, built to handle harder requests and carry conversations forward naturally. GPT&#8209;Realtime&#8209;Translate translates live speech from 70+ input languages into 13 output languages while keeping pace with the speaker. GPT&#8209;Realtime&#8209;Whisper is a streaming transcription model that converts speech to text while the person is still talking.</p><p>The gap over the previous generation is measurable: GPT&#8209;Realtime&#8209;2 with high reasoning scored 96.6% on Big Bench Audio, compared to 81.4% for GPT&#8209;Realtime&#8209;1.5. On Audio MultiChallenge instruction following, the xhigh reasoning tier scored 48.5% versus 34.7% for the prior model. New developer-facing features include preambles (&#8220;let me check that&#8221; before a tool call), parallel tool calls mid-conversation, and a context window expanded from 32K to 128K tokens. On pricing, GPT&#8209;Realtime&#8209;2 costs $32 per 1M audio input tokens and $64 per 1M audio output tokens. GPT&#8209;Realtime&#8209;Translate runs at $0.034 per minute and GPT&#8209;Realtime&#8209;Whisper at $0.017 per minute. 
</p><div class="callout-block" data-callout="true"><p>&#129781; <strong>Why it matters to you:</strong> If your product has a voice layer (customer support, accessibility, field agents, meeting transcription), the capability bar just moved. Real GPT-5-class reasoning running natively in a voice model, with parallel tool calls and 128K context, is a different product category than what shipped twelve months ago. The &#8220;voice is too unreliable for production&#8221; objection is getting harder to make.</p><p>&#129323; <strong>The subtext nobody says out loud:</strong> Live translation from 70+ input languages, priced per minute, is the end of &#8220;we&#8217;ll add multilingual support later.&#8221; It&#8217;s also a quiet play for every call centre, clinic, and government service that currently employs human interpreters for routine interactions. OpenAI isn&#8217;t just selling a voice API; it&#8217;s repricing a labour category.</p></div><div class="twitter-embed" data-attrs="{&quot;url&quot;:&quot;https://x.com/OpenAI/status/2052438194625593804&quot;,&quot;full_text&quot;:&quot;Introducing GPT-Realtime-2 in the API: our most intelligent voice model yet, bringing GPT-5-class reasoning to voice agents.\n\nVoice agents are now real-time collaborators that can listen, reason, and solve complex problems as conversations unfold.\n\nNow available in the API 
&quot;,&quot;username&quot;:&quot;OpenAI&quot;,&quot;name&quot;:&quot;OpenAI&quot;,&quot;profile_image_url&quot;:&quot;https://pbs.substack.com/profile_images/1885410181409820672/ztsaR0JW_normal.jpg&quot;,&quot;date&quot;:&quot;2026-05-07T17:19:32.000Z&quot;,&quot;photos&quot;:[{&quot;img_url&quot;:&quot;https://substackcdn.com/image/upload/w_1028,c_limit,q_auto:best/l_twitter_play_button_rvaygk,w_88/fnn5qfvkfojtceysllfa&quot;,&quot;link_url&quot;:&quot;https://t.co/2DY1LU2vO8&quot;}],&quot;quoted_tweet&quot;:{},&quot;reply_count&quot;:387,&quot;retweet_count&quot;:599,&quot;like_count&quot;:7452,&quot;impression_count&quot;:756081,&quot;expanded_url&quot;:null,&quot;video_url&quot;:&quot;https://video.twimg.com/amplify_video/2052435360064643072/vid/avc1/1280x720/aMPSeXgmBHgLW3UZ.mp4&quot;,&quot;belowTheFold&quot;:true}" data-component-name="Twitter2ToDOM"></div><h2>2. Finance agents just became the new enterprise AI battleground</h2><p>On <strong>5 May</strong>, Anthropic released <strong>10 ready-to-run financial services agent templates</strong> covering work such as pitchbook creation, KYC screening, earnings review, model building, market research, valuation, general-ledger reconciliation, month-end close, and statement auditing. They ship as Claude Cowork and Claude Code plugins, with cookbooks for Claude Managed Agents.</p><p>What matters isn&#8217;t just the agent count. It&#8217;s the full production stack around them: governed data connectors, credential vaults, permissions, audit logs, and human review checkpoints. These are the boring pieces teams usually spend months building themselves. 
Claude now works across Excel, PowerPoint, and Word (Outlook coming soon), with new connectors for Dun &amp; Bradstreet, IBISWorld, Verisk, and a Moody&#8217;s MCP app covering 600M+ companies.</p><div id="youtube2-foxeK2AXfHQ" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;foxeK2AXfHQ&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/foxeK2AXfHQ?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>OpenAI moved on the same front, announcing a PwC collaboration on finance agents for planning, forecasting, reporting, and accounting close. Anthropic also announced a new enterprise AI services company with Blackstone, Hellman &amp; Friedman, and Goldman Sachs to help mid-sized businesses deploy Claude in real operations.</p><div class="callout-block" data-callout="true"><p><strong>&#129781; Why it matters to you:</strong> If your team is maintaining an internal RAG pipeline for KYC, invoice reconciliation, analyst research, or compliance work, the question is no longer just &#8220;can we build this?&#8221; It&#8217;s &#8220;are we out-engineering a vendor that already ships with the connectors, audit trail, model, and implementation team?&#8221;</p><p><strong>&#129323; The subtext nobody says out loud:</strong> The moat isn&#8217;t the model anymore. It&#8217;s distribution, connectors, workflow templates, and forward-deployed engineers. The AI industry&#8217;s answer to &#8220;agents don&#8217;t work in regulated industries&#8221; turns out to be more integrations and more humans around the model. Slightly less sci-fi. 
Much more useful.</p></div><div><hr></div><p style="text-align: center;"><em>Let us know what you think about this new section in the <strong>comments</strong>.</em></p><p style="text-align: center;"><em>Don&#8217;t forget to <strong>like</strong> and <strong>share</strong> this with your contacts. </em></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://mlpills.substack.com/p/weekly-dose-1-ais-next-battlefield/comments&quot;,&quot;text&quot;:&quot;Leave a comment&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://mlpills.substack.com/p/weekly-dose-1-ais-next-battlefield/comments"><span>Leave a comment</span></a></p><p style="text-align: center;"><em><strong>Thank you</strong> </em>&#128155;</p><div><hr></div><h2>3. Mini Shai-Hulud hit the ML supply chain</h2><p>On <strong>30 April</strong>, two malicious versions of the <code>lightning</code> PyPI package, <strong>2.6.2</strong> and <strong>2.6.3</strong>, were published with credential-stealing code. The critical detail: the payload runs on import, not just on install. The compromised versions shipped a hidden <code>_runtime</code> directory that downloaded the Bun JavaScript runtime and executed a roughly <strong>11 MB obfuscated credential stealer</strong>. The last clean release is <strong>2.6.1</strong>.</p><p>An AI security scanner flagged both versions <strong>18 minutes after publication</strong>. The same campaign hit <code>intercom-client@7.0.4</code> on npm that same day, using credentials from a compromised developer account. The payload targeted cloud keys, GitHub tokens, SSH keys, and environment variables via a <code>preinstall</code> hook.</p><div class="callout-block" data-callout="true"><p><strong>&#129781; Why it matters to you:</strong> ML dependencies are now premium targets. They sit next to cloud credentials, model weights, notebooks, experiment trackers, and CI/CD pipelines. 
A compromised training dependency can have a larger blast radius than a compromised web library because training environments are usually more privileged and far less locked down.</p><p><strong>&#129323; The subtext nobody says out loud:</strong> &#8220;Just pip install the thing&#8221; is now a security decision. The AI stack has inherited npm&#8217;s supply-chain problems, except now the packages live next to AWS keys, Hugging Face tokens, and private model artifacts.</p><p><strong>&#128736;&#65039; Practical takeaways:</strong></p><ul><li><p><strong>Block and audit </strong><code>lightning==2.6.2</code><strong> and </strong><code>lightning==2.6.3</code><strong>.</strong> Pin to <code>2.6.1</code> until you&#8217;ve verified a clean later release.</p></li><li><p><strong>Treat any environment that imported those versions as compromised.</strong> Rotate cloud keys, GitHub PATs, npm tokens, and SSH keys.</p></li><li><p><strong>Audit lockfiles and CI logs.</strong> Look for <code>lightning</code> 2.6.2/2.6.3 and <code>intercom-client</code> 7.0.4 from 30 April onward.</p></li><li><p><strong>Review </strong><code>.github/workflows/</code><strong> files</strong> added or changed on or after 30 April.</p></li><li><p><strong>Move CI to short-lived OIDC tokens.</strong> Long-lived credentials are exactly what import-time payloads are hunting.</p></li><li><p><strong>Harden high-risk accounts.</strong> Passkeys, hardware keys, shorter sessions, and tighter recovery are basic hygiene for anyone with access to production AI systems.</p></li></ul></div><div><hr></div><h2>4. GPT-5.5 Instant became ChatGPT&#8217;s new default</h2><p>OpenAI rolled out <strong>GPT-5.5 Instant</strong> as ChatGPT&#8217;s default model on <strong>5 May</strong>, replacing GPT-5.3 Instant for all users. 
The headline claim: <strong>52.5% fewer hallucinated claims</strong> on high-stakes prompts in medicine, law, and finance, and <strong>37.3% fewer inaccurate claims</strong> on conversations users had already flagged for errors. It also handles images better, answers STEM questions more reliably, and makes smarter decisions about when to use web search.</p><p>The same release introduced <strong>memory sources</strong>: visible context that shows users which saved memories or past conversations are shaping a response, with options to delete or correct them. GPT-5.5 Instant is also available in the API as <code>chat-latest</code>.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://openai.com/index/gpt-5-5-instant/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mA0-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 424w, https://substackcdn.com/image/fetch/$s_!mA0-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 848w, https://substackcdn.com/image/fetch/$s_!mA0-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 1272w, https://substackcdn.com/image/fetch/$s_!mA0-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!mA0-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png" width="619" height="213.9105461393597" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:367,&quot;width&quot;:1062,&quot;resizeWidth&quot;:619,&quot;bytes&quot;:42188,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://openai.com/index/gpt-5-5-instant/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlpills.substack.com/i/196752504?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mA0-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 424w, https://substackcdn.com/image/fetch/$s_!mA0-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 848w, https://substackcdn.com/image/fetch/$s_!mA0-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 1272w, 
https://substackcdn.com/image/fetch/$s_!mA0-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11dfa528-ad75-4c69-820e-fead6bde4e70_1062x367.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div class="callout-block" data-callout="true"><p><strong>&#129781; Why it matters to you:</strong> The default model matters more than most benchmark launches. It shapes what non-experts use, what your coworkers paste into workflows, and what everyone considers &#8220;normal&#8221; AI quality. If you maintain internal assistants, support bots, or research prompts, re-test old failure cases: hallucination workarounds and &#8220;always search first&#8221; hacks may no longer be necessary.</p><p><strong>&#129323; The subtext nobody says out loud:</strong> The frontier race is loud. The default-model race is where habits form. OpenAI doesn&#8217;t need users to know which model they&#8217;re on. It just needs the default to feel good enough that they stop shopping around.</p></div><div><hr></div><h2>5. Anthropic bought more Claude capacity from SpaceX</h2><p>On <strong>6 May</strong>, Anthropic announced a compute deal with SpaceX that secures access to all capacity at the <strong>Colossus 1</strong> data center. The deal adds over <strong>300 MW</strong> and more than <strong>220,000 NVIDIA GPUs</strong>, available within the month.</p><p>Anthropic is using the headroom immediately: Claude Code&#8217;s five-hour rate limits are doubling for Pro, Max, Team, and Enterprise plans. Peak-hour throttling is gone for Pro and Max users. 
Claude Opus API rate limits are going up too.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://x.ai/news/anthropic-compute-partnership" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JdwO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 424w, https://substackcdn.com/image/fetch/$s_!JdwO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 848w, https://substackcdn.com/image/fetch/$s_!JdwO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 1272w, https://substackcdn.com/image/fetch/$s_!JdwO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JdwO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png" width="536" height="361.2300981461287" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:618,&quot;width&quot;:917,&quot;resizeWidth&quot;:536,&quot;bytes&quot;:97988,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://x.ai/news/anthropic-compute-partnership&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlpills.substack.com/i/196752504?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JdwO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 424w, https://substackcdn.com/image/fetch/$s_!JdwO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 848w, https://substackcdn.com/image/fetch/$s_!JdwO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 1272w, https://substackcdn.com/image/fetch/$s_!JdwO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1987b3ca-1f2a-4f32-be73-53201028b38a_917x618.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><div class="callout-block" data-callout="true"><p><strong>&#129781; Why it matters to you:</strong> If you use Claude Code or Opus for serious engineering or long-running agentic work, capacity is part of the product. Higher limits change whether a tool is &#8220;useful occasionally&#8221; or &#8220;viable as a daily workhorse.&#8221;</p><p><strong>&#129323; The subtext nobody says out loud:</strong> AI subscriptions are becoming compute allocation plans dressed as productivity tools. The next pricing war may not be $20 vs $30 per seat; it may be about who gives your agents enough uninterrupted GPU time to actually finish the job.</p></div><div><hr></div><h2>6. 
DeepSeek got a reality check: strong, cheap, not frontier</h2><p>On <strong>1 May</strong>, NIST&#8217;s Center for AI Standards and Innovation published its evaluation of <strong>DeepSeek V4 Pro</strong>. The headline finding: DeepSeek V4 is the most capable PRC AI model CAISI has evaluated, but it lags leading U.S. frontier models by roughly <strong>8 months</strong>. CAISI also found that DeepSeek&#8217;s own reported benchmarks put V4 closer to Opus 4.6 and GPT-5.4, while CAISI&#8217;s independent evaluations place it nearer to GPT-5.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.nist.gov/news-events/news/2026/05/caisi-evaluation-deepseek-v4-pro" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aP3i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 424w, https://substackcdn.com/image/fetch/$s_!aP3i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 848w, https://substackcdn.com/image/fetch/$s_!aP3i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 1272w, https://substackcdn.com/image/fetch/$s_!aP3i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!aP3i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png" width="655" height="434.1714285714286" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1400,&quot;resizeWidth&quot;:655,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Comparison of aggregate capabilities over time of the most capable publicly released U.S. and PRC models according to a suite of benchmarks covering five domains.&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://www.nist.gov/news-events/news/2026/05/caisi-evaluation-deepseek-v4-pro&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Comparison of aggregate capabilities over time of the most capable publicly released U.S. and PRC models according to a suite of benchmarks covering five domains." title="Comparison of aggregate capabilities over time of the most capable publicly released U.S. and PRC models according to a suite of benchmarks covering five domains." 
srcset="https://substackcdn.com/image/fetch/$s_!aP3i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 424w, https://substackcdn.com/image/fetch/$s_!aP3i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 848w, https://substackcdn.com/image/fetch/$s_!aP3i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 1272w, https://substackcdn.com/image/fetch/$s_!aP3i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d4183a8-a9d5-42a1-a446-043e2e1d3aa9_1400x928.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The more useful finding is cost. Compared with GPT-5.4 mini, DeepSeek V4 was cheaper on <strong>5 of 7</strong> comparable benchmarks, with costs ranging from <strong>53% less expensive to 41% more expensive</strong> depending on the task. Separately, DeepSeek is reportedly in advanced funding talks at around a <strong>$50B valuation</strong>, signalling that the market still sees real strategic value in capable open-weight models even when they trail the frontier.</p><div class="callout-block" data-callout="true"><p><strong>&#129781; Why it matters to you:</strong> Don&#8217;t blindly swap frontier APIs for open-weight models. Don&#8217;t ignore them either. The boring-but-correct move: build an eval set from your own prompts, measure quality, latency, refusal behaviour, tool use, and cost, then route workloads based on results.</p><p><strong>&#129323; The subtext nobody says out loud:</strong> Open models don&#8217;t need to be best-in-class to pressure proprietary moats. They only need to be good enough for a large chunk of production workloads. The expensive frontier models hold the prestige tier. 
Cheaper open models eat the workhorse layer.</p></div><div><hr></div><h1>&#128161; Our take</h1><p>Two of this week&#8217;s stories look unrelated but share the same logic.</p><p>Anthropic and OpenAI are selling finance agents complete with templates, connectors, governance, and implementation help. 
The Lightning compromise is attacking the same ecosystem from the other side: as AI infrastructure becomes more concentrated and more privileged, a single bad dependency can reach a lot of valuable systems very quickly.</p><p>The pattern is simple: <strong>higher leverage means faster wins and a faster blast radius.</strong></p><p>The key signals from this week:</p><ul><li><p><strong>Domain-specific agent templates are becoming the default shape of enterprise AI.</strong> Generic agent platforms are giving way to packaged workflows with connectors, audit logs, permissions, and human review steps built in.</p></li><li><p><strong>Supply-chain security is now core MLOps.</strong> ML packages are privileged infrastructure, not harmless notebook helpers.</p></li><li><p><strong>Default models matter more than launch hype.</strong> GPT-5.5 Instant shifting the ChatGPT baseline affects more daily users than any frontier benchmark post.</p></li><li><p><strong>Compute is still the product bottleneck.</strong> Anthropic&#8217;s SpaceX deal is a feature release powered by 220,000 GPUs.</p></li><li><p><strong>Open-weight models are a persistent cost-pressure machine.</strong> DeepSeek may not be frontier, but &#8220;cheap and good enough&#8221; is a very dangerous position to compete against.</p></li></ul><p>The big question to ask isn&#8217;t &#8220;which model is best?&#8221; It&#8217;s <strong>&#8220;which system gives us the best mix of quality, cost, control, security, and time to production?&#8221;</strong></p><p>That answer is getting more situational every week.</p><div><hr></div><h1>&#128204; Your to-do list</h1><ol><li><p><strong>Voice-agent build-vs-buy review.</strong> Add voice channels to your top internal workflows (support, sales, operations, compliance). 
Check whether OpenAI&#8217;s new Realtime models, combined with partner templates, now shortcut custom development.</p></li><li><p><strong>Audit your lockfiles now.</strong> Search for <code>lightning==2.6.2</code>, <code>lightning==2.6.3</code>, and <code>intercom-client@7.0.4</code>. Treat affected environments as compromised, not merely outdated.</p></li><li><p><strong>Move CI secrets to short-lived OIDC tokens.</strong> Long-lived cloud keys are exactly what import-time and install-time payloads are hunting.</p></li><li><p><strong>Harden high-risk AI accounts.</strong> Use passkeys or hardware keys where possible. Tighten recovery. Review active sessions, especially for accounts with Codex, cloud, repo, or production access.</p></li><li><p><strong>Run a build-vs-buy review on your top three internal agent workflows.</strong> Focus on finance, compliance, reconciliation, research, procurement, and reporting. If a vendor now ships with the connectors you spent months building, you have a decision to make.</p></li><li><p><strong>Re-test your ChatGPT workflows on GPT-5.5 Instant.</strong> Old verbosity constraints, hallucination workarounds, and &#8220;always search first&#8221; patterns may no longer be needed.</p></li><li><p><strong>Benchmark DeepSeek V4 on real workloads, not vibes.</strong> Use your own eval set. Route easy or cost-sensitive tasks to cheaper models where they pass quality thresholds. Keep frontier models for high-stakes or high-ambiguity work.</p></li></ol><div><hr></div><p><em>See you next week.</em></p>]]></content:encoded></item></channel></rss>