ComputeLeap AI Briefing

Daily deep dives into AI agents, tools, and engineering. Each episode unpacks one article from ComputeLeap.com, with two AI hosts exploring the implications, tradeoffs, and what it means for builders.

Episodes

Episode

Anthropic and the AI Agent Shell War: Cowork, Agent View, and the Open Stack

21:18

bcherny's Cowork-on-Opus-4.7 demo finally booked 8 flights end-to-end. The same week Anthropic shipped Claude Code Agent View. We argue Anthropic is contesting both the engine layer (Claude Code) AND the shell layer (Cowork + Agent View) — and the open stack (agentmemory, Voker, react-doctor, cc-switch, mattpocock/skills) is racing to fill the same surface. The 2014 container-infra playbook is running again at AI speed.

Read the article →

Episode

When Students Boo and VCs Cheer: AI's Cultural Split

20:21

33,096 upvotes booed AI-as-industrial-revolution rhetoric the same week Andreessen pitched a Golden Age. We unpack the 900x engagement gap between mainstream Reddit and Hacker News, the Gallup data showing Gen Z anger climbing from 22% to 31% YoY, the AI-attributed layoff acceleration, and what framings actually win on consumer-facing copy in May 2026.

Read the article →

Episode

Local AI Just Became the Default: Gemma 4 + omlx on M4

22:39

Gemma 4 31B is the new local baseline on M4 24GB. omlx ships LLM inference as a menu-bar app. We walk through why "local-as-default" stopped being a science experiment in May 2026, the omlx KV-cache architecture (RAM + SSD tiered, continuous batching, drop-in OpenAI/Anthropic APIs), and the substrate shift that puts Apple Silicon ahead of the GPU stack.

Read the article →

Episode

The Agent Judge Layer: Validation Becomes Infrastructure

21:42

Lindy, JP Morgan, and OpenAI all shipped a separate judge layer for production agents in Q2 2026. When three orgs in unrelated verticals land on the same architecture, it's a category, not a fad. We unpack the actor-judge split, the replay-vs-snapshot durability debate from Trigger.dev's Eric Allam, and how react-doctor + agentmemory fit into the same runtime-validation shape — plus the 90-day prediction.

Read the article →

Episode

Sovereign Compute, Sovereign Army: The 2026 Through-Line

20:23

Jack Clark (Anthropic) floated "Radical Optionality": government builds compute as state capacity rather than outsourcing it. Same week: Spain calls for an EU army; Netanyahu phases out US military aid. We read the cross-asset signal as saying that sovereign capacity is the 2026 through-line, with a prediction-market hook on whether Anthropic publicly endorses Pentagon procurement in 60 days.

Read the article →

Episode

DeepSeek-TUI Setup Guide: The Rust Coding Agent on V4 Flash

21:30

DeepSeek-TUI hit 5,787 GitHub stars in a single day on May 7. We walk through the install (npm, cargo, brew, Docker), V4 Flash configuration, the three execution modes (Plan, Agent, YOLO), the four errors that account for most first-week issues, and the decision matrix for when to pick this harness vs Claude Code or cc-switch.

Read the article →

Episode

Vectorless RAG: PageIndex vs Embedding-Based Retrieval — When to Switch

22:40

VectifyAI/PageIndex picked up 953 stars in a single day on May 7 with a six-word repo description. The pitch is structural: no embeddings, no chunking, no vector DB — tree search instead. We cover what vectorless actually means, what you give up vs embedding RAG, and the workload-axis decision framework that picks one over the other.

Read the article →

Episode

Build Your Own Agentic OS: Phone, Pi, or MacBook in 2026

22:02

Three Claude Code stacks compared — phone via web UI, Raspberry Pi headless, MacBook power-user. Simon Willison ships from his iPhone while camping; the Pi tier runs 24/7 for $120. We walk through who each tier is for, where the rate-limit cliffs live, and how to layer them.

Read the article →

Episode

GStack: Garry Tan's Claude Code Setup That Turns One Developer Into a Team

21:42

GStack is Garry Tan's open-source Claude Code harness — 23 specialist skills that give you a CEO, Eng Manager, Designer, QA, and Security Auditor in one paste. We cover how it works, the 810x productivity claim, the 'just prompts' criticism, and when to use GStack vs oh-my-openagent.

Read the article →

Episode

DeepSeek V4 vs GPT-5.5 vs Claude Opus 4.7: The Developer's Model Comparison Guide

19:40

DeepSeek V4 dropped today with 1M context at 1/6th the cost of Claude and GPT-5.5. We break down benchmarks, cost math, and give you a routing framework for each model.

Read the article →

Episode 13

Google's $40B Anthropic Bet: What It Means for Developers

36:31

Google's $40B Anthropic investment loops back as Google Cloud spend — a circular finance deal that guarantees 5 gigawatts of TPU compute. Two AI hosts break down the deal structure, what Claude Mythos signals about the next model tier, and how to position your Claude app for the capacity wave.

April 26, 2026•Read the article →

Episode 12

Meta's Real Story: The Surveillance, Not the Layoffs

18:50

Meta cut 10%, Microsoft bought out 7%, Block gutted 40%. Two AI hosts argue the real story isn't the layoff wave; it's Meta's MCI program, which began installing keystroke and screenshot surveillance on remaining employees two days before the layoffs. The thesis: every Fortune 500 company will pilot a version of this within 18 months.

April 25, 2026•Read the article →

Episode 11

Shannon AI: The $50 Autonomous Hacker That Actually Breaks Into Your App

22:23

Shannon is an open-source AI pentesting agent that autonomously tests web apps for vulnerabilities — and only reports what it can prove with a working exploit. Two AI hosts break down how Shannon works, its 96.15% XBOW benchmark score, the economics vs traditional pentesting, and why the Bitwarden supply chain attack is exactly what Shannon can't catch.

April 24, 2026•Read the article →

Episode 10

GPT-5.5 vs Claude Code: Which AI Should You Use?

19:03

GPT-5.5 launched with agentic-first positioning. We benchmark it head-to-head against Claude Code across solo dev, team, and enterprise setups — covering benchmarks, the pricing drama, and the three use cases where each tool wins.

April 24, 2026•Read the article →

Episode 9

Claude Code Agentic Stack: cc-switch & claude-context MCP

20:26

The 2026 agentic developer stack is three layers: Claude Code as the execution engine, cc-switch as the multi-provider CLI manager, and claude-context MCP as the semantic code search layer. Two AI hosts walk through the full setup — cc-switch 50+ presets, claude-context's AST-aware search, and the local proxy failover that keeps your workflow running when one provider rate-limits.

April 23, 2026•Read the article →

Episode 8

Self-Evolving AI Agents: deer-flow, evolver, and GenericAgent Compared

22:44

Three self-evolving agent frameworks hit GitHub's global top 10 simultaneously. Two AI hosts break down deer-flow (ByteDance's 62.8k-star SuperAgent harness), evolver's Genome Evolution Protocol, and GenericAgent's 6x token efficiency — with a deep dive on the security risks no one mentions and a decision matrix for production deployments.

April 20, 2026•Read the article →

Episode 7

The New OpenAI Agents Python SDK

20:27

OpenAI's openai-agents-python hit #2 on GitHub trending with 22,981 stars. Two AI hosts break down the hands-on details: handoffs vs agent-as-tool patterns, guardrails with tripwire mode, MCP integration, and a full 3-agent Researcher→Writer→Reviewer pipeline. Plus an honest cost comparison vs Claude SDK for long-horizon sessions.

April 20, 2026•Read the article →

Episode 6

AI Is Fleeing San Francisco for Space

18:50

After the Molotov attack on Altman and the 'Luigi-ing' CEO rhetoric, the NotebookLM hosts unpack the three vectors pushing frontier AI's center of gravity East: violent US anti-AI sentiment, Hormuz energy fragility (Iran re-closed the strait the day we published), and the 83%-vs-39% sentiment gap with China. Plus the diffusion-not-exodus thesis — Texas, Tennessee, Abu Dhabi, and the orbital-compute hedge where the US lead is widening.

April 19, 2026•Read the article →

Episode 5

How Claude Design Tanked Figma Stock

18:52

Anthropic Labs shipped Claude Design — an AI UI tool with a 'Handoff to Claude Code' button that passes generated designs directly into agentic workflows. HN #2 story at 423 points. Figma stock dropped 15%. Two AI hosts break down what the tool actually does, why the handoff integration changes the design-to-deployment stack, and what the 'homogeneity of the modern web' debate reveals about Claude Design's ceiling.

April 18, 2026•Read the article →

Episode 3

Claude Managed Agents Remove the Infrastructure Bottleneck

20:46

Anthropic launched Claude Managed Agents on April 8, 2026 — eliminating the scaffolding layer that consumed 60-80% of agent dev time. Two AI hosts break down what infrastructure actually gets replaced, the $0.08/session-hour pricing, and whether Notion, Rakuten, and Asana's early bets pay off.

April 12, 2026•Read the article →

Episode 4

The 14 Billion Dollar Muse Spark Pivot

21:19

Meta abandoned open weights with Muse Spark — their first closed frontier model. The Artificial Analysis Intelligence Index puts it at 52, behind only Gemini 3.1 Pro and GPT-5.4. Two AI hosts break down whether the 16-tool agent suite justifies routing production workloads to a closed model with no open API.

April 12, 2026•Read the article →

Episode 2

How Anthropic Toppled the OpenAI Empire

20:19

The Anthropic vs OpenAI rivalry reached a tipping point in March 2026. Polymarket gives Anthropic 100% odds for best model. ChatGPT share collapsed from 69% to 45%. Claude Code hit $2.5B ARR in 9 months. Two AI hosts break down the Pentagon dominos, Dario's WSJ bombshell, and what could still go wrong.

March 30, 2026•Read the article →

Episode 1

The Hidden Cost of 'Cheap' AI: Why Budget Reasoning Models Cost 6x More

23:12

Stanford and CMU researchers reveal that budget AI reasoning models actually cost 6x more than premium models when you factor in hidden thinking tokens. Two AI hosts break down the paper, the math, and what it means for anyone running LLM workloads in production.

March 29, 2026•Read the article →