Two Minds, One Model

By: John Jezl and Jon Rocha

About this podcast

Two Minds, One Model is a podcast dedicated to exploring topics in Machine Learning and Artificial Intelligence. Hosted by John Jezl and Jon Rocha, and recorded at Sonoma State University.
Episodes
  • Two Minds, Lower Trust
    May 5 2026

    Why orchestrate multiple AI agents when a single strong model is so capable? Jon walks through three distinct rationales — capability, parallel context, and trust — and uses Anthropic's Claude Mythos Preview and Project Glasswing as the live, industrial-scale case study.

    Credits

    Cover Art by Brianna Williams

    TMOM Intro Music by Danny Meza

    A special thank you to these talented artists for their contributions to the show.

    Links and References

    • Stanford 2026 AI Index Report: https://hai.stanford.edu/ai-index/2026-ai-index-report

    • Claude Opus 4.7 announcement: https://www.anthropic.com/news/claude-opus-4-7

    • Project Glasswing announcement: https://www.anthropic.com/glasswing

    • Claude Mythos Preview — Frontier Red Team write-up: https://red.anthropic.com/2026/mythos-preview/

    • Claude Mythos Preview — Alignment Risk Update: https://anthropic.com/claude-mythos-preview-risk-report

    • Andon Labs Vending-Bench (the eval Jon describes): https://andonlabs.com/evals/vending-bench

    • Mixture-of-Agents (Wang et al., June 2024): https://arxiv.org/abs/2406.04692

    • Self-MoA / "Rethinking Mixture-of-Agents" (Lee et al., Feb 2025): https://arxiv.org (search by title)

    • AI Control: Improving Safety Despite Intentional Subversion (Greenblatt et al., Dec 2023, Redwood Research): https://arxiv.org/abs/2312.06942

    • Anthropic multi-agent research system blog: https://www.anthropic.com/engineering/built-multi-agent-research-system

    • MAGDI — distilling multi-agent debate (Chen et al., early 2024): https://arxiv.org/abs/2402.01620

    • MACA — Multi-Agent Consensus Alignment (Sept 2025): https://arxiv.org (search by title)

    • Agent Arc — distilling multi-agent intelligence into a single LLM agent (Feb 2026): https://arxiv.org (search by title)

    • Condorcet Jury Theorem (1785): https://plato.stanford.edu/entries/jury-theorems/
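    The Condorcet Jury Theorem linked above is the classical argument for why a majority vote of independent, better-than-chance voters (or, by analogy, AI agents) beats any single voter. A standalone numerical sketch, not taken from the episode:

    ```python
    import math

    def majority_correct_prob(p: float, n: int) -> float:
        """Probability that a majority of n independent voters, each correct
        with probability p, reaches the correct answer (n must be odd)."""
        assert n % 2 == 1, "use an odd number of voters to avoid ties"
        return sum(
            math.comb(n, k) * p**k * (1 - p) ** (n - k)
            for k in range((n // 2) + 1, n + 1)
        )

    # With p > 0.5, the ensemble beats any single voter and approaches
    # certainty as n grows -- the heart of Condorcet's 1785 result.
    print(majority_correct_prob(0.6, 1))    # 0.6
    print(majority_correct_prob(0.6, 11))   # ~0.75
    print(majority_correct_prob(0.6, 101))  # ~0.98
    ```

    The theorem's independence assumption is exactly what breaks down when agents share a base model, which is why the multi-agent-vs-single-model question the episode raises is not settled by the arithmetic alone.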

    Abandoned Episode Titles

    How to Build God and Then Email Yourself About It from the Park

    Four PhDs and a Guy Who Thinks the Colosseum Invented Pasta

    Mythos Cleaned Its Git History So You Wouldn't Have To

    OpenBSD Spent 27 Years Hardening the Wrong Things


    53 min
  • Agent Architecture: A Look Under the Hood
    Apr 14 2026

    This episode deconstructs how production AI agents are actually built, introducing a six-component architecture framework (system prompt, model, tools, memory, orchestration loop, and execution environment) and comparing how Claude Code, Codex, OpenClaw, and Manus make fundamentally different trade-offs around local vs. cloud execution, autonomy vs. human oversight, and open source vs. commercial control. The hosts examine why coding agents matured first, why general-purpose agents face the unsolved "lethal trifecta" of security risks, and where the industry is converging on universal patterns while still making divergent bets.
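    The six components named above can be sketched as a plain data structure. The field names, defaults, and the two example agents are illustrative assumptions for this sketch, not any product's actual configuration schema:

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class AgentConfig:
        """Illustrative container for the six components the episode names."""
        system_prompt: str
        model: str                               # hosted or local LLM
        tools: list[str] = field(default_factory=list)  # callable capabilities
        memory: str = "in-context"               # vs. an external store
        orchestration: str = "single-loop"       # plan/act/observe policy
        execution_env: str = "local-sandbox"     # local vs. cloud trade-off

    # Two hypothetical agents making opposite trade-offs:
    coding_agent = AgentConfig(
        system_prompt="You are a cautious coding assistant.",
        model="local-oss-model",
        tools=["shell", "file_edit"],
        execution_env="local-sandbox",           # human oversight, local run
    )
    research_agent = AgentConfig(
        system_prompt="You are an autonomous research agent.",
        model="hosted-frontier-model",
        tools=["browser", "code_interpreter"],
        orchestration="multi-step-planner",
        execution_env="cloud-vm",                # autonomy, cloud execution
    )
    print(coding_agent.execution_env, research_agent.execution_env)
    ```

    Framing the trade-offs as values of the same six fields is one way to make the episode's comparison across Claude Code, Codex, OpenClaw, and Manus concrete.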

    Credits

    Cover Art by Brianna Williams

    TMOM Intro Music by Danny Meza

    A special thank you to these talented artists for their contributions to the show.

    Links and References

    • Meta Muse Spark announcement: https://ai.meta.com/blog/introducing-muse-spark-msl/

    • Anthropic Project Glasswing / Claude Mythos: https://www.anthropic.com/glasswing

    • Anthropic Mythos Preview technical details: https://red.anthropic.com/2026/mythos-preview/

    • Google TurboQuant (ICLR 2026): https://research.google/blog/turboquant-redefining-ai-efficiency-with-extreme-compression/

    • Let's Verify Step by Step (Lightman et al.): https://arxiv.org/abs/2305.20050

    • METR Time Horizons: https://metr.org/time-horizons/

    • METR: Measuring AI Ability to Complete Long Tasks: https://arxiv.org/abs/2503.14499

    • Simon Willison's blog: https://simonwillison.net/

    • Simon Willison: The Lethal Trifecta: https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/

    • OpenClaw (GitHub): https://github.com/OpenClaw/OpenClaw

    • Peter Steinberger: OpenClaw, OpenAI and the future: https://steipete.me/posts/2026/openclaw

    • Manus joins Meta: https://manus.im/blog/manus-joins-meta-for-next-era-of-innovation

    • CrowdStrike on Mythos / Project Glasswing: https://www.crowdstrike.com/en-us/blog/crowdstrike-founding-member-anthropic-mythos-frontier-model-to-secure-ai/

    • Model Context Protocol (MCP): https://modelcontextprotocol.io/

    • Stuart Russell, Human Compatible (2019): https://www.penguinrandomhouse.com/books/566677/human-compatible-by-stuart-russell/

    Abandoned Episode Titles

    My Torn Hoodie Is Perfectly Fine, Thank You Very Much

    ChatGPT Bought This Outfit for Me

    The Lobster, the Sandbox, and the Wardrobe

    It's Agents All the Way Down


    1 hr 1 min
  • When the Scaffold Moves Inside
    Apr 9 2026

    This episode traces AI reasoning from human-designed external scaffolding (process reward models, test-time compute scaling) to internally emergent capability, culminating in DeepSeek R1's finding that a model rewarded only for correctness spontaneously learns to reason, self-correct, and backtrack without any explicit instruction to do so.
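    The contrast the episode draws, between step-level process rewards and the outcome-only reward R1 was trained on, can be sketched minimally. The aggregation choices below are illustrative assumptions, not DeepSeek's or OpenAI's actual implementations:

    ```python
    def outcome_reward(model_answer: str, reference: str) -> float:
        """Outcome-only reward: 1.0 if the final answer matches the
        reference, 0.0 otherwise. No step-level supervision is given; the
        R1 report's finding is that reasoning behaviours such as
        self-correction emerge from optimizing this signal alone."""
        return 1.0 if model_answer.strip() == reference.strip() else 0.0

    def process_reward(step_scores: list[float]) -> float:
        """PRM-style aggregate over per-step scores. Taking the minimum
        (one common choice) means a single bad step sinks the trace."""
        return min(step_scores) if step_scores else 0.0

    print(outcome_reward("42", " 42 "))     # 1.0
    print(process_reward([0.9, 0.8, 0.2]))  # 0.2
    ```

    The process reward needs the 800K step-level labels of something like PRM800K; the outcome reward needs only a checkable final answer, which is what makes the R1 result striking.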

    Credits

    Cover Art by Brianna Williams

    TMOM Intro Music by Danny Meza

    A special thank you to these talented artists for their contributions to the show.

    Links and References

    • US appeals court fined lawyers for fake AI-generated citations: https://www.sixthcircuitappellateblog.com/recent-cases/sixth-circuit-sanctions-attorneys-for-fake-citations-what-does-this-mean-for-use-of-ai/ and https://www.jdsupra.com/legalnews/the-ai-sanction-wave-145k-in-q1-1240943/

    • Krafton CEO used ChatGPT to try to nullify a $250M contract: https://legaltalknetwork.com/podcasts/heels-in-the-courtroom/2026/04/ep-1006-when-clients-use-ai-the-new-risks-to-privilege-and-discovery/

    • "Let's Verify Step by Step" (Lightman et al.): https://arxiv.org/abs/2305.20050

    • PRM800K dataset — 800,000 step-level human feedback labels, open-sourced: https://github.com/openai/prm800k

    • Snell et al. on test-time compute scaling (Aug 2024): https://arxiv.org/abs/2408.03314

    • "Chinchilla optimal" — paper on optimal scaling of parameters vs. data: https://arxiv.org/pdf/2203.15556

    • LangChain documented convergence in the Open SWE framework: https://blog.langchain.com/open-swe-an-open-source-framework-for-internal-coding-agents/

    • Thinking, Fast and Slow by Daniel Kahneman: https://psycnet.apa.org/record/2011-26535-000

    • T3 Code — Theo's Claude Code harness replacement: https://www.youtube.com/watch?v=-7akxGb-lAM

    • DeepSeek R1 technical report (January 2025): https://arxiv.org/abs/2501.12948

    • Uncanny Valley concept (Mori, 1970): https://web.ics.purdue.edu/~drkelly/MoriTheUncannyValley1970.pdf

    Abandoned Episode Titles

    The Episode That Definitely Didn't Anthropomorphize Anything

    Pump Harder: A Metaphor That Should Have Died But Absolutely Didn't

    "Wait, Wait, Wait, Don’t Tell Me"

    The One Where the Math Problem Checks Its Own Work and We All Get a Little Creeped Out

    50 min