Why validity beats scale when building multi‑step AI systems
About this audio content
In this episode, Dr. Sebastian (Seb) Benthall joins us to discuss his and Andrew's paper, “Validity Is What You Need,” and what it takes to build agentic AI that actually works in the real world.
Our discussion connects systems engineering, mechanism design, and requirements engineering to multi‑step AI that delivers measurable outcomes in the enterprise.
- Defining agentic AI beyond LLM hype
- Limits of scale and the need for multi‑step control
- Tool use, compounding errors, and guardrails
- Systems engineering patterns for AI reliability
- Principal–agent framing for governance
- Mechanism design for multi‑stakeholder alignment
- Requirements engineering as the crux of validity
- Hybrid stacks: LLM interface, deterministic solvers
- Regression testing through model swaps and drift
- Moving from universal copilots to fit‑for‑purpose agents
You can also catch more of Seb's research on our podcast. Tune in to “Contextual integrity and differential privacy: Theory versus application.”
What did you think? Let us know.
Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
- LinkedIn - Episode summaries, shares of cited articles, and more.
- YouTube - Was it something that we said? Good. Share your favorite quotes.
- Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.