Episodes

  • David Aronchick on Distributed Data Orchestration with Expanso
    Jun 15 2026
    In this episode of Alexa's Input (AI), I sit down with David Aronchick, co-founder and CEO of Expanso and former product lead for Kubernetes at Google.

    Data is growing everywhere outside your data center. Solar panels in remote locations across a country. Security cameras at retail stores. IoT sensors across factory floors. And moving that data to the cloud for processing? It's expensive, slow, and often restricted by compliance.

    David is an expert when it comes to solving distribution problems. He led Kubernetes product at Google, co-founded Kubeflow to bring ML to production, and now he's building Expanso to tackle a difficult constraint: when your data can't move, how do you process it where it lives?

    We discuss:

    • The need for distributed data orchestration
    • Upstream data control: filtering and transforming at the source
    • Three forces making edge computing inevitable (physics, regulations, economics)
    • How to build successful open source infrastructure projects
    • Customer discovery and finding real pain points
    • His transition from Protocol Labs to founding Expanso
    • ETL pipelines: moving the first four steps closer to the data
    • Context loss and lineage in distributed systems
    • Processing 400,000 signals per second with 150MB agents
    • AI observability: attaching source metadata to training data
    • Running ML pipelines at the edge
    • Real-world deployment challenges (bandwidth, regulations, cost)

    Expanso is rethinking how we process data in an AI-native world, moving compute to data instead of data to compute. If you want to understand where distributed systems and edge computing are heading, this is a deep dive into the infrastructure layer beneath modern AI applications.

    General Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith
    Read: https://alexasinput.substack.com/
    Listen: https://creators.spotify.com/pod/profile/alexagriffith/
    More: https://linktr.ee/alexagriffith

    Learn more about the host at

    Website: https://alexagriffith.com/
    LinkedIn: https://www.linkedin.com/in/alexa-griffith/

    Find out more about the guest at

    LinkedIn: https://www.linkedin.com/in/aronchick/
    Twitter/X: https://x.com/aronchick
    GitHub: https://github.com/aronchick
    Expanso Website: https://expanso.io/

    Resources

    Expanso Website: https://expanso.io/
    Kubernetes: https://kubernetes.io/
    Kubeflow: https://www.kubeflow.org/
    CNCF (Cloud Native Computing Foundation): https://www.cncf.io/
    Protocol Labs: https://protocol.ai/

    Keywords

    David Aronchick, Expanso, Kubernetes, Kubeflow, distributed systems, edge computing, data pipelines, ETL, upstream data control, Google Kubernetes Engine, open source, CNCF, observability, log processing, data lineage, provenance, schema enforcement, IoT, edge AI, distributed data, machine learning infrastructure, Protocol Labs, IPFS, Filecoin, data governance, compliance, GDPR, bandwidth optimization, data aggregation, AI infrastructure, multi-cloud, hybrid cloud, real-time processing
    1 hr 18 min
  • How vLLM and llm-d Changed AI Inference with Rob Shaw
    Jun 3 2026
    In this episode of Alexa’s Input (AI), I sat down with Rob Shaw from Red Hat to talk about how AI inference evolved from a simple model serving problem into a large-scale distributed systems problem.

    We explored the infrastructure shifts behind modern LLM serving, including how vLLM and PagedAttention changed the economics and efficiency of inference, why KV cache management became one of the most important bottlenecks in production AI systems, and how orchestration layers like llm-d are emerging to coordinate distributed inference.

    We also discuss:

    • how LLM inference differs from traditional model serving runtimes
    • KV cache, prefix caching, and cache-aware routing
    • why throughput and latency became major infrastructure challenges
    • long-context agents and repeated inference calls
    • distributed inference on Kubernetes
    • intelligent routing, flow control, and load balancing
    • prefill/decode disaggregation
    • enterprise AI deployment realities

    vLLM has become one of the most important open-source projects in AI infrastructure, and llm-d represents a newer shift toward treating inference as a coordinated distributed system rather than just a single runtime problem.

    If you want to better understand the systems layer beneath modern AI applications, this episode is a deep dive into where inference infrastructure is heading next.

    General Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith
    Read: https://alexasinput.substack.com/
    Listen: https://creators.spotify.com/pod/profile/alexagriffith/
    More: https://linktr.ee/alexagriffith

    Learn more about the host at

    Website: https://alexagriffith.com/
    LinkedIn: https://www.linkedin.com/in/alexa-griffith/

    Find out more about the guest at:

    LinkedIn: https://www.linkedin.com/in/robert-shaw-1a01399a/
    Red Hat Articles: https://developers.redhat.com/author/robert-shaw
    GitHub: https://github.com/robertgshaw2-redhat

    Resources

    vLLM Website: https://vllm.ai/
    vLLM GitHub Repository: https://github.com/vllm-project/vllm
    llm-d Website: https://llm-d.ai/
    llm-d GitHub Repository: https://github.com/llm-d/llm-d

    Keywords

    AI inference, vLLM, llm-d, distributed inference, GPU optimization, open source AI, Kubernetes, multi-cluster deployment, AI infrastructure, enterprise AI, model optimization, speculative decoding, mixture of experts, AI deployment, performance tuning, AI systems, neural network scaling

    Key Topics

    • Evolution of vLLM and llm-d
    • Distributed inference and routing
    • GPU utilization and performance optimization
    • Open source AI infrastructure
    • Enterprise deployment challenges and solutions
    • Standardization in Kubernetes for NIC exposure
    • Performance optimizations: quantization and speculative decoding
    • Mixture of experts architecture and parallelism strategies
    • Flow control and request scheduling in AI systems
    • Emerging hardware for AI inference, Cerebras processor
    • Reinforcement learning and AI system support
    • Modular architecture of vLLM and ecosystem projects
    1 hr 43 min
  • Intelligence Per Watt with Emilio Andere
    May 24 2026

    On this episode of Alexa’s Input (AI), I sit down with Emilio Andere, co-founder and CEO of Wafer, to talk about the future of AI infrastructure, inference optimization, and the economics driving the AI compute race.

    We discuss:

    • why “intelligence per watt” may become one of the defining metrics of the AI era
    • the current GPU and accelerator landscape across NVIDIA, AMD, TPUs, and emerging hardware startups
    • why software optimization is becoming just as important as hardware itself
    • inference optimization strategies
    • why AI infrastructure companies are racing up the stack
    • what it’s actually like building an AI infrastructure startup today

    and more!

    Emilio also shares lessons from founding Wafer, thoughts on the future of open-source AI infrastructure, and why he believes optimizing intelligence itself could become one of the most important engineering problems.


    General Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith

    Read: https://alexasinput.substack.com/

    Listen: https://creators.spotify.com/pod/profile/alexagriffith/

    More: https://linktr.ee/alexagriffith


    Learn more about the host at

    Website: https://alexagriffith.com/

    LinkedIn: https://www.linkedin.com/in/alexa-griffith/


    Find out more about the guest at:

    LinkedIn: https://www.linkedin.com/in/emi-andere/

    Wafer Website: https://www.wafer.ai/

    Wafer AI / Y Combinator Article: https://www.ycombinator.com/companies/wafer


    Chapters

    00:00 Exploring AI Conversations and Recent Podcasts

    02:14 Intelligence per Watt: A New Metric for AI

    07:35 The Manifesto: Efficiency in Civilization

    12:40 Founding Wafer: The Journey Begins

    18:08 The GPU Hardware Landscape and Market Dynamics

    23:07 AMD's Growing Presence in the GPU Market

    24:07 Emerging Competitors in the AI Hardware Space

    26:04 Comparing TPUs and GPUs

    27:21 Acquisition and Availability of TPUs

    28:33 Navigating the GPU Marketplace

    30:05 Understanding Neo Cloud Economics

    33:30 The AI Bubble Debate

    36:25 Optimizing AI Models for Performance

    44:46 Bottlenecks in AI Model Performance

    48:08 Future Directions in AI Hardware Optimization

    54:39 Balancing Speed and Cost in AI Performance

    56:54 Kernel Arena: Benchmarking AI Performance

    01:03:45 Lessons from Founding: Sales and Emotional Resilience

    01:07:38 The Future of AI: Trends and Predictions

    01:13:03 Outro


    Keywords

    AI hardware, inference optimization, intelligence per watt, GPU market, AI infrastructure, Wafer, AI bubble, TPU, GPU bottleneck, AI efficiency, AI optimization, large language models, quantization, speculative decoding, benchmarking, model training, AI startups





    1 hr 14 min
  • Building Reliable Systems at Bloomberg with Sal Furino
    May 17 2026
    In this episode of Alexa’s Input (AI), I sit down with Sal Furino to explore the hidden engineering work that keeps modern systems reliable.

    We break down what Service Level Objectives and Indicators (SLOs/SLIs) and error budgets actually mean in practice, why reliability is as much a cultural problem as a technical one, and how teams can better measure real user experience instead of just infrastructure health.

    Sal also explains reliability engineering and the challenges of reliability at scale, like:

    • Why latency and correctness become harder to measure with GenAI
    • The difference between a bad incident and a fundamentally bad system
    • How observability and telemetry shape modern engineering organizations
    • Why most teams focus too much on infrastructure metrics and not enough on user happiness
    • Why “the best systems are the ones nobody notices”

    If you work in AI infrastructure, distributed systems, platform engineering, observability, or SRE, this episode is a must-listen!

    SREcon Talk, “Dashboards & Dragons: Reliability Magic for AI Platforms” by Alexa Griffith and Sal Furino: https://youtu.be/aWMB_7ksbkc?si=S49nPyAl_hCUIH7y

    General Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith
    Read: https://alexasinput.substack.com/
    Listen: https://creators.spotify.com/pod/profile/alexagriffith/
    More: https://linktr.ee/alexagriffith

    Learn more about the host at

    Website: https://alexagriffith.com/
    LinkedIn: https://www.linkedin.com/in/alexa-griffith/

    Find out more about the guest at:

    LinkedIn: https://www.linkedin.com/in/salvatore-furino/
    Rootly Interview: https://rootly.com/humans-of-reliability/salvatore-furino
    Reliability at Scale Talk: https://youtu.be/J-VrU5JHPlk?si=8aV8acy57NWX30KA
    Bloomberg Careers: https://bloomberg.avature.net/careers/SearchJobs

    Chapters

    00:00 - Introduction: Reliability in a world reshaped by generative AI
    02:22 - The importance of seamless, background system design
    04:41 - Becoming a Customer Reliability Engineer at Bloomberg
    05:17 - Clarifying the CRE role and its customer focus
    08:02 - The importance of observability and high-scale performance in finance
    09:00 - Balancing technical and cultural aspects of reliability
    10:19 - Coaching teams to be proactive using error budgets and SLIs
    12:21 - The social-technical system: People, processes, and tools
    13:06 - Mediation of differing opinions on reliability practices
    15:06 - The nuanced approach to alerting and incident response
    17:08 - The significance of tiered SLOs and the concept of error budgets
    21:08 - Using signals like latency, correctness, availability, saturation in system measurement
    22:53 - The impact of service level "nines" on system design and resilience
    28:00 - Handling non-determinism and trust in AI responses
    33:01 - Error budgets and their role in managing deployments
    34:10 - The challenge of achieving five nines and data durability considerations
    40:03 - Adapting SLOs for GenAI systems: core principles remain intact
    42:23 - Measuring non-deterministic AI responses and quality proxies
    44:41 - The ongoing importance of reliability even in AI/ML contexts
    47:25 - Reacting to error budget exhaustion and proactive mitigation
    50:42 - The significance of involving cross-functional teams during outages
    55:36 - Advocating reliability investment to leadership
    56:24 - The customer perspective: reliability as a fundamental feature
    58:42 - Connecting with Sal Furino: where to follow his work and learn more about Bloomberg's engineering culture
    59:20 - Final advice: Focus on user happiness to avoid common pitfalls in adopting SLOs
    54 min
  • Laila: Reinventing Dating as a Social Marketplace with Kaan Divitoğlu
    May 10 2026

    In this episode of Alexa’s Input (AI), I sit down with Kaan Divitoğlu, founder of Laila, a New York-based startup rethinking online dating as a social marketplace centered around real plans instead of endless swiping.

    We talk about why traditional dating apps struggle to create real-world connection, how marketplace dynamics shape modern dating behavior, and why Kaan believes the future of dating products is less about “matching soulmates” and more about helping people actually get out on first dates.

    Kaan shares what he’s learned building a product around something emotional, unpredictable, and deeply human: connection.

    We also get into:
    • The metrics behind dating products and user behavior
    • Why most matches never turn into real dates
    • Designing around human psychology and social incentives
    • AI in dating apps — where it helps and where it shouldn’t
    • The process of building Laila
    • Social media growth, creator strategies, and startup distribution
    • Why Kaan thinks apps themselves may eventually disappear


    Links

    Watch: https://www.youtube.com/@alexa_griffith

    Read: https://alexasinput.substack.com/

    Listen: https://creators.spotify.com/pod/profile/alexagriffith/

    More: https://linktr.ee/alexagriffith


    Learn more about the host at

    Website: https://alexagriffith.com/

    LinkedIn: https://www.linkedin.com/in/alexa-griffith/


    Find out more about the guest at:

    LinkedIn: https://www.linkedin.com/in/kaan-divitoglu-152779105/

    Laila Website: https://laila.nyc

    Laila Instagram: https://www.instagram.com/laila.social


    Chapters


    00:00 Introduction to Laila and Its Concept

    04:10 The Journey of Building Laila

    08:43 User Feedback and Validation

    13:35 Metrics of Success in Dating Apps

    18:23 Differentiation in the Dating App Market

    22:54 Understanding User Behavior and Expectations

    27:37 Challenges in the Dating Landscape

    29:50 Loneliness and Social Skills in Modern Dating

    30:51 AI's Role in Dating Apps

    34:20 The Future of Dating Apps and User Experience

    38:19 Building Community Through Events and Social Media

    42:54 Navigating Social Media Marketing

    46:00 Rapid Fire Insights on Dating and Relationships

    53:33 Outro


    Keywords

    dating app, AI, product design, real-world connections, marketplace, user engagement, social media, social tech, startup, innovation

    54 min
  • The Creative Founder Mindset with Brady Jordan
    Mar 19 2026

    In this episode, Alexa Griffith interviews Brady Jordan, a creative director and entrepreneur, who shares his journey from aspiring software engineer to the founder of Clip Play Media and the photo app Y2Cam. Brady discusses the intersection of creativity and technology, the importance of storytelling in video production, and the challenges of self-employment. He emphasizes the need for resilience, adaptability, and a consumer-first approach in product development, while also exploring the significance of networking and community building in achieving success.


    Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith

    Read: https://alexasinput.substack.com/

    Listen: https://creators.spotify.com/pod/profile/alexagriffith/

    More Links: https://linktr.ee/alexagriffith


    Find out more about the host, Alexa Griffith, at:

    Website: https://alexagriffith.com/

    LinkedIn: https://www.linkedin.com/in/alexa-griffith/


    Find out more about the guest at:

    Website: https://www.bradyjordan.com/


    Chapters


    00:00 Introduction to Brady Jordan and His Journey

    06:45 The Birth of Clip Play Media

    14:58 Quality vs. Consistency in Content Creation

    24:51 Y2Cam: A Solution to Frustration

    30:51 Cost and Infrastructure of App Development

    35:30 Navigating the Challenges of Self-Employment

    42:51 Marketing Strategies for App Success

    49:04 The Value-Based Approach to Creation

    59 min
  • Securing the Software Supply Chain with Justin Cappos
    Feb 17 2026

    Modern software is built on layers and layers of code. So how do we know we can trust it?

    In this episode of Alexa’s Input (AI), Alexa Griffith sits down with Justin Cappos, professor of computer science at NYU and a leading expert in software supply chain security, to unpack what trust really means in today’s digital infrastructure.

    From package managers and dependency chains to large-scale outages and AI systems built on inherited code, Justin explains why many security failures aren’t random accidents; they’re predictable consequences of weak process, misaligned incentives, and insecure design.

    They discuss:

    • Why security only becomes visible when something breaks

    • The difference between unavoidable failure and negligence

    • How modern software supply chains amplify small mistakes

    • The role of leadership and culture in preventing breaches

    • Why verification systems like TUF and in-toto matter more than ever

    As AI accelerates development and increases system complexity, the need for verifiable trust only grows. This episode is a practical look at the invisible infrastructure that keeps modern software, and increasingly modern AI, from collapsing under its own complexity.


    Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith

    Read: https://alexasinput.substack.com/

    Listen: https://creators.spotify.com/pod/profile/alexagriffith/

    More: https://linktr.ee/alexagriffith


    Website: https://alexagriffith.com/

    LinkedIn: https://www.linkedin.com/in/alexa-griffith/


    Find out more about the guest at:

    Website: https://engineering.nyu.edu/faculty/justin-cappos

    NYU page: https://ssl.engineering.nyu.edu/personalpages/jcappos/

    Wikipedia: https://en.wikipedia.org/wiki/Justin_Cappos



    Chapters


    00:00 Introduction to Justin Cappos and His Work

    01:17 The Importance of Security in Software Systems

    03:50 Understanding Security Breaches: Mistakes vs. System Design Problems

    06:34 Cultural Factors in Security Failures

    09:25 Justin's Journey in Software Security

    12:03 The Role of Academia in Enterprise Security

    14:10 Evaluating Enterprise Security Systems

    16:58 Foundational Projects in Software Security

    19:21 AI Security Concerns and Future Directions

    24:59 The Need for MCP 2.0

    28:57 Security Challenges with LLMs

    32:33 Designing Secure AI Systems

    37:14 Ethical Dilemmas in AI Decision-Making

    40:17 The Role of AI in Open Source

    43:44 Trust and Mindset in AI Security


    49 min
  • The Artificial Immune System with Wendy Chin, PureCipher CEO
    Feb 16 2026

    As AI systems grow more autonomous, the question is no longer just what they can do, but whether we can trust the data and models behind their decisions. In this episode of Alexa’s Input (AI), Alexa Griffith talks with Wendy Chin, CEO of PureCipher, about building what she calls an artificial immune system for AI, a framework designed to make data, models, and inference tamper-evident across the AI lifecycle.

    They unpack what data poisoning really means (training data, weights and biases, inference inputs), why small amounts of targeted poison can create outsized model misbehavior, and how generative AI lowers the barrier to sophisticated malware. The conversation expands into the security implications of agent-to-agent communication via MCP, digital twins, and why we don’t have the luxury of “shipping now and securing later.” It’s a wide-ranging discussion that moves from practical threat models to the philosophical frontier of what happens as AI becomes more human-like, and more autonomous.


    Podcast Links

    Watch: https://www.youtube.com/@alexa_griffith

    Read: https://alexasinput.substack.com/

    Listen: https://creators.spotify.com/pod/profile/alexagriffith/

    More: https://linktr.ee/alexagriffith


    Website: https://alexagriffith.com/

    LinkedIn: https://www.linkedin.com/in/alexa-griffith/


    Find out more about the guest at:

    LinkedIn: https://www.linkedin.com/in/wendy-chin-ctg/

    Website: https://www.purecipher.com/


    Chapters

    00:00 Introduction to AI Security

    01:16 Understanding Data Poisoning

    04:38 The Dangers of Malware in AI

    07:46 AI's Moral Dilemmas and Decision Making

    08:45 Building Empathy in AI

    13:07 The Role of Good Data in AI Training

    17:02 PureCipher's Artificial Immune System

    22:34 Digital Twins and Their Implications

    25:22 Nurturing AI Like a Child

    30:53 Data Therapy for AI

    36:13 The Future of AI and Human Interaction

    38:45 The Dark Side of AI: Hacking and Security

    45:03 Global Perspectives on AI Security

    48:11 MCP Agents and Security Concerns

    51:41 Philosophical Implications of AI and Human Connection

    01:00:04 The Sci-Fi Future of AI and Humanity

    1 hr 6 min