
The Private AI Lab

By: Johan van Amersfoort

About this podcast

The Private AI Lab is a monthly podcast where we explore the future of Artificial Intelligence behind the firewall. Hosted by Johan from Johan.ml, each episode invites industry experts, innovators, and thought leaders to discuss how Private AI is reshaping enterprises, technology, and society. From data sovereignty to air-gapped deployments, from GPUs to governance — this podcast uncovers the real-world experiments, failures, and breakthroughs that define the era of Private AI. 🎙️ New episode every month. 🌐 More at Johan.ml
    Episodes
    • #006 - The Subtle Art of Inference with Adam Grzywaczewski
      Jan 22 2026

      In this episode of The Private AI Lab, Johan van Amersfoort speaks with Adam Grzywaczewski, a senior Deep Learning Data Scientist at NVIDIA, about the rapidly evolving world of AI inference.


      They explore how inference has shifted from simple, single-GPU execution to highly distributed, latency-sensitive systems powering today’s large language models. Adam explains the real bottlenecks teams face, why software optimization and hardware innovation must move together, and how NVIDIA’s inference stack—from TensorRT-LLM to Dynamo—enables scalable, cost-efficient deployments.


      The conversation also covers quantization, pruning, mixture-of-experts models, AI factories, and why inference optimization is becoming one of the most critical skills in modern AI engineering.


      Topics covered


      • Why inference is now harder than training

      • Autoregressive models and KV-cache challenges

      • Mixture-of-experts architectures

      • NVIDIA Dynamo and TensorRT-LLM

      • Hardware vs software optimization

      • Quantization, pruning, and distillation

      • Latency vs throughput trade-offs

      • The rise of AI factories and DGX systems

      • What’s next for AI inference
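
      For listeners new to the quantization topic above, here is a minimal, generic sketch of the core idea (symmetric int8 quantization), written from first principles; it is not NVIDIA's implementation and is not from the episode:

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: map floats onto [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0  # one scale for the whole tensor
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original floats."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 2.4]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Round-trip error per element stays within about half a scale step.
```

      Storing weights as int8 instead of float32 cuts memory (and memory bandwidth) by roughly 4x, which is why quantization matters so much for inference cost.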

      53 min
    • #005 - The Why, What, and How of MCP with Maxime Colomès
      Jan 8 2026

      In this episode of The Private AI Lab, Johan van Amersfoort talks with Maxime Colomès about the Model Context Protocol (MCP)—one of the most important emerging standards in AI today.


      MCP is often described as the USB-C of AI: a universal way for AI models to connect to tools, data sources, and real-world systems. Maxime explains what MCP is, how it works, and why its recent donation to the Linux Foundation is such a major milestone for the AI ecosystem.


      They explore real-world enterprise use cases, MCP security considerations, private AI architectures, and how MCP integrates with platforms like OpenShift AI. The conversation also touches on developer productivity, AI agents that can take action, and the future of personal, privacy-preserving AI assistants.


      Key topics


      • What the Model Context Protocol (MCP) is and why it matters

      • MCP vs traditional APIs and plugin systems

      • Enterprise MCP architectures and gateways

      • MCP and private AI / data sovereignty

      • OpenShift AI and MLOps workflows

      • Security risks and best practices with MCP

      • Community MCP servers and registries

      • Future MCP use cases and predictions
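
      As background for the "USB-C of AI" framing above: MCP messages ride on JSON-RPC 2.0. A rough sketch of what a tool-call request looks like on the wire (the tool name `search_docs` and its arguments are hypothetical, for illustration only):

```python
import json

def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request in the general shape MCP uses for tool calls."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# A client asking an MCP server to run a (hypothetical) documentation-search tool:
msg = make_tool_call(1, "search_docs", {"query": "private AI"})
parsed = json.loads(msg)
```

      Because every tool is invoked through this one uniform envelope, a model only needs to speak JSON-RPC once to reach any MCP-compatible data source or system.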

      1 h 3 min
    • #004 - The Past, Present, and Future of VMware Private AI Services with Frank Denneman
      Dec 23 2025

      In this episode of The Private AI Lab, Johan sits down with Frank Denneman to explore the past, present, and future of VMware’s Private AI portfolio.

      This conversation goes beyond AI buzzwords and marketing fluff. Together, Johan and Frank dive deep into the real infrastructure and resource management challenges that emerge when AI workloads enter enterprise environments. GPUs, scheduling, isolation, and platform design all take center stage—viewed through the lens of real-world VMware deployments.

      If you are an infrastructure architect, platform engineer, or IT decision-maker designing AI behind the firewall, this episode provides grounded insights into what actually matters.


      🔍 What you’ll learn in this episode


      • How VMware’s Private AI strategy has evolved over time

      • Why AI workloads fundamentally change infrastructure assumptions

      • The importance of resource management for GPU-backed workloads

      • Key architectural trade-offs when running AI on-prem

      • How to think about the future of enterprise AI platforms


      🎧 Listen & Subscribe


      For more experiments, insights, and behind-the-firewall AI discussions, visit johan.ml.


      Experiment complete. Until the next one — stay curious.

      1 h 1 min