Episodes

  • #006 - The Subtle Art of Inference with Adam Grzywaczewski
    Jan 22 2026

    In this episode of The Private AI Lab, Johan van Amersfoort speaks with Adam Grzywaczewski, a senior Deep Learning Data Scientist at NVIDIA, about the rapidly evolving world of AI inference.


    They explore how inference has shifted from simple, single-GPU execution to highly distributed, latency-sensitive systems powering today’s large language models. Adam explains the real bottlenecks teams face, why software optimization and hardware innovation must move together, and how NVIDIA’s inference stack—from TensorRT-LLM to Dynamo—enables scalable, cost-efficient deployments.


    The conversation also covers quantization, pruning, mixture-of-experts models, AI factories, and why inference optimization is becoming one of the most critical skills in modern AI engineering.


    Topics covered


    • Why inference is now harder than training

    • Autoregressive models and KV-cache challenges

    • Mixture-of-experts architectures

    • NVIDIA Dynamo and TensorRT-LLM

    • Hardware vs software optimization

    • Quantization, pruning, and distillation

    • Latency vs throughput trade-offs

    • The rise of AI factories and DGX systems

    • What’s next for AI inference

    53 min
  • #005 - The Why, What, and How of MCP with Maxime Colomès
    Jan 8 2026

    In this episode of The Private AI Lab, Johan van Amersfoort talks with Maxime Colomès about the Model Context Protocol (MCP)—one of the most important emerging standards in AI today.


    MCP is often described as the USB-C of AI: a universal way for AI models to connect to tools, data sources, and real-world systems. Maxime explains what MCP is, how it works, and why its recent donation to the Linux Foundation is such a major milestone for the AI ecosystem.


    They explore real-world enterprise use cases, MCP security considerations, private AI architectures, and how MCP integrates with platforms like OpenShift AI. The conversation also touches on developer productivity, AI agents that can take action, and the future of personal, privacy-preserving AI assistants.


    Key topics


    • What the Model Context Protocol (MCP) is and why it matters

    • MCP vs traditional APIs and plugin systems

    • Enterprise MCP architectures and gateways

    • MCP and private AI / data sovereignty

    • OpenShift AI and MLOps workflows

    • Security risks and best practices with MCP

    • Community MCP servers and registries

    • Future MCP use cases and predictions

    1 hr 3 min
  • #004 - The Past, Present, and Future of VMware Private AI Services with Frank Denneman
    Dec 23 2025

    In this episode of The Private AI Lab, Johan sits down with Frank Denneman to explore the past, present, and future of VMware’s Private AI portfolio.

    This conversation goes beyond AI buzzwords and marketing fluff. Together, Johan and Frank dive deep into the real infrastructure and resource management challenges that emerge when AI workloads enter enterprise environments. GPUs, scheduling, isolation, and platform design all take center stage—viewed through the lens of real-world VMware deployments.

    If you are an infrastructure architect, platform engineer, or IT decision-maker designing AI behind the firewall, this episode provides grounded insights into what actually matters.


    🔍 What you’ll learn in this episode


    • How VMware’s Private AI strategy has evolved over time

    • Why AI workloads fundamentally change infrastructure assumptions

    • The importance of resource management for GPU-backed workloads

    • Key architectural trade-offs when running AI on-prem

    • How to think about the future of enterprise AI platforms


    🎧 Listen & Subscribe


    For more experiments, insights, and behind-the-firewall AI discussions, visit johan.ml.


    Experiment complete. Until the next one — stay curious.

    1 hr 1 min
  • #003 - OpenShift AI, DGX Spark & the Future of Private AI — with Robbie Jerrom (Red Hat)
    Dec 11 2025

    This episode of The Private AI Lab features Robbie Jerrom, Principal Technologist AI at Red Hat, for a deep dive into Private AI, from the DGX Spark to OpenShift AI and the future of agentic systems.


    Topics we cover:


    • How Robbie uses the DGX Spark for home-lab AI

    • Why developers are moving from cloud GPUs to local devices

    • OpenShift AI as a consistent platform from experiment to production

    • The best open-source components for modern AI stacks

    • Why 79% of POCs never reach production — and how to avoid that

    • The next wave: agentic AI and enterprise automation


    Watch on YouTube:

    https://www.youtube.com/watch?v=jjyB8w_cpb0


    More episodes & articles at johan.ml

    1 hr 3 min
  • #002 - DGX Spark Review — With Andrew Foe (Iodis & HyperAI)
    Nov 27 2025

    In this episode of The Private AI Lab, Johan is joined by Andrew Foe, CEO of Iodis and HyperAI, to explore the NVIDIA DGX Spark from a real-world perspective.


    We cover:


    • Unboxing & first impressions

    • Bring-up and setup tips

    • Everyday usability

    • What customers love

    • The most common misconceptions

    • Why preorder demand exploded in the BeNeLux region


    A must-listen for anyone exploring Private AI hardware or considering the DGX Spark.


    🎧 Watch on YouTube: https://www.youtube.com/watch?v=jENCTgcAWsI

    💡 More experiments: https://johan.ml

    1 hr 1 min
  • #001 - Fast, Sovereign, and Local: What SUSE AI Taught Me in 40 Minutes
    Nov 13 2025

    In the premiere of The Private AI Lab, Johan van Amersfoort is joined by SUSE AI Specialist Eric Lajoie to talk about sovereign AI, deploying chatbots in under an hour, and the unexpected hazards of robotic dogs.

    Timestamps:

    00:00 – Intro & guest welcome

    01:04 – How to pronounce “Lajoie”?

    02:00 – What Eric actually does at SUSE

    05:14 – Eric’s AI fail: a very personal RAG demo

    10:06 – What Private AI means to Eric

    11:30 – How SUSE AI works (chatbots, Kubernetes, Rancher, and more)

    21:30 – RAG architecture explained

    26:00 – 40-minute SUSE AI deployment?!

    31:00 – Real-world use cases (GPUaaS, observability, MCP)

    39:00 – AI sovereignty in Europe

    43:00 – Wrap-up & key takeaways

    54:00 – Eric’s prediction for the next 12 months in AI

    57:40 – Johan’s robot dog fail

    60:00 – Outro + where to follow Eric

    🔗 Links & Mentions

    • Learn more about SUSE AI: https://suse.com

    • Follow Eric: https://www.linkedin.com/in/elajoie/ or lajoie.de

    • Check out the companion post at https://johan.ml/

    • Related episode: When Shit Hits The Fan ft. Eric – https://open.spotify.com/episode/2IeLq8WT7eqMJPVkcUMg3G?si=eaf7a529a75e4078

    58 min
  • The Private AI Lab
    Oct 7 2025

    Welcome to The Private AI Lab — the podcast where we experiment, explore, and debate the future of Artificial Intelligence behind the firewall. I’m your host, Johan, and every month I invite a guest into the lab to break down real-world use cases, challenges, and innovations shaping Private AI. Brought to you by Johan.ml


    Be the first to know when a new episode drops by subscribing to the podcast!

    1 min