#006 - The Subtle Art of Inference with Adam Grzywaczewski


About this audio content

In this episode of The Private AI Lab, Johan van Amersfoort speaks with Adam Grzywaczewski, a senior Deep Learning Data Scientist at NVIDIA, about the rapidly evolving world of AI inference.


They explore how inference has shifted from simple, single-GPU execution to highly distributed, latency-sensitive systems powering today’s large language models. Adam explains the real bottlenecks teams face, why software optimization and hardware innovation must move together, and how NVIDIA’s inference stack—from TensorRT-LLM to Dynamo—enables scalable, cost-efficient deployments.
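
To make that memory pressure concrete, here is a minimal back-of-the-envelope sketch (not from the episode; the model shape and numbers are illustrative assumptions, loosely resembling a 70B-class decoder-only model with grouped-query attention) of how quickly the KV cache grows with sequence length and batch size:

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    """Rough KV-cache footprint for a decoder-only transformer:
    2 tensors (K and V) per layer, each of shape [batch, kv_heads, seq_len, head_dim]."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

# Illustrative (assumed) numbers: 80 layers, 8 KV heads, head_dim 128, fp16 cache.
gb = kv_cache_bytes(layers=80, kv_heads=8, head_dim=128, seq_len=4096, batch=32) / 1e9
print(f"~{gb:.0f} GB of KV cache")  # grows linearly with sequence length and batch size
```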


The conversation also covers quantization, pruning, mixture-of-experts models, AI factories, and why inference optimization is becoming one of the most critical skills in modern AI engineering.
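
As a rough illustration of the quantization idea mentioned above (a generic sketch, not the method discussed in the episode or TensorRT-LLM's implementation), symmetric per-tensor int8 weight quantization trades a small amount of accuracy for roughly 4x less weight memory:

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Map fp32 weights to int8 plus a single fp32 scale (symmetric, per-tensor)."""
    scale = np.abs(weights).max() / 127.0              # largest magnitude maps to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an fp32 approximation of the original weights."""
    return q.astype(np.float32) * scale

# Toy usage: quantize a small random weight matrix and measure the error.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize(q, s)).max())
```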


Topics covered


  • Why inference is now harder than training

  • Autoregressive models and KV-cache challenges

  • Mixture-of-experts architectures

  • NVIDIA Dynamo and TensorRT-LLM

  • Hardware vs software optimization

  • Quantization, pruning, and distillation

  • Latency vs throughput trade-offs

  • The rise of AI factories and DGX systems

  • What’s next for AI inference
