Semi Doped

By: Vikram Sekar and Austin Lyons

About this podcast

The business and technology of semiconductors. Alpha for engineers and investors alike.

© 2026 Semi Doped
    Episodes
    • Nvidia CES 2026
      Jan 12 2026

      Episode Summary

      Austin and Vik break down NVIDIA’s CES 2026 keynote, focusing on Vera Rubin, DGX Spark and DGX Station, uneducated investor panic, and physical AI.

      Key Takeaways

      • DGX Spark brings server-class NVIDIA architecture to the desktop at low power, aimed at developers, enthusiasts, and enterprises experimenting locally.
      • DGX Station functions more like an on-prem mini AI rack: Grace Blackwell for inference and development without full racks.
      • The historical parallel is mainframes to minicomputers, expanding compute TAM rather than displacing cloud usage.
      • On-prem AI converts some GPU rental OpEx into CapEx, appealing to CFOs.
      • NVIDIA positioned autonomy as physical AI with vision-language-action models and early Mercedes-Benz deployments in 2026.
      • Vera Rubin integrates CPU, GPU, DPU, networking, and photonics into a single platform, emphasizing Ethernet for scale-out. (Where was the InfiniBand switch?)
      • The new Vera CPU highlights rising CPU importance for agentic workloads through higher core counts, SMT, and large LPDDR capacity.
      • Rubin GPU’s move to HBM4 and adaptive precision targets inference efficiency gains and lower cost per token.
      • Context memory storage elevates SSDs and DPUs, enabling massive KV cache offload beyond HBM and DRAM (sized in the sketch after this list).
      • Cable-less rack design and warm-water cooling show NVIDIA’s shift from raw performance toward manufacturability and enterprise polish.
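      The KV cache point is easiest to see with quick arithmetic. Below is a minimal sizing sketch in Python; the layer count, head count, context length, batch size, and precision are illustrative assumptions, not figures from the episode or any specific model.

        # Back-of-the-envelope KV cache sizing; all numbers are illustrative assumptions.
        def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem):
            # Keys and values are each [batch, kv_heads, seq_len, head_dim] per layer.
            return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

        layers, kv_heads, head_dim = 80, 8, 128   # hypothetical dense transformer
        seq_len, batch = 128_000, 32              # long context, many concurrent users
        fp8_bytes = 1                             # 8-bit KV precision

        gib = kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, fp8_bytes) / 2**30
        print(f"KV cache: {gib:.0f} GiB")         # ~625 GiB, far beyond one GPU's HBM

      At that scale the cache has to spill out of HBM, which is the opening for DRAM tiers, SSDs, and DPUs as context storage.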
      47 min
    • Insights from IEDM 2025
      Jan 8 2026

      Austin and Vik discuss key insights from the IEDM conference.

      They explore the significance of IEDM for engineers and investors, the networking opportunities it offers, and the latest innovations in silicon photonics, complementary FETs, NAND flash memory, and GaN-on-silicon chiplets.

      Takeaways

      • Penta-level NAND flash memory could disrupt the SSD market (see the bits-per-cell sketch after this list)
      • GaN-on-Silicon chiplets enhance power efficiency
      • Complementary FETs stack n- and p-type devices to keep density scaling on track
      • Optical scale-up has a power problem
      • The future of transistors is still bright
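      To put the penta-level point in perspective, here is the bits-per-cell arithmetic as a short Python sketch; this is general flash math, not data from the IEDM papers discussed.

        # Bits-per-cell arithmetic across NAND generations (general flash math).
        generations = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4, "PLC": 5}

        for name, bits in generations.items():
            levels = 2 ** bits                          # voltage states the cell must resolve
            density_vs_qlc = bits / generations["QLC"]  # density per cell relative to QLC
            print(f"{name}: {bits} bits/cell, {levels:>2} levels, {density_vs_qlc:.2f}x density vs QLC")

        # PLC packs 32 states into roughly the same voltage window QLC splits 16 ways,
        # so the ~25% density gain per cell comes at the cost of read margin and endurance.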


      42 min
    • Nvidia "Acquires" Groq
      Jan 5 2026

      Key Topics

      • What Nvidia actually bought from Groq and why it is not a traditional acquisition
      • Why the deal triggered claims that GPUs and HBM are obsolete
      • Architectural trade-offs between GPUs, TPUs, XPUs, and LPUs
      • SRAM vs HBM: speed, capacity, cost, and supply chain realities
      • Groq LPU fundamentals: VLIW, compiler-scheduled execution, determinism, ultra-low latency
      • Why LPUs struggle with large models and where they excel instead (see the capacity sketch after this list)
      • Practical use cases for hyper-low-latency inference:
        • Ad copy personalization at search latency budgets
        • Model routing and agent orchestration
        • Conversational interfaces and real-time translation
        • Robotics and physical AI at the edge
        • Potential applications in AI-RAN and telecom infrastructure
      • Memory as a design spectrum: SRAM-only, SRAM plus DDR, SRAM plus HBM
      • Nvidia’s growing portfolio approach to inference hardware rather than one-size-fits-all
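      The capacity side of the SRAM-versus-HBM trade-off is easy to sketch in Python. The per-chip capacities below are order-of-magnitude assumptions for illustration, not vendor specifications.

        # Rough capacity math behind the SRAM vs HBM trade-off; capacities are assumptions.
        import math

        def chips_to_hold_weights(params_billion, bytes_per_param, capacity_gb_per_chip):
            weight_gb = params_billion * bytes_per_param   # billions of params -> GB of weights
            return math.ceil(weight_gb / capacity_gb_per_chip)

        params_billion = 70        # a hypothetical 70B-parameter model
        bytes_per_param = 1        # 8-bit weights

        sram_gb_per_chip = 0.25    # hundreds of MB of on-chip SRAM per accelerator (assumed)
        hbm_gb_per_gpu = 192       # HBM capacity class of a current high-end GPU (assumed)

        print("SRAM-only chips needed:", chips_to_hold_weights(params_billion, bytes_per_param, sram_gb_per_chip))
        print("HBM GPUs needed:       ", chips_to_hold_weights(params_billion, bytes_per_param, hbm_gb_per_gpu))

      Hundreds of SRAM-only chips versus a single HBM GPU for the same weights is why LPUs shine on small, latency-critical models rather than frontier-scale ones.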

      Core Takeaways

      • GPUs are not dead. HBM is not dead.
      • LPUs solve a different problem: deterministic, ultra-low-latency inference for small models.
      • Large frontier models still require HBM-based systems.
      • Nvidia’s move expands its inference portfolio surface area rather than replacing GPUs.
      • The future of AI infrastructure is workload-specific optimization and TCO-driven deployment (a minimal routing sketch follows).
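      A minimal routing sketch of that workload-specific idea, in Python; the backend names, thresholds, and request fields are hypothetical and only show the shape of the decision.

        # Toy dispatcher illustrating workload-specific inference routing.
        # Backend names, thresholds, and the request shape are hypothetical.
        from dataclasses import dataclass

        @dataclass
        class InferenceRequest:
            model_params_b: float     # model size in billions of parameters
            latency_budget_ms: float  # end-to-end latency the caller can tolerate

        def route(req: InferenceRequest) -> str:
            fits_sram_class = req.model_params_b <= 8        # small enough for SRAM-resident weights
            ultra_low_latency = req.latency_budget_ms < 50   # e.g. ad copy at search latency budgets
            if fits_sram_class and ultra_low_latency:
                return "lpu-pool"      # deterministic, compiler-scheduled inference
            return "gpu-hbm-pool"      # large models, batched HBM-backed serving

        print(route(InferenceRequest(model_params_b=3, latency_budget_ms=20)))    # -> lpu-pool
        print(route(InferenceRequest(model_params_b=70, latency_budget_ms=500)))  # -> gpu-hbm-pool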


      41 min