Just Five Mins!

By: Almost 5 minutes!
About this podcast

Take a break. Learn something new. Coffee-powered podcasts on tech topics in just five mins (ish!)

www.justfivemins.com
David Sheardown
    Episodes
    • Episode 140 - LLMs Vectors Search and Quantization Explained
      Aug 23 2025

      Okay, a little experiment. I have been researching how LLMs work with vectors and how AI search works for a while now. But rather than warbling on for a while (and for more than 5 mins!), I have tried an AI approach to explaining the research ;)

      In fairness, I did listen back to this, and personally I was quite impressed. The explanation about quantization methods was particularly useful.

      If vectors, AI search and quantization mean nothing to you - take a listen :)

      Hey, this is a free podcast. However, if you feel you want to support me then check out Patreon. I will have some more detailed deep dives for Patreon members as well as one-to-one sessions.

      Or just buy a unicorn a coffee here!

      Oh, and yes, I have ended up on YouTube (doesn’t everyone eventually?):

      https://www.youtube.com/@justfifteenmins but don’t worry, no ugly face to worry about (yet!).



      Get full access to Just Five Mins! at www.justfivemins.com/subscribe
      13 min
    • Episode 139 - RAG is Expensive but is it really
      Aug 3 2025

      🧠 What RAG Actually Does

      RAG enhances LLMs by retrieving relevant external information (e.g. from documents or databases) at query time, then feeding that into the prompt. This allows the LLM to answer with up-to-date or domain-specific knowledge without retraining.
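
      The retrieve-then-prompt flow described above can be sketched in a few lines. This is a minimal illustration, not how any particular RAG library works: the word-overlap scoring stands in for real retrieval, and the function names (`tokenize`, `retrieve`, `build_prompt`), the sample passages, and the prompt wording are all assumptions made for the example.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages that share the most words with the question."""
    q = tokenize(question)
    ranked = sorted(passages, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved passages ahead of the question in one prompt string."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical knowledge base: the LLM was never trained on these facts.
passages = [
    "The office coffee machine is serviced every Friday.",
    "Annual leave requests need manager approval.",
    "The coffee machine takes whole beans only.",
]
question = "When is the coffee machine serviced?"

# Retrieval happens at query time; the model itself is not retrained.
prompt = build_prompt(question, retrieve(question, passages))
print(prompt)
```

      The resulting string is what would be sent to the LLM: the two coffee-machine passages plus the question, with the unrelated leave policy left out.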

      💸 Is RAG Expensive?

      Yes, it can be, especially if:

      * You repeatedly reprocess large documents for every query.

      * You use high token counts to include raw content in prompts.

      * You rely on real-time parsing of files (e.g. PDFs or Excel) without preprocessing.

      This is where vector storage and embedding optimization come in.
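
      As a rough illustration of the cost drivers listed above, the arithmetic below compares pasting a whole document into every prompt with sending only a few retrieved chunks. Every figure here (token counts, query volume, price per 1,000 tokens) is hypothetical, as is the helper name `prompt_cost`; the point is only the shape of the calculation.

```python
def prompt_cost(tokens_per_query: int, queries: int, price_per_1k_tokens: float) -> float:
    """Input-token cost of sending `tokens_per_query` tokens with each query."""
    return tokens_per_query * queries / 1000 * price_per_1k_tokens


# Hypothetical figures, not taken from any real price list.
QUERIES = 1_000
PRICE_PER_1K = 0.01           # assumed price per 1,000 input tokens
RAW_DOCUMENT_TOKENS = 50_000  # whole document pasted into every prompt
RETRIEVED_TOKENS = 1_500      # only a few relevant chunks in the prompt

raw_cost = prompt_cost(RAW_DOCUMENT_TOKENS, QUERIES, PRICE_PER_1K)
rag_cost = prompt_cost(RETRIEVED_TOKENS, QUERIES, PRICE_PER_1K)

print(f"raw document in every prompt: {raw_cost:.2f}")
print(f"retrieved chunks only:        {rag_cost:.2f}")
```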

      📦 Role of Vector Storage

      Instead of reloading and reprocessing documents every time:

      * Documents are chunked into smaller segments.

      * Each chunk is converted into a vector embedding.

      * These embeddings are stored in a vector database (e.g. FAISS, Pinecone, Weaviate).

      * At query time, the user’s question is embedded and matched against stored vectors to retrieve relevant chunks.

      This avoids reprocessing the original files and drastically reduces cost and latency.
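
      The four steps above (chunk, embed, store, query) can be sketched with the standard library alone. This is a toy under stated assumptions: `embed` uses normalised word counts as a stand-in for a real embedding model, and `VectorStore` is a hypothetical in-memory class, not the API of FAISS, Pinecone, or Weaviate.

```python
import math
from collections import Counter


def chunk(text: str, size: int = 8) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> dict[str, float]:
    """Toy embedding: L2-normalised word counts stored as a sparse vector."""
    counts = Counter(w.strip(".,?!").lower() for w in text.split())
    counts.pop("", None)
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(value * b.get(word, 0.0) for word, value in a.items())


class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self) -> None:
        self.items: list[tuple[dict[str, float], str]] = []

    def add(self, text: str) -> None:
        # Each chunk is embedded once, at indexing time.
        self.items.append((embed(text), text))

    def search(self, query: str, k: int = 1) -> list[str]:
        # Only the query is embedded here; stored chunks are not reprocessed.
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


document = (
    "Invoices are paid within thirty days of receipt. "
    "Refunds are issued to the original payment method. "
    "Support is available on weekdays from nine to five."
)

store = VectorStore()
for piece in chunk(document):
    store.add(piece)

print(store.search("How are refunds issued?"))
```

      The search returns the refunds chunk without touching `document` again, which is the saving the episode describes. A real system would swap in a proper embedding model and an indexed store.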

      ⚙️ Efficiency Strategies

      Here’s how to make RAG more efficient:

      | Strategy | Description | Benefit |
      | --- | --- | --- |
      | Vector Storage | Store precomputed embeddings | Avoids repeated parsing and embedding |
      | ANN Indexing | Use Approximate Nearest Neighbor search | Fast retrieval from large datasets |
      | Quantization | Compress embeddings (e.g. float8, int8) | Reduces memory footprint with minimal accuracy loss |
      | Dimensionality Reduction | Use PCA or UMAP to reduce vector size | Speeds up search and lowers storage cost |
      | Contextual Compression | Filter retrieved chunks before sending to LLM | Reduces token usage and cost |
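
      To make the Quantization strategy above concrete, here is a minimal sketch of symmetric int8 scalar quantization, one common way to compress embeddings. The vector values and function names are made up for illustration, and production vector databases typically use more elaborate schemes.

```python
from array import array


def quantize_int8(vec: list[float]) -> tuple[list[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    scale = max(abs(v) for v in vec) / 127 or 1.0  # fall back to 1.0 for an all-zero vector
    return [round(v / scale) for v in vec], scale


def dequantize(codes: list[int], scale: float) -> list[float]:
    """Recover approximate floats from the int8 codes."""
    return [c * scale for c in codes]


vec = [0.12, -0.50, 0.33, 0.07]  # hypothetical embedding values
codes, scale = quantize_int8(vec)
restored = dequantize(codes, scale)

# The rounding error per component is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(vec, restored))

# Storage comparison: 32-bit floats versus signed 8-bit integers.
float_bytes = len(array("f", vec).tobytes())
int8_bytes = len(array("b", codes).tobytes())

print(codes, f"max error {max_error:.5f}", f"{float_bytes} bytes -> {int8_bytes} bytes")
```

      The integer codes take a quarter of the space of 32-bit floats, and each restored value stays within half a quantization step of the original, which is the "minimal accuracy loss" trade-off in the table.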



      13 min
    • Episode 138 - UX Pilot UI Design with AI
      Jul 23 2025

      Design for UI/UX is obviously an art form, but can AI do as good a job, or better? Or, as may well be the case, is AI best used to help with the tedious stuff?

      UX Pilot

      Figma



      6 min