Episodes

  • 📔 SynthID-Text: Scalable Watermarking for Large Language Model Outputs
    Apr 16 2025

    This paper introduces SynthID-Text, a novel and scalable method for watermarking the output of large language models (LLMs). The technique aims to make AI-generated text identifiable while preserving text quality and computational efficiency. The authors detail the algorithm's design, implementation, and evaluation, demonstrating superior detectability compared to existing watermarking schemes. The paper also highlights the successful integration and live deployment of SynthID-Text within Google's Gemini models, a significant step towards responsible LLM usage. (A simplified sketch of the keyed-scoring and tournament-sampling idea appears after the episode list.)

    32 min
  • 🤖 Inference-Time Scaling for Generalist Reward Modeling
    Apr 10 2025

    This paper explores improving reward modeling (RM) for large language models (LLMs) through better inference-time scalability. The authors introduce Self-Principled Critique Tuning (SPCT), a learning method that trains RMs, via online reinforcement learning, to generate their own guiding principles and accurate critiques. Their approach, embodied in the DeepSeek-GRM models, uses pointwise generative reward modeling for greater flexibility. By combining parallel sampling with a meta RM that refines the reward voting process, they demonstrate significant improvements in the quality and scalability of their GRMs across various benchmarks. Notably, inference-time scaling with their method is competitive with, or better than, simply increasing model size. (A rough sketch of the sample-and-vote aggregation idea appears after the episode list.)

    18 min
  • 🛡️ CaMeL: Defeating Prompt Injections with Capability-Based Security
    Apr 8 2025

    This paper introduces CaMeL, a novel security defence designed to protect Large Language Model (LLM) agents from prompt injection attacks that can occur when they process untrusted data. CaMeL creates a protective layer around the LLM, explicitly separating and tracking the control and data flows that originate from trusted user queries, so that malicious untrusted data cannot manipulate the program's execution. The system employs a custom Python interpreter to enforce security policies and prevent unauthorised data exfiltration, using a concept of "capabilities" to govern data flow. Evaluated on the AgentDojo benchmark, CaMeL demonstrated a significant reduction in successful attacks compared with undefended models and other existing defence mechanisms, often with minimal impact on the agent's ability to complete tasks. (A toy illustration of capability-tagged values appears after the episode list.)

    24 min
  • ☁️ SkyServe: Spot Instance AI Model Serving Across Clouds
    Apr 6 2025

    Serving demanding AI models cost-effectively and reliably is challenging because of GPU costs and strict service requirements. This paper introduces SpotHedge, a policy that intelligently uses discounted spot instances across different cloud regions to lower costs while maintaining high availability. SkyServe, the system built upon this policy, dynamically manages a mix of spot and on-demand replicas, proactively hedging against spot instance preemptions and unavailability. Evaluations show SkyServe significantly reduces costs and improves latency compared to existing solutions by diversifying resources and adapting to market conditions. This work demonstrates the feasibility of using spot instances for AI model serving without compromising service quality. (A rough sketch of the spot/on-demand hedging decision appears after the episode list.)

    23 min
  • 🔬 BIG-bench: Quantifying Language Model Capabilities
    Apr 6 2025

    This document introduces BIG-bench, a large and diverse benchmark designed to evaluate the capabilities of large language models across over two hundred challenging tasks. It highlights the limitations of existing benchmarks and argues for the necessity of more comprehensive assessments to understand the transformative potential of these models. The paper presents performance results for various models, including Google's BIG-G and OpenAI's GPT, alongside human rater baselines, revealing that while model performance generally improves with scale, it remains below human levels. Furthermore, the research explores aspects like model calibration, the impact of task phrasing, and the presence of social biases, offering insights into the strengths and weaknesses of current language models.

    19 min
  • 🤖 AlphaDev: AI Learns Faster Sorting Algorithms
    Apr 6 2025

    This research introduces AlphaDev, a novel deep reinforcement learning agent that discovers more efficient sorting algorithms than existing human-developed methods by formulating the search as a single-player game at the CPU instruction level. AlphaDev outperformed the existing benchmarks for small sorts (3-5 elements) and variable-length sorts by optimising for actual measured latency, leading to new algorithmic discoveries such as the "swap move" and "copy move". These optimised sorting routines have been integrated into the LLVM standard C++ library, demonstrating real-world impact on widely used software. The study also compares AlphaDev with stochastic search methods, highlighting its superior exploration of complex algorithmic spaces, especially those involving branching. (A compare-exchange sorting-network sketch, the kind of routine AlphaDev optimises, appears after the episode list.)

    17 min
  • 🧠 LLMs and Multi-Hop Queries: A Latent Reasoning Analysis
    Apr 4 2025

    This episode covers three papers. The first investigates whether Leela, a chess-playing neural network, learns look-ahead: using activation patching and attention analysis, it finds evidence that the network internally represents and uses future moves in its decision-making. The second explores the limitations of large language models (LLMs) in multi-hop reasoning, hypothesising a sequential "knowledge-extraction module" and locating these processes within the network with the Patchscopes technique, complemented by back-patching experiments that probe the model's reasoning pathway. The third asks whether LLMs share representations of grammatical concepts across diverse languages; by training sparse autoencoders and applying causal interventions, it shows that abstract grammatical concepts are often encoded in shared feature directions, suggesting a degree of language-independent understanding. (A bare-bones activation-patching example appears after the episode list.)

    15 min
  • 🧠 QLoRA: Efficient Finetuning of Quantized Large Language Models
    Apr 2 2025

    This research introduces QLoRA, a novel method for efficiently finetuning large language models by quantising the pretrained model to 4-bit precision and training Low-Rank Adapters on top. The approach drastically reduces memory usage, enabling the finetuning of models with up to 65 billion parameters on a single 48GB GPU while maintaining 16-bit finetuning performance. Key innovations include the 4-bit NormalFloat (NF4) data type, double quantisation, and paged optimisers to manage memory spikes. Using QLoRA, the authors developed Guanaco, a family of models that achieves competitive performance with ChatGPT on the Vicuna benchmark and demonstrates state-of-the-art chatbot capabilities. The paper also examines the importance of data quality over quantity in finetuning and provides an analysis of chatbot evaluation methods, including a comparison between human and GPT-4 assessments. (A minimal LoRA-adapter sketch appears after the episode list.)

    16 min
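
For the SynthID-Text episode, the following is a minimal, self-contained sketch of generation-time watermarking via keyed scores and tournament-style sampling. It is not the published SynthID-Text algorithm: the hash-based g_value function, the toy candidate vocabulary, and the single-pass detection score are simplifications introduced here for illustration.

```python
import hashlib
import random

def g_value(key: str, context: tuple, token: str) -> float:
    """Pseudorandom score in [0, 1) from a keyed hash of the recent
    context and the candidate token (illustrative only)."""
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def tournament_sample(candidates: list[str], key: str, context: tuple,
                      rng: random.Random) -> str:
    """Pick a token by pairwise tournaments: the candidate with the higher
    g-value wins each round, biasing output towards high scores."""
    pool = list(candidates)
    rng.shuffle(pool)
    while len(pool) > 1:
        winners = []
        for a, b in zip(pool[0::2], pool[1::2]):
            winners.append(a if g_value(key, context, a) >= g_value(key, context, b) else b)
        if len(pool) % 2:
            winners.append(pool[-1])
        pool = winners
    return pool[0]

def detect_score(key: str, tokens: list[str], window: int = 4) -> float:
    """Detection: mean g-value over the text; watermarked text should score
    noticeably above the ~0.5 expected for unwatermarked text."""
    scores = [g_value(key, tuple(tokens[max(0, i - window):i]), t)
              for i, t in enumerate(tokens)]
    return sum(scores) / len(scores)

rng = random.Random(0)
toks: list[str] = []
for _ in range(20):
    toks.append(tournament_sample(["the", "a", "cat", "sat", "mat"],
                                  "secret-key", tuple(toks[-4:]), rng))
print(detect_score("secret-key", toks))  # above 0.5 for the watermarked sequence
```

In the real scheme, candidates are drawn from the LLM's own next-token distribution rather than a fixed list, which is what lets the watermark preserve text quality.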
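
For the inference-time scaling episode, here is a rough sketch of the sample-and-vote idea: draw several independent principle-plus-critique generations, let a meta reward model rank them, and aggregate the surviving pointwise scores. The generate_critique and meta_rm_score callables are hypothetical stand-ins for model calls, and the top-k averaging is a simplified stand-in for the paper's voting procedure.

```python
from statistics import mean
from typing import Callable

def scaled_reward(prompt: str, response: str,
                  generate_critique: Callable[[str, str], tuple[str, float]],
                  meta_rm_score: Callable[[str], float],
                  k: int = 8, keep: int = 4) -> float:
    """Inference-time scaling sketch: sample k principle-plus-critique
    generations, each ending in a pointwise score, then keep and average
    the scores whose critiques the meta RM trusts most."""
    samples = [generate_critique(prompt, response) for _ in range(k)]
    ranked = sorted(samples, key=lambda s: meta_rm_score(s[0]), reverse=True)
    return mean(score for _, score in ranked[:keep])

# Toy usage with stub "models" that return a critique string and a score.
stub_scores = iter([6.5, 7.0, 5.5, 7.5])
demo = scaled_reward(
    "Explain tides.", "Tides are caused by the Moon's gravity.",
    generate_critique=lambda p, r: ("principle: factual accuracy ...", next(stub_scores)),
    meta_rm_score=lambda critique: len(critique),  # stand-in quality signal
    k=4, keep=2)
print(demo)
```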
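
For the CaMeL episode, the snippet below is a toy illustration of capability-based data-flow tracking: values carry metadata about their sources and permitted readers, and a policy check runs before a side-effecting tool call. The Tagged class, the combine rule, and the send_email policy are invented for illustration and are far simpler than CaMeL's interpreter-enforced capabilities.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Tagged:
    """A value paired with capability metadata: where it came from and
    who is allowed to read it (a toy stand-in for CaMeL's capabilities)."""
    value: str
    sources: frozenset = field(default_factory=frozenset)
    readers: frozenset = field(default_factory=frozenset)

def combine(a: Tagged, b: Tagged) -> Tagged:
    """Derived data keeps the union of its sources and the intersection
    of its permitted readers."""
    return Tagged(a.value + b.value, a.sources | b.sources, a.readers & b.readers)

def send_email(recipient: str, body: Tagged) -> None:
    """Policy check before a side-effecting tool call: refuse to send
    data the recipient is not allowed to read."""
    if recipient not in body.readers:
        raise PermissionError(f"policy: {recipient} may not receive this data")
    print(f"sending to {recipient}: {body.value}")

note = Tagged("meeting at 3pm", frozenset({"user"}), frozenset({"user@example.com"}))
web = Tagged(" (text from an untrusted page)", frozenset({"web"}), frozenset())

send_email("user@example.com", note)  # allowed: trusted user data
try:
    send_email("user@example.com", combine(note, web))  # tainted by web content
except PermissionError as err:
    print("blocked:", err)
```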
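
For the SkyServe episode, this is a crude sketch of the hedging intuition behind a SpotHedge-style policy: spread spot requests across zones, over-provision slightly, and bridge any current shortfall with on-demand replicas until spot capacity recovers. The function, its thresholds, and the returned plan format are assumptions made here, not SkyServe's actual policy or API.

```python
def plan_replicas(target: int, healthy_spot: int, num_zones: int,
                  over_provision: int = 1) -> dict[str, int]:
    """Decide how many spot replicas to request per zone (spread to hedge
    against correlated preemptions) and how many on-demand replicas to add
    so the service still meets its replica target right now."""
    spot_to_request = target + over_provision         # cheap over-provisioning
    spot_per_zone = -(-spot_to_request // num_zones)  # ceiling division for spread
    on_demand = max(0, target - healthy_spot)         # temporary fallback capacity
    return {"spot_per_zone": spot_per_zone,
            "zones": num_zones,
            "on_demand": on_demand}

# Example: target 4 replicas, only 2 spot replicas currently healthy, 3 usable zones.
print(plan_replicas(target=4, healthy_spot=2, num_zones=3))
```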
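
For the AlphaDev episode, the snippet shows a three-element compare-exchange sorting network in Python, the kind of small fixed-size routine whose assembly implementation AlphaDev shortens. It illustrates the target routine only, not the discovered instruction sequences.

```python
def cmp_exchange(x, y):
    """Compare-exchange: return the pair in order (min, max). In the
    assembly AlphaDev optimises, this maps to a compare plus
    conditional-move instructions rather than a branch."""
    return (x, y) if x <= y else (y, x)

def sort3(a, b, c):
    """Three-element sorting network: a fixed sequence of three
    compare-exchanges, independent of the input values."""
    a, b = cmp_exchange(a, b)
    b, c = cmp_exchange(b, c)
    a, b = cmp_exchange(a, b)
    return a, b, c

assert sort3(3, 1, 2) == (1, 2, 3)
assert sort3(2, 3, 1) == (1, 2, 3)
```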
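
For the latent-reasoning episode, here is a bare-bones activation-patching example using PyTorch forward hooks on a toy MLP: cache one layer's activation from a "source" input, then re-run a "target" input with that activation patched in and compare the outputs. The toy model and the choice of layer are placeholders and are unrelated to Leela or the LLMs studied in the papers.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
layer = model[0]  # the layer whose activation we cache and patch
cached = {}

def save_hook(module, inputs, output):
    # First pass: remember this layer's activation on the source input.
    cached["act"] = output.detach()

def patch_hook(module, inputs, output):
    # Second pass: replace the activation with the cached source activation.
    return cached["act"]

src, tgt = torch.randn(1, 4), torch.randn(1, 4)

handle = layer.register_forward_hook(save_hook)
_ = model(src)                  # cache the activation from the source run
handle.remove()

handle = layer.register_forward_hook(patch_hook)
patched_out = model(tgt)        # target run with the source activation patched in
handle.remove()

print(model(tgt), patched_out)  # clean vs. patched predictions on the target input
```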
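
For the QLoRA episode, the sketch below shows only the Low-Rank Adapter part in PyTorch: a frozen base linear layer plus a trainable low-rank update scaled by alpha/r. The 4-bit NF4 quantisation, double quantisation, and paged optimisers are omitted; freezing the base weights here merely stands in for storing them quantised.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B(A(x)).
    In QLoRA the frozen weights would additionally be stored in the 4-bit
    NF4 format; here they are simply kept frozen at full precision."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                  # frozen pretrained weights
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)           # adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable adapter parameters: {trainable}")  # only the low-rank factors
```

Because only the small adapter matrices receive gradients, optimizer state stays tiny, which is what makes finetuning very large frozen models feasible on a single GPU.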