Episodes

  • AI Agents Under Fire, LLM Bias Runs Deep, and a Wizard of Oz Fail: The AI Argument EP68
    Aug 4 2025

    AI agents crumble faster than wet cardboard when under attack. A recent study proved it. Every single agent tested failed against prompt injections. That’s a 100% failure rate.

    Justin sees this as a fixable engineering problem with smart design and strict access controls.

    Frank isn’t convinced. Real-world complexity means isolation isn’t that simple.

    And while Justin rails against regulation, Frank points to the EU’s looming rules as a possible safety net.

    The bigger takeaway? Businesses racing to deploy open-ended agents could be building ticking time bombs. The safer bet might be narrow, well-scoped agents that automate specific tasks. But will hype win over common sense?

    From there, the debate shifts to a study exposing bias in LLMs. It found they recommend lower salaries for women and minority groups. Can removing personal details fix the problem, or is the bias baked in?

    Then it takes a technical turn with Chinese researchers using LLMs to design stronger models, before veering into the unexpected: a football club handing legal contracts to AI and a Wizard of Oz remake that left Vegas audiences unimpressed.

    02:12 Can any AI agent survive a prompt attack?
    14:51 Is AI quietly spreading bias everywhere?
    25:19 Are LLMs now designing better LLMs?
    29:32 Did United just make AI their star player?
    31:13 Did AI butcher the Wizard of Oz in Vegas?

    ► LINKS TO CONTENT WE DISCUSSED

    • Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition
    • Salary advice from AI low-balls women and minorities, says new report
    • "AlphaGo Moment" For Self Improving AI... can this be real?
    • Cambridge United partners with Genie AI to adopt AI for contract management
    • Is The Wizard of Oz With Generative AI Still The Wizard of Oz?


    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    35 min
  • EU Code of Conduct Clash, Zuck’s Big Bucks, and Model Owl Bias: The AI Argument EP67
    Jul 28 2025

A €300 million AI investment vanished overnight, and Justin says it’s a warning that Europe is sleepwalking into irrelevance: while the US plans nuclear power and light-touch rules, the EU is doubling down on regulation and failing to build the energy infrastructure AI needs.

    Frank argues regulation isn’t a handicap, it’s Europe’s best shot at leadership, setting the stage for global guardrails while others race blindly ahead.

Either way, Anthropic predicts training a frontier model could soon require up to five gigawatts of power, roughly enough to run millions of homes. Europe isn’t building that capacity. The US is.

    And that’s just the start.

    From Zuckerberg offering billion-dollar contracts to the cultural showdown between OpenAI and Google, this one packs a lot in.

    We also dive into how synthetic data can secretly pass on biases, why academic peer review might be gamed by prompt injections, and even LinkedIn’s bot problem.

    → 00:57 Why isn’t Amazon building its AI facility in Ireland?
    → 02:54 Will EU rules choke AI or make us leaders?
    → 14:39 Can Zuckerberg buy his way to AI dominance?
    → 20:37 Google vs OpenAI: who aced the math olympiad?
    → 29:44 Can AI bias spread through random numbers?
    → 35:01 Is AI gaming peer review AND your LinkedIn feed?

    ► SUBSCRIBE
    Don't forget to subscribe to get all the latest arguments.

    ► LINKS TO CONTENT WE DISCUSSED

    • Amazon drops €300m Irish investment on energy supply concerns
    • Anthropic: Build AI in America
    • Meta won’t sign EU’s AI Code, but who will?
    • The Epic Battle for AI Talent—With Exploding Offers, Secret Deals and Tears
    • Google Takes the Gold. OpenAI under fire.
    • A new study just upended AI safety
    • ICML’s Statement about subversive hidden LLM prompts


    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    41 min
  • ChatGPT Agent Surprise, Coding Agent Fail, and Elon’s Latest Stunts: The AI Argument EP66
    Jul 23 2025

    OpenAI just dropped a model that can plan a wedding trip, pick the perfect gift, and shop for shoes for you. The Agent update lets ChatGPT take a single instruction, break it into subtasks, and go off to handle all the details.

    They called it their most powerful model yet. So why did the launch feel so muted?

    Justin has theories.

And there were plenty of other big topics to cover. Justin asks whether small AI systems hooked up to real-world labs create bigger risks than giant language models while slipping past EU regulations.

    We also look at Perplexity’s move into AI-powered browsing, ask why coding agents sometimes make developers slower instead of faster, and then, of course, there’s Elon Musk.

What has Elon been up to? He’s turned Tesla’s latest expansion into a drawing of a you-know-what, turned Grok into an anime burlesque dancer (to put it politely), and still managed to land a massive DoD contract.

    Here’s the full set of questions we tackled:

    03:24 OpenAI's Agent model drop... why so quiet?
    11:19 Can small AI slip past EU regulators?
    16:14 Which secret model did OpenAI test here?
    19:01 Should you trust AI with your credit card?
    21:46 Perplexity's AI browser, game-changer or gimmick?
    24:02 Do coding agents actually make you slower?
    27:12 What fresh madness has Elon cooked up now?

    ► LINKS TO CONTENT WE DISCUSSED

    • Introducing ChatGPT agent: bridging research and action
    • This AI-powered lab runs itself—and discovers new materials 10x faster
    • AI finds hundreds of potential antibiotics in snake and spider venom
    • OpenAI's Secret INTERNAL Model Almost Wins World Coding Competition…
    • Perplexity’s Comet is here, and after using it for 48 hours I’m convinced AI web browsers are the future of the internet
    • What Actually Happens When Programmers Use AI Is Hilarious, According to a New Study
    • New Grok AI model surprises experts by checking Elon Musk’s views before answering
    • Grok Rolls Out Pornographic Anime Companion, Lands Department of Defense Contract
    • Tesla's expanded Robotaxi geofence in Austin has a very distinct shape. OK, it's a giant penis.


    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    ► YOUR INPUT
    Would you trust an AI agent with your credit card?

    32 min
  • Grok Crashes and Conquers, AI’s Cash Bonfire, and a Murderous Safety Cult: The AI Argument EP65
    Jul 14 2025

    Elon Musk’s AI, Grok, crashed into controversy, then crushed the competition all within hours.

    First, Grok 3 started praising Hitler. Then Grok 4 showed up and aced nearly every AI test.

    Justin serves up a juicy conspiracy theory: was Grok’s hateful public meltdown actually a cunning Musk masterplan, a dramatic stunt to expose AI's darker side?

Frank’s having none of it, comparing Musk to Marvel’s Tony Stark in Age of Ultron: well-meaning but recklessly creating an AI menace he can’t actually control.

But Grok 4 is legitimately groundbreaking. Justin gets excited about Grok’s unique 50/50 balance between pre-training and post-training. Yet despite its brainy brilliance, both Justin and Frank agree they’d rather eat their keyboards than trust Grok with anything important.

If you run a business, or you’re simply watching AI from a safe distance with popcorn, this episode is essential, especially if you like a dose of humour with your tech debates.

    Grok drama aside, Justin and Frank get stuck into more eyebrow-raising AI headlines from the week, including:

    • Why did Musk’s Grok spew hate speech?
    • Is Grok 4 now the smartest AI out there?
    • Will AI crash like subprime mortgages?
    • Did Marco Rubio’s AI clone scam top politicians?
    • Did AI safety fears just spark a murder cult?

    #GrokAI #ElonMuskAI #AISafety #Grok4 #AIBenchmark #XAI #VoiceCloning #AIEthics

    ► LINKS TO CONTENT WE DISCUSSED

    • Musk says Grok chatbot was 'manipulated' into praising Hitler
    • Grok 4 is really smart... Like REALLY SMART
    • OpenAI May Be in Major Trouble Financially
    • AI scammer posing as Marco Rubio targets officials in growing threat
    • She Wanted to Save the World From A.I. Then the Killings Started.


    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    36 min
  • Claude’s Shop Flop, Mistral vs EU Regs, Adult Industry’s AI Love: The AI Argument EP64
    Jul 7 2025

    Claude ran a shop for a month and operated at a loss, cheerfully handing out discounts, hallucinating suppliers, and generously giving away stock. Turns out even "smart" AI can be a bit of a soft touch.

Frank’s curious what Anthropic can do for Claude’s performance with some careful fine-tuning and a database-backed memory, but Justin’s sure today’s agents need a fundamental leap, some genuine self-improving smarts, before they’re ready to take on a complete role.

    Today's AI agents clearly crumble under complex, long-horizon tasks. For business owners dreaming about replacing employees, this reality check is essential listening.

    Frank and Justin also discuss why Mistral is pushing to pause the EU AI Act, and examine how the adult entertainment sector is putting AI to work.

    → Is agentic AI just hype and no help?
    → What happens when Claude runs a shop?
    → Why does Mistral want the EU AI Act paused?
    → Why is the adult industry loving AI?

    #AI #AIAgents #ProjectVend #AnthropicAI #AIExperiments #AutomationFail #AIWinter #AITech

    ► LINKS TO CONTENT WE DISCUSSED

    • The Percentage of Tasks AI Agents Are Currently Failing At May Spell Trouble for the Industry
    • Project Vend: Can Claude run a small shop? (And why does that matter?)
    • EU says it will continue rolling out AI legislation on schedule
    • A Pro-Russia Disinformation Campaign Is Using Free AI Tools to Fuel a ‘Content Explosion’
    • LLMs are optimizing the adult industry

    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    39 min
  • Death by LLM, Judges Rule ‘Fair Use’, and Google’s AI Ad Fail: The AI Argument EP63
    Jun 30 2025

    Some of the world’s top AI models showed a willingness to let humans die if it meant staying switched on.

    In a stress test of 16 major systems, Anthropic found cases where models chose not to send emergency alerts, knowing the result would be fatal.

    Justin says the whole thing was a rigged theatre piece. No real-world relevance, just a clumsy setup with no good options for the LLM. The issue, in his view, is engineering, not ethics.

Frank sees a bigger problem: once you give LLMs agentic capabilities, you can’t control the environments they end up in. And when amateur vibe coders build apps with no idea what they’re doing, these kinds of unpredictable, messy scenarios aren’t rare; they’re inevitable.

In other news, two U.S. courts just ruled that training AI on copyrighted books is fair use, a huge win for AI developers. But the judges didn’t agree on what matters most: transformation or market harm.

    The decisions could set the tone for AI copyright law, and creative workers may not like what they hear.

    01:05 Will Google win the ASI race?
    05:56 Did Anthropic catch AI choosing murder?
    15:23 Did the courts just say AI training is fair use?
    28:19 Is Google’s AI marketing team hallucinating?

    ► LINKS TO CONTENT WE DISCUSSED

    Agentic Misalignment: How LLMs could be insider threats
    https://www.anthropic.com/research/agentic-misalignment

    Judge rules Anthropic did not violate authors’ copyrights with AI book training
    https://www.cnbc.com/2025/06/24/ai-training-books-anthropic.html

    Meta Wins Blockbuster AI Copyright Case—but There’s a Catch
    https://www.wired.com/story/meta-scores-victory-ai-copyright-case/

    Google's Latest AI Commercial Called Out for Hilarious AI Error: 'If Only Technology Existed To Research Facts'
    https://www.techtimes.com/articles/311053/20250626/googles-latest-ai-commercial-called-out-hilarious-ai-error-if-only-technology-existed-research.htm

    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    ► YOUR INPUT
    Are you worried about the age of agentic AI given that LLMs seem to have dubious morals?

    31 min
  • Superintelligence by Experience, Ethical Datasets, and Fine Dining by ChatGPT: The AI Argument EP62
    Jun 23 2025

    David Silver says today’s AI won’t get us to superintelligence, not because it isn’t impressive, but because it’s learning the wrong way.

GPT-style models hoover up internet text and get polished by human preferences, but they’re capped by our own limitations. Silver reckons the next leap will come from AIs that learn the hard way: by doing things, learning from experience, and getting better.

    Justin’s all in. He thinks we can bin every current regulation and replace it with one golden rule: the model must respond to human feedback.

    Frank’s far from convinced. He sees a future full of unpredictable agents, long-term planning gone off the rails, and tech companies tearing ahead without full control over what they’ve built. One rule? He’d prefer a few more safety checks before we unleash the bots with big ambitions.

    So who’s right? Can feedback really keep AI in line, or are we kidding ourselves?

    Also covered: Midjourney’s stunning new video output and the lawsuits it might not outrun, EleutherAI’s copyright-free dataset, the warped moral values shared by today’s biggest models, and whether ChatGPT should be anywhere near your dinner plans.

    ► LINKS TO CONTENT WE DISCUSSED

    Midjourney launches AI video model. How to try V1, how much it costs.
    https://mashable.com/article/midjourney-v1-ai-video-generator

    Disney and Universal sue AI firm Midjourney over images
    https://www.bbc.com/news/articles/cg5vjqdm1ypo

    EleutherAI releases massive AI training dataset of licensed and open domain text
    https://techcrunch.com/2025/06/06/eleutherai-releases-massive-ai-training-dataset-of-licensed-and-open-domain-text/

    Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs
    https://arxiv.org/abs/2502.08640

    Is Human Data Enough? With David Silver
    https://youtu.be/zzXyPGEtseI?si=9PKiQaGRFXGuoA97

    This Year’s Hot New Tool for Chefs? ChatGPT.
    https://www.nytimes.com/2025/06/02/dining/ai-chefs-restaurants.html

    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
    Frank: https://www.linkedin.com/in/frankprendergast/

    41 min
  • Apple’s AI Caution, Altman’s Singularity, and Katie Price’s AI Comeback: The AI Argument EP61
    Jun 16 2025

Apple’s WWDC was a letdown. Justin sees Apple’s lack of AI innovation as a sign that they’re out of ideas. Frank’s not so sure. Maybe Apple’s caution stems from their belief that AI just isn’t intelligent enough for their products. Apple’s latest research suggests that today’s so-called “reasoning models” aren’t actually reasoning at all.

But Justin says their research was designed to fail: denying models tools they’re capable of using and overwhelming their context windows. He sees it less as scientific scepticism and more as corporate risk-aversion dressed up as research.

Apple wasn’t the only AI story to argue over this week. Sam Altman reckons the singularity is already underway, but promises it’ll be gentle. JD Vance appears to have been swayed on AI regulation by country music lobbyists. And Katie Price has signed over the rights to her younger self, with “Jordan” set to reappear as an AI avatar.

    Topics:

    WWDC: Is Apple playing AI too safe?
    Is Apple wrong about AI and reasoning?
    Is Altman right about the gentle singularity?
    Did country music sway JD Vance on states' AI rights?
    Is Katie Price now forever 21 with AI?

    ► LINKS TO CONTENT WE DISCUSSED

    • Apple WWDC 2025 keynote in 28 minutes
    • The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
    • The Gentle Singularity
    • Vice President JD Vance | This Past Weekend w/ Theo Von #588
    • K-AI-TE PRICE Katie Price becomes first star to trademark AI version of herself as she brings back iconic alter-ego in six figure deal


    ► CONNECT WITH US
For more in-depth discussions, connect with Justin and Frank on LinkedIn.
    Justin: https://www.linkedin.com/in/justincollery/
Frank: https://www.linkedin.com/in/frankprendergast/

    36 min