Domesticating AI

Épisodes

Hacking AI: Why Most AI Systems Are Insecure by Default

Apr 24 2026
Hosts: Miriah Peterson, Matt Sharp, Chris Brousseau
Recorded: April 2026
Status: Released
Most AI systems today are designed to be helpful — not secure.
In this episode, we break down how AI systems actually get exploited in production:
a real supply chain attack on a widely used AI dependency
prompt injection and why it still works
image-based (multimodal) exploits
tool and agent abuse
If you’re building AI — especially at a startup — you are the security team.
A widely used AI dependency was compromised via a malicious .pth file:
executes automatically when Python starts
no import required
targets credentials, SSH keys, and environment variables
👉 Just installing the package was enough.
This highlights a critical reality:
Your AI system is only as secure as your dependencies.
Models cannot distinguish between instructions and data
External content can override system behavior
Still one of the most common AI vulnerabilities
🔗 https://learnprompting.org/docs/prompt_hacking/injection
Hidden instructions embedded in images
AI interprets images differently than humans
Expands the attack surface significantly
🔗 https://arxiv.org/abs/2306.11698
AI systems can take real-world actions via tools
Prompt injection → API calls, data leaks, unintended execution
Agents amplify risk through autonomy and retries
If you’re building AI systems today:
separate instructions from data
limit tool permissions
treat outputs as untrusted
validate everything before execution
AI systems have an internet-sized attack surface
Supply chain attacks bypass all AI safeguards
Prompt injection is a fundamental problem
AI doesn’t fail safely — it fails wherever your system is weakest
LiteLLM incident: https://github.com/BerriAI/litellm/issues/24512
Attack breakdown: https://futuresearch.ai/blog/litellm-pypi-supply-chain-attack/
LLM attack techniques: https://llm-attacks.org/
OWASP LLM Top 10: https://owasp.org/www-project-top-10-for-large-language-model-applications/
Gandalf challenge: https://gandalf.lakera.ai/
We’ve launched a Patreon for Domesticating AI 🎉
Get:
early access to episodes
behind-the-scenes content
bloopers and uncut moments
👉 https://patreon.com/DomesticatingAIPodcast
🎥 YouTube: https://youtu.be/HTTxE7Y1sko
What’s the weirdest way an AI system has broken for you?
Keep your AI on a leash.
Afficher plus Afficher moins
43 min

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Écouter gratuitement
Coding with AI: Vibe Coding vs Real Engineering (with Tyler Folkman)

Apr 10 2026
AI can write code — but that doesn’t mean you should trust it.
In this episode of Domesticating AI, we’re joined by Tyler Folkman (author of The AI Architect) to break down how engineers are actually using AI to build software — and why most people are still just vibe coding.
Vibe coding vs real engineering
Reasoning models vs coding models
How to plan and prompt AI effectively
When to let AI take the wheel (and when not to)
Local vs cloud coding agents
Token costs vs owning hardware
Tyler Folkman — The AI Architect
Anthropic
https://www.anthropic.com
OpenAI
https://openai.com
Ollama
https://ollama.com
MiniMax-M2.5
https://ollama.com/library/minimax-m2.5
GLM-5
https://ollama.com/library/glm-5
AmpCode Chronicle
https://ampcode.com/chronicle
Andrej Karpathy on Context Engineering
https://x.com/karpathy
“Human in the Loop is Tired”
(add link if you have it)
Domesticating AI is a bi-weekly podcast about practical AI for developers.
We help you brace the feral open-source AI landscape — so you can tame it instead of getting dragged by it.
contact@domesticatingai.com
Spotify
https://open.spotify.com/show/2WsAR4fvcXzp3vVZGVlkE2
Apple Podcasts
https://podcasts.apple.com/us/podcast/domesticating-ai/id1873338950
Are you vibe coding — or engineering with AI?
Let us know your setup.
Keep your AI on a leash.
🧠 What We Cover🔗 Links & ResourcesGuestModels & ToolsArticles / Mentions🎧 About the Podcast📬 Contact🔥 Follow👇 Join the Discussion
Afficher plus Afficher moins
40 min

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Écouter gratuitement
Securing Your Homelab: AI Infrastructure, Access Control & Why Docker Isn’t Isolation

Mar 27 2026

Recording Date: February 27, 2026Hosts: Miriah Peterson, Matt Sharp, Chris BrousseauRunning AI locally is easier than ever.Running it securely is another story.In this episode of Domesticating AI, we break down the moment every homelab builder hits:The second you move from one machine to two machines…access becomes your first real engineering problem.We explore the real architecture questions behind self-hosting AI:Why a dedicated machine isn’t a sandboxWhy Docker alone isn’t isolationHow homelabs evolve from Plex servers to AI infrastructureThe blast radius problem with local agentsWhy networking and access control matter more than model sizeWe also discuss the surge in local AI hardware demand and the risks of running powerful agents on machines with unrestricted access.Whether you're running OpenClaw, Ollama, a NAS, Postgres, or a home automation stack, the same rule applies:Infrastructure without containment is just risk waiting to happen.High-memory Mac Minis are seeing long shipping delays as developers rush to build local AI systems.https://www.tomshardware.com/tech-industry/artificial-intelligence/openclaw-fueled-ordering-frenzy-creates-apple-mac-shortage-delivery-for-high-unified-memory-units-now-ranges-from-6-days-to-6-weeksMarketplace plugins and execution boundaries are becoming a growing security concern in agent systems.https://www.linkedin.com/posts/matthewsharp_i-use-to-do-nothing-but-post-about-clean-activity-7432832983339999232-iR04Overview of risks around agent plugin ecosystems and execution boundaries.https://conscia.com/blog/the-openclaw-security-crisis/Private mesh networking used to securely access homelabs.https://tailscale.comLocal AI coding agent framework.https://openclaw.aiLocal LLM runtime used for running models on personal machines.https://ollama.comWhy people actually build homelabsPlex, NAS, and home automation as infrastructure entry pointsAI workloads vs dev workloadsWhy long-running services shouldn’t live on your laptopNetworking architecture for homelabsRBAC-style access control between machinesSecrets management mistakes developers makeContainment and blast-radius thinking for AI agentsTailscale and private mesh networkingEach host answers:If I had $0What I would runWhat I would avoidIf I had $1KWhat machine I’d buyHow I’d isolate workloadsIf I had $5KHow I’d segment infrastructureWhat monitoring I’d deployWhat I would never expose to the internetStaff Data Engineer, content creator, and founder of SoyPete Tech.Miriah focuses on practical AI systems, Go infrastructure, and self-hosted AI engineering.She is also a Google Developer Expert in Go and organizer of Go West Conf.https://soypete.techAI engineer and co-author of LLMs in Production.Matt focuses on applied AI systems, local model infrastructure, and developer-focused AI tooling.Software engineer and AI practitioner focused on practical applications of machine learning and developer infrastructure.Domesticating AI is supported by the SoyPete Tech community.If you enjoy the show:Subscribe on YouTubeFollow on SpotifyJoin the Discord communityShare the episode with another engineer building with AIMore content and tutorials:https://soypetech.substack.com📰 News DiscussedMac Mini Shortages from Local AI DemandOpenClaw Security DiscussionOpenClaw Security Concerns (Referenced)🧰 Tools & Technologies MentionedTailscaleOpenClawOllama🏗 Topics Covered⚡ Lightning Round🎙 HostsMiriah PetersonMatt SharpChris Brousseau🤝 Sponsors
Afficher plus Afficher moins

30 min

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Écouter gratuitement
Agents Don’t Need More Compute — They Need Better Engineering

Mar 13 2026
📅 Recorded: February 6, 2026
In this episode of Domesticating AI, we discuss why scaling AI systems with more compute often hides weak engineering decisions — especially in agent workflows. We explore constrained hardware, context management, tool calling, logit manipulation, and why small models can make you a better AI engineer.
Moltbot / Clawdbot overview (The Verge)
https://www.theverge.com/report/869004/moltbot-clawdbot-local-ai-agent
Fake Moltbot VS Code extension spreading malware
https://thehackernews.com/2026/01/fake-moltbot-ai-coding-assistant-on-vs.html
Exposed Moltbot admin panels and credential leaks
https://www.bitdefender.com/en-us/blog/hotforsecurity/moltbot-security-alert-exposed-clawdbot-control-panels-risk-credential-leaks-and-account-takeovers
Cloudflare Moltworker (self-hosted agent on Workers)
https://blog.cloudflare.com/moltworker-self-hosted-ai-agent/
LangGraph – https://www.langchain.com/langgraph
LangChain – https://www.langchain.com/
Langfuse – https://langfuse.com/
Pydantic AI – https://github.com/pydantic/pydantic-ai
Instructor – https://github.com/jxnl/instructor
Hugging Face SmolAgents – https://huggingface.co/blog/smolagents
Have a topic suggestion or want to sponsor the show?
📩 contact@domesticatingai.com
Afficher plus Afficher moins
Indisponible

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast
Hardware-First Home AI: Chips, Memory, Backends, and What to Buy

Feb 27 2026

Episode 3 is a hardware-first guide to running AI at home. We break down what CPUs vs GPUs vs NPUs vs TPUs actually do in the inference pipeline, why memory capacity isn’t the same as performance (model loading, KV cache, and MoE), why backends/runtimes are real constraints (CUDA vs ROCm vs Metal/MLX vs CPU), and how to scale from one box to multi-GPU and multi-machine setups.

Keep your AI on a leash.

Links mentioned:
- GPU Glossary (Modal): https://modal.com/gpu-glossary
- CUDA → ROCm headline: https://wccftech.com/the-claude-code-has-managed-to-port-nvidia-cuda-backend-to-rocm-in-just-30-minutes/
- Unsloth PR: https://github.com/unslothai/unsloth/pull/3856

Afficher plus Afficher moins

33 min

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Écouter gratuitement
From “Inference Box” to Dev Rig: What NVIDIA DGX Spark Actually Is | Ep 2

Feb 13 2026

Everyone keeps calling NVIDIA DGX Spark an “inference box”… but in practice it behaves more like a dev rig.In Ep 2 of Domesticating AI, we break down what Spark is actually good for (AI development + fine-tuning) vs what it isn’t (a magical drop-in inference server). We also dig into why unified memory changes the local-AI experience, the “gateway stack” (Ollama + Open WebUI), when you outgrow turnkey UIs, and how homelab economics + networking decisions shape what you should run at home.In this episodeTraining vs inference (and why “inference server” gets misused)Unified memory: what it changes for model loading + workflowsOllama + Open WebUI as the fastest on-ramp for local AIFine-tuning workflows (QLoRA/Unsloth-style) and where Spark shinesHomelab reality: Docker “recipes,” troubleshooting, and collaborationSafer remote access: TailscaleCloud vs home economics (when cloud is cheaper… and when it explodes)NVIDIA / DGX SparkDGX Spark: https://www.nvidia.com/en-us/products/workstations/dgx-spark/Build hub / recipes: https://build.nvidia.com/sparkNIM on Spark playbook: https://build.nvidia.com/spark/nim-llmLocal AI runners + UIsOllama: https://ollama.com/Open WebUI (GitHub): https://github.com/open-webui/open-webuiOpen WebUI docs: https://docs.openwebui.com/llama.cpp: https://github.com/ggml-org/llama.cppLM Studio: https://lmstudio.ai/vLLM: https://github.com/vllm-project/vllmJan: https://jan.ai/Fine-tuning + workflowsUnsloth: https://github.com/unslothai/unslothImage generation tools (mentioned)ComfyUI: https://github.com/Comfy-Org/ComfyUIAUTOMATIC1111 SD WebUI: https://github.com/AUTOMATIC1111/stable-diffusion-webuiNetworking / Remote accessTailscale: https://tailscale.com/Cloud GPU alternatives (mentioned)Runpod pricing: https://www.runpod.io/pricingModal pricing: https://modal.com/pricingMiriah Peterson (Host): Miriah Peterson is a software engineer, Go educator, and community builder focused on production-first AI—treating LLM systems like real software with real users. She runs SoyPete Tech (streams + writing + open-source projects) and stays active in the Utah dev community through meetups and events, with a practical focus on shipping local and cloud AI systems.Connect:SoyPete Tech (YouTube): https://www.youtube.com/@SoyPete_TechSoyPete Tech (Substack): https://soypetetech.substack.com/LinkedIn: https://www.linkedin.com/in/miriah-peterson-35649b5b/Matt Sharp (Host): Matt Sharp is an AI Engineer and Strategist for a tech consulting firm and co-author of LLMs in Production. He’s a recovering data scientist and MLOps expert with 10+ years of experience operationalizing ML systems in production. Matt also teaches a graduate-level MLOps-in-production course at Utah State University as an adjunct professor. You can find him on Substack (Data Pioneer), LinkedIn, and on his other podcast, the Learning Curve.Connect:Data Pioneer (Substack): https://thedatapioneer.substack.com/Chris Brousseau (Host): Chris Brousseau is a linguist by training and an NLP practitioner by trade, with a career spanning linguistically informed NLP, modern LLM systems, and MLOps practices. He’s co-author of LLMs in Production and is currently VP of AI at VEOX. You can find him as IMJONEZZ (two Z’s) on YouTube, GitHub, and on LinkedIn.Connect:YouTube (IMJONEZZ): https://www.youtube.com/channel/UCPtkaw_x97yP4WevW7axk0gLinkedIn: https://www.linkedin.com/in/chris-brousseau/en📘 LLMs in Production (Matt Sharp & Chris Brousseau): https://www.manning.com/books/llms-in-productionLinks & ResourcesHosts
Afficher plus Afficher moins

43 min

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Écouter gratuitement
Your First AI at Home

Jan 30 2026

Domesticating AI — S01E01: Your First AI at HomeHosts: Miriah Peterson, Matt Sharp, Chris BrousseauThis episode is your practical on-ramp to running AI at home: why inference engines matter, what to install first, and how to make “local AI” feel stable instead of fragile. The hosts start with a hardware + market reality check (tinygrad’s tinybox-style “AI server appliance” idea and the ongoing memory/RAM crunch), then break down what an inference engine actually does, how popular runtimes compare (llama.cpp, vLLM, Ollama, TGI), and a sane starter workflow for getting from “downloaded a model” to “usable local AI.”Inference engines are the “runtime”: model loading, tokenization, KV cache/context handling, and the serving layer.Pick your engine based on your goal: tinkering (llama.cpp) vs serving throughput (vLLM/TGI) vs it-just-works packaging (Ollama).You don’t need a brand-new rig to start, but RAM/VRAM constraints will shape everything.Use leaderboards as a hint, then validate with your own small eval prompts that match your workload.If you’re exposing anything beyond your LAN: reverse proxy + TLS + don’t casually open ports.0:00 Intro + host chaos + what the show is1:08 News: tinygrad / “AI server appliance” thinking (tinybox vibes)2:44 News: RAM prices + the memory crunch for builders8:26 Main: building your first AI at home (why now)8:49 What is an inference engine?12:30 Engines compared: llama.cpp vs vLLM vs Ollama vs TGI15:42 Do you need to buy a new computer? (CPU vs GPU realities)25:32 Models for home: fit-to-hardware, quantization, context34:37 Leaderboards vs evals: picking models you can trust44:00 Community + meetups + where to follow45:22 Outro — “Keep your AI on a leash”News / contextTom’s Hardware: TinyBox production + multi-GPU appliance concept (Tom's Hardware)Reuters: AI-driven memory shortage / supply-chain crunch (Reuters)IDC: 2026 device impacts from the memory shortage (IDC)Inference enginesllama.cpp (GGML org) (GitHub)vLLM OpenAI-compatible server (docs.vllm.ai)Ollama docs (quickstart) (Ollama Documentation)Hugging Face Text Generation Inference (TGI) (GitHub)Miriah Peterson: Software engineer, Go educator, and community builder focused on production-first AI. Runs SoyPete Tech (streams + writing + open-source).Matt Sharp: AI Engineer/Strategist, co-author of LLMs in Production, MLOps practitioner. Writes The Data Pioneer. (thedatapioneer.substack.com)Chris Brousseau: NLP practitioner, co-author of LLMs in Production, VP of AI at VEOX. You can find him as IMJONEZZ. (veox.ai)SoyPete Tech (YouTube): (youtube.com)SoyPete Tech (Substack): (soypetetech.substack.com)Matt’s Substack (The Data Pioneer): (thedatapioneer.substack.com)Chris on YouTube (IMJONEZZ): (youtube.com)LLMs in Production (book): (Manning Publications)
Afficher plus Afficher moins

42 min

Impossible d'ajouter des articles

Désolé, nous ne sommes pas en mesure d'ajouter l'article car votre panier est déjà plein.

Veuillez réessayer plus tard

Veuillez réessayer plus tard

Échec de l’élimination de la liste d'envies.

Veuillez réessayer plus tard

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Écouter gratuitement

Épisodes

Hacking AI: Why Most AI Systems Are Insecure by Default

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Coding with AI: Vibe Coding vs Real Engineering (with Tyler Folkman)

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Securing Your Homelab: AI Infrastructure, Access Control & Why Docker Isn’t Isolation

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Agents Don’t Need More Compute — They Need Better Engineering

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Hardware-First Home AI: Chips, Memory, Backends, and What to Buy

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

From “Inference Box” to Dev Rig: What NVIDIA DGX Spark Actually Is | Ep 2

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast

Your First AI at Home

Impossible d'ajouter des articles

Échec de l’élimination de la liste d'envies.

Impossible de suivre le podcast

Impossible de ne plus suivre le podcast