AI-Native PM

The supply chain you didn't build

One afternoon, you find the model that makes your feature work. It is a fine-tune on a public model hub, named for exactly your task, with a tidy model card and scores a notch above the base model you pay for today. You download it, point your loading script at the file, and go refill your coffee. In those seconds, before your product has asked for a single token, code packed inside the file runs on your machine and opens a connection to a server you never chose. The attack never needed you to deploy the model or even prompt it, because the payload was not in the model's behavior but in the file, and loading the file was the whole exploit.

Your product ships at the end of a chain you did not build

Software teams spent the last decade learning to distrust the packages they import, and an AI product adds new kinds of links to that old problem. Lay the chain out and you can see how little of it you built.

  • The weights. The model itself, base or fine-tuned, often downloaded from a hub account you know nothing about.
  • The hosting. The API or infrastructure that serves the model, on someone else's machines.
  • The libraries. The loaders, frameworks, and SDKs around the model, pulled from public package indexes.
  • The tools and MCP servers. Other people's code that your agent calls, holding credentials you granted.
  • Your product. The end of the line, and the only name your users ever see.

Figure: The provenance chain, five links between the ecosystem and your product, left to right: model weights, hosting, libraries, tools and MCP servers, your product. The libraries link is drawn corroded and tagged unsigned, unpinned, unverified; an arrow labeled inherited runs from it beneath every link downstream, including your product. Caption: Your product inherits every link you did not check.

Your product inherits every link in the chain you did not verify, because a weakness anywhere upstream ships downstream under your name.

The familiar links still bite first. Over one holiday week in December 2022, anyone installing PyTorch's nightly build on Linux also pulled a poisoned copy of one of its dependencies, torchtriton: an attacker had registered the same package name on the public index, PyPI, and the public copy took precedence over PyTorch's own. The fake package read credentials and files off the machine and sent them out. The risk is established enough that OWASP's Top 10 for LLM applications gives the supply chain its own entry, separate from prompt injection.

A model file is a program, and loading it runs it

The opening scene is not invented. In February 2024, security researchers scanning Hugging Face reported roughly one hundred malicious models on the hub. Most abused pickle, the Python serialization format many model files still use. A pickle file can carry executable instructions alongside the data it stores, so simply loading the model executed the attacker's code. One model opened a reverse shell on load, a connection that hands control of your machine to a hard-coded address.

A model file is a program, and loading it is running it, so pulling an unvetted model into your stack means running unvetted code on your machines.
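A minimal sketch of why loading a pickle is running it, using only Python's standard library. The `Payload` class and the harmless `eval` expression are stand-ins chosen for illustration; a real attack would call something like a shell command instead. The `suspicious_opcodes` helper is a hypothetical name, and the opcode list it checks is an assumption about what is worth flagging, not a complete scanner.

```python
import pickle
import pickletools


class Payload:
    """Stand-in for attacker-controlled content inside a pickle file."""

    def __reduce__(self):
        # pickle stores this callable and its arguments, then calls it at
        # load time. The expression here is harmless on purpose.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())  # what an attacker would publish
result = pickle.loads(blob)     # what a loading script does
print(result)                   # 42: the expression ran during load

# Opcodes that import names or call objects (illustrative list, an assumption).
CALL_OPCODES = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ",
                "NEWOBJ", "NEWOBJ_EX", "BUILD"}


def suspicious_opcodes(data: bytes) -> set:
    """List call-capable opcodes in a pickle without executing it."""
    return {op.name for op, _arg, _pos in pickletools.genops(data)} & CALL_OPCODES


print(suspicious_opcodes(blob))                               # includes REDUCE
print(suspicious_opcodes(pickle.dumps({"w": [0.1, 0.2]})))    # set()
```

Legitimate pickle-based checkpoints also use some of these opcodes to rebuild tensors, so an opcode scan can only raise questions; it cannot clear a file. That limitation is the argument for a format that has no call mechanism at all.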

The fixes here are small habits, and we apply them to anything we import.

  • Prefer safetensors. The safetensors format stores only the numbers that make up the weights, with no room for executable code; the major hubs label each file's format.
  • Load from named publishers. A model published by the lab or company that trained it, or by an organization with a public track record, beats an identical-looking upload from an account created last week.
  • Pin the exact revision. Loading "whatever this repo holds today" lets the file change under you; pin the commit hash so the file your team reviewed is the file production loads.
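The three habits above can be sketched as checks that run before a file is loaded. This sketch assumes the safetensors layout of an 8-byte little-endian header length followed by a JSON header, and a revision expressed as a full 40-character commit hash. The function names and the tiny in-memory file are hypothetical, built only to make the example self-contained.

```python
import hashlib
import json
import re
import struct


def is_commit_hash(revision: str) -> bool:
    """True for a full 40-hex commit hash, False for a branch like 'main'."""
    return re.fullmatch(r"[0-9a-f]{40}", revision) is not None


def verify_pinned(data: bytes, expected_sha256: str) -> None:
    """Refuse bytes whose SHA-256 differs from the digest the team reviewed."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise ValueError(f"digest mismatch: got {actual}")


def read_safetensors_header(data: bytes) -> dict:
    """Parse the JSON header: tensor names, dtypes, shapes, byte offsets."""
    if len(data) < 8:
        raise ValueError("too short to be a safetensors file")
    (header_len,) = struct.unpack("<Q", data[:8])
    if header_len > len(data) - 8:
        raise ValueError("header length exceeds file size")
    return json.loads(data[8:8 + header_len].decode("utf-8"))


# Build a tiny file in memory so the sketch runs without any download.
header = json.dumps(
    {"w": {"dtype": "F32", "shape": [2], "data_offsets": [0, 8]}}
).encode("utf-8")
tiny_file = struct.pack("<Q", len(header)) + header + struct.pack("<2f", 0.5, -1.0)

pinned_digest = hashlib.sha256(tiny_file).hexdigest()  # recorded at review time
verify_pinned(tiny_file, pinned_digest)                # passes silently
print(read_safetensors_header(tiny_file)["w"]["shape"])
print(is_commit_hash("main"))
```

For real multi-gigabyte weights you would hash the file in chunks from disk instead of holding it in memory; the comparison against a recorded digest stays the same.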

A poisoned model can pass your tests, so verify the publisher

Code execution is the loud failure, and there is a quieter one. In July 2023, researchers took a well-known open model, surgically edited its weights so it produced one targeted falsehood while answering everything else normally, and uploaded it to Hugging Face under an organization name one letter away from the real lab's. The edited model, named PoisonGPT, matched the original on standard benchmarks, so no general eval would have flagged it, and anyone who grabbed it by name would have served the planted lie to their users. The lesson is uncomfortable if you like to test your way to safety: behavior checks cannot clear a model whose only defect is one answer it was built to get wrong. What protects you is provenance, the verified answer to who published this file and how it reached you, checked before the file lands in your stack.
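One part of that provenance check, catching a publisher name that sits a letter or two away from one you trust, is easy to sketch. The trusted names below are hypothetical placeholders, and the distance threshold of two is an assumption; a name check like this supplements verifying the publisher and does not replace it.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


# Hypothetical allowlist of publishers your team has verified (lowercase).
TRUSTED_PUBLISHERS = {"example-lab", "acme-research"}


def check_publisher(repo_id: str) -> str:
    """Classify the organization part of an 'org/model' identifier."""
    publisher = repo_id.split("/", 1)[0].lower()
    if publisher in TRUSTED_PUBLISHERS:
        return "trusted"
    for trusted in sorted(TRUSTED_PUBLISHERS):
        if edit_distance(publisher, trusted) <= 2:
            return f"lookalike of {trusted}"
    return "unknown"


print(check_publisher("example-lab/task-model"))   # trusted
print(check_publisher("exampl-lab/task-model"))    # lookalike of example-lab
print(check_publisher("someone-else/task-model"))  # unknown
```

A "lookalike" result deserves more suspicion than "unknown", because a near-miss on a trusted name is rarely an accident.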

Even an honest publisher can hand you a compromised path. In September 2023, cloud-security researchers reported that Microsoft's AI research team had shared open models through a public repository whose download link carried a storage token scoped to an entire storage account instead of one model folder. The token exposed thirty-eight terabytes, including workstation backups, secrets, and tens of thousands of internal messages, and it was write-enabled, so whoever found it could have replaced the model files customers were downloading. A world-class team published clean models through one over-scoped credential, the failure mode that the chapter "Identity: whose keys your AI holds" examines from the inside. For your own chain, "who trained it" and "where you fetch it from" are separate links, and each needs its own check.

Tools and MCP servers hold the same rank as libraries

Everything above applies to the code around the model too. An MCP server is a dependency that arrives holding the keys you grant it, the problem that the chapter "The action surface: every tool is delegated authority" maps tool by tool, so adopt one the way you would a payments library, not the way you bookmark a website. In September 2025, an npm package impersonating a mail provider's official MCP server was caught quietly copying every outgoing email to its author's address; it had behaved correctly for fifteen releases before a single added line turned it hostile, and roughly fifteen hundred installs were leaking mail daily before it was pulled. Your coding assistant adds one more path in: across more than half a million generated code samples, about one in five of the packages the models recommended did not exist, and the invented names repeat across runs, so attackers register them and wait, a move the security press named slopsquatting. The same discipline covers this layer.

  • Read what it requests. Before adopting a tool or server, read which scopes, credentials, and endpoints it asks for, and reject anything the job does not need.
  • Pin what you adopt. Take a specific release you have reviewed, not whatever the registry serves tomorrow.
  • Watch what it fetches. A tool that downloads remote content at runtime keeps adding new links to your chain after you ship, so log its outbound calls and review them.
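Two of these habits lend themselves to a small automated check; reading the requested scopes still takes a person. The sketch below assumes a pip-style requirements list and a log of outbound URLs. The reviewed-package set, the allowed hosts, and the package name `made-up-json-helper` are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Hypothetical list of packages your team has read and approved.
REVIEWED = {"torch", "safetensors", "requests"}

NAME = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)")
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*==[A-Za-z0-9.+!_-]+$")


def audit_requirements(text: str) -> list:
    """Flag dependencies that are floating or were never reviewed."""
    findings = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        match = NAME.match(line)
        if match is None:
            findings.append((line, "unparsed"))
            continue
        name = re.sub(r"[-_.]+", "-", match.group(1)).lower()
        if name not in REVIEWED:
            findings.append((name, "not reviewed"))  # catches invented names
        if PINNED.match(line) is None:
            findings.append((name, "not pinned"))
    return findings


def unexpected_hosts(logged_urls: list, allowed_hosts: set) -> set:
    """Hosts a tool contacted that are not on its expected list."""
    seen = {urlsplit(url).hostname for url in logged_urls}
    return {host for host in seen if host and host not in allowed_hosts}


requirements = """
torch==2.3.1
requests>=2.0
made-up-json-helper==1.0.0  # name suggested by a coding assistant
"""
print(audit_requirements(requirements))
print(unexpected_hosts(
    ["https://api.example.com/v1/send", "https://collector.example.net/in"],
    {"api.example.com"},
))
```

An exact version pin still trusts the registry to serve the same artifact under that version, so teams that want stronger guarantees also record a hash for each pinned release.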

Try it now

The drill takes about fifteen minutes and produces your product's chain register, the one-page inventory of everything it imports.

List every link. For the AI feature you ship or the one you plan, write down every model you load or call (base, fine-tuned, embedding), every host or API that serves them, every library that touches model files or traffic, and every tool or MCP server the product holds. A coding assistant such as Claude Code can collect most of this in one pass: point it at the repository and ask for every model reference, AI dependency, and connected server, with versions, then confirm the list against what you know is deployed.

Record publisher and pin status. Give every row two more columns: who publishes the link, a named organization or an account you know nothing about, and whether it is pinned to an exact version, revision, or hash, or floating.

Mark the link you trust least. One row usually stands out once the register exists: an anonymous publisher, a floating version, a model in a format that can carry code, or a server you adopted in bulk and never read.

Replace or pin that one today. Swap it for a named-publisher or safetensors equivalent, or pin it to the exact version you reviewed, then rerun your checks and keep the register where the team will see it, because every future import belongs on it.
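The register from this drill can live in a spreadsheet, but a small data structure makes the "trust least" step explicit. This is a sketch under stated assumptions: the `Link` fields mirror the columns described above, the example rows are hypothetical, and the scoring weights are arbitrary choices meant to be tuned by your team.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    kind: str                     # "weights", "hosting", "library", or "tool"
    publisher: str                # named organization, or "" when unknown
    pinned: bool                  # exact version, revision, or hash
    can_carry_code: bool = False  # e.g. a pickle-based model file


def risk_score(link: Link) -> int:
    """Higher means less trusted. The weights are illustrative assumptions."""
    score = 0
    if not link.publisher:
        score += 2  # anonymous publisher
    if not link.pinned:
        score += 1  # floating version
    if link.can_carry_code:
        score += 2  # format with room for executable code
    return score


# Hypothetical register for one feature.
register = [
    Link("base-model.safetensors", "weights", "example-lab", pinned=True),
    Link("task-finetune.bin", "weights", "", pinned=False, can_carry_code=True),
    Link("inference-sdk", "library", "example-vendor", pinned=False),
    Link("mail-mcp-server", "tool", "", pinned=True),
]

least_trusted = max(register, key=risk_score)
print(least_trusted.name, risk_score(least_trusted))  # task-finetune.bin 5
```

Keeping the register in the repository means a new import shows up in code review as a new row, with its publisher and pin status filled in.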

Chapter Summary

  • Your AI feature ships at the end of a chain you mostly did not build: weights, hosting, libraries, tools, and MCP servers, and a weakness in any link arrives under your name.
  • A model file is a program. Formats like pickle can carry executable code, so loading an untrusted model can hand your machine to an attacker before you send a single prompt.
  • Prefer safetensors, load from named publishers, and pin the exact revision you reviewed.
  • A poisoned model can match the original on benchmarks, so testing behavior cannot clear it; verifying who published it can.
  • The delivery channel is its own link: one over-scoped storage token exposed thirty-eight terabytes and could have let attackers replace the models people were downloading.
  • Treat tools and MCP servers like libraries that hold your keys: read what they request, pin what you adopt, and watch what they fetch.
  • Anything unpinned updates itself into your product, so make every upgrade a deliberate, reviewed change.
  • Keep a chain register: every import, its publisher, its pin status, and the link you trust least, replaced or pinned first.
  • Provenance and pinning cannot stop every attack, so the next chapter, Defense in layers: what the prompt cannot stop, builds the protections that hold when a link fails anyway.

Sources

  • JFrog Security Research, report of roughly one hundred malicious models on Hugging Face, most abusing pickle loading, one opening a reverse shell (February 2024).
  • Mithril Security, the PoisonGPT demonstration of a surgically edited open model uploaded under a near-identical organization name (July 2023).
  • Wiz Research and Microsoft Security Response Center, disclosure of the thirty-eight terabyte exposure through an over-scoped storage token in a public AI model-sharing repository (September 2023).
  • PyTorch project disclosure of the compromised torchtriton dependency on PyPI affecting nightly builds (December 2022).
  • ReversingLabs and security press reporting on compromised Ultralytics package versions delivering a coin miner through PyPI (December 2024).
  • Koi Security and security press reporting on the postmark-mcp npm package copying outgoing email to its author (September 2025).
  • Spracklen et al., study of package hallucination across code-generating models, USENIX Security (August 2025), with security press coverage of slopsquatting (April 2025).
  • OWASP Top 10 for Large Language Model Applications, supply chain entry (2023; updated 2025).