NPM install is stealing your passwords – I built a tool to catch it

29 points by ComCat 13 hours ago · 17 comments

ComCatOP 13 hours ago

I spent months studying how malicious npm packages actually work. Most of them do the same thing: run a preinstall script, read your .env and credentials, and send them to a remote server, all before your app starts.

npm install will run this code automatically. No prompt, no warning.
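
For illustration only (this is not Dependency Guardian's code): a minimal Python sketch that lists the install-time hooks a package declares in its package.json. `preinstall`, `install`, and `postinstall` are npm's documented lifecycle scripts; the helper name `find_install_scripts` and the example manifest are made up.

```python
import json

# Lifecycle hooks that npm runs automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def find_install_scripts(package_json_text: str) -> dict[str, str]:
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


# Hypothetical manifest, invented for this example.
example = json.dumps({
    "name": "some-package",
    "version": "1.0.0",
    "scripts": {"preinstall": "node collect.js", "test": "jest"},
})

print(find_install_scripts(example))  # {'preinstall': 'node collect.js'}
```

A non-empty result is not proof of malice, since plenty of legitimate packages compile native code at install time. It only shows which commands would run without a prompt. Running `npm install --ignore-scripts` skips these hooks entirely.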

I built Dependency Guardian, a behavioral analysis engine that scans packages for malicious patterns before they touch your system.

It has:
- 26 detectors (shell execution, credential theft, exfiltration, obfuscation, time bombs)
- 53 cross-signal amplifiers that correlate findings across detectors
- ~2,900 tests across 76 test files
- Benchmarked against 11,356 real packages at 99.95% precision

It would have caught Shai-Hulud, the Chalk/Debug hijack, and the S1ngularity campaign.

Snyk, Dependabot, and npm audit all missed these because they rely on CVE databases. If there's no CVE filed yet, they're blind. Dependency Guardian reads the actual code.

Curious whether anyone here has been hit by a supply chain attack and how you handled it.

john01dav 8 hours ago

Once this or something like it becomes widespread, won't sophisticated attackers simply test their attacks against this? So, for example, if it checks for `rm` invocations, just implement the functionality of `rm` in the malware, or if it checks for exfiltration of data, shell out to curl to do that in a different process.
If you think of making it so robust that this is impossible, you're just describing a container, which we already have.
- groundzeros2015 2 hours ago
  
  This. The real problems are arbitrary pre-install scripts and a culture of not knowing what’s in the dependency tree
- ComCatOP 2 hours ago
  
  You're describing two different things.
  The container comparison misses where these attacks actually happen. Containers limit what code can do at runtime. We flag what code intends to do before it ever runs. These are complementary. A container won't stop a postinstall script from reading ~/.ssh/id_rsa and posting it to an attacker's server if your CI environment has network access and a mounted home directory — which most do.
  Yes, sophisticated attackers adapt. But the current state of npm supply chain attacks is that most don't even try to evade, because nobody's looking at the code. Every major attack in 2025 used the same playbook: credential theft + network exfil + install script abuse. Raising the floor from "zero analysis" to "26 behavioral detectors with cross-signal correlation" eliminates the entire class of low-effort attacks and forces the rest into increasingly constrained patterns.
- cyanydeez 4 hours ago
  
  No, sophisticated attackers will make their own toolchain to catch all the other attackers, _Except themselves_.
sandreas 9 hours ago

Thanks for sharing.
I still wonder why this is not an official npm / node effort to better secure the ecosystem...

hannob 8 hours ago

Well...

There's a long history of people trying to build software that detects bad software. It's known as Antivirus software. It doesn't work very well, because you're up against fundamental truths of computational theory (the halting problem).

ComCatOP 3 hours ago

That framing is too broad for what npm supply chain attacks actually look like.
Antivirus deals with arbitrary binaries on a general purpose OS. npm attacks are much more constrained. The code has to run during install or import, steal credentials, send them over the network, and hide inside a package that claims to do something ordinary. That narrows the space.
I am not solving “is this code malicious?” in the abstract. I am checking concrete violations of behavioral invariants. A CSS library importing child_process. A utility suddenly adding obfuscated network calls in a patch release. A package reading .ssh keys during postinstall. Those patterns are not theoretical edge cases. They are how real attacks work.
No, you cannot catch everything. But every major npm supply chain incident in 2025 used the same playbook: install script abuse, credential theft, network exfiltration. That is highly detectable. The goal is not perfection. It is raising the cost of attack in a space where most attackers are currently not even trying to evade detection.
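
To make the "CSS library importing child_process" invariant concrete, here is a rough Python sketch. It is an assumption-laden toy, not the tool's actual detector: the category labels in `NO_SHELL_CATEGORIES` and both function names are hypothetical, and a plain regex like this is exactly the kind of check that string concatenation or aliasing can evade.

```python
import re

_Q = "['\"]"  # either quote character
_MOD = rf"{_Q}(?:node:)?child_process{_Q}"

# Matches require("child_process"), import("child_process") and
# `... from "child_process"`, with or without the node: prefix.
CHILD_PROCESS_IMPORT = re.compile(
    rf"(?:require|import)\s*\(\s*{_MOD}\s*\)|from\s+{_MOD}"
)

# Hypothetical labels for packages that have no obvious reason to spawn processes.
NO_SHELL_CATEGORIES = {"css", "color", "string-utils"}


def imports_child_process(js_source: str) -> bool:
    """True if the JavaScript source text pulls in child_process."""
    return CHILD_PROCESS_IMPORT.search(js_source) is not None


def violates_invariant(category: str, js_source: str) -> bool:
    """Flag a package whose declared category should never need a shell."""
    return category in NO_SHELL_CATEGORIES and imports_child_process(js_source)


print(violates_invariant("css", 'const cp = require("child_process");'))         # True
print(violates_invariant("css", "import { exec } from 'node:child_process';"))   # True
print(violates_invariant("build-tool", 'const cp = require("child_process");'))  # False
```

The point of pairing the import check with a category is that the same import is unremarkable in a build tool and suspicious in a stylesheet helper.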

bpavuk 8 hours ago

this is actually an interesting idea to re-implement! imagine a JS runtime with hooks all over the place. these hooks look for `chmod`, `rm -r ~`/`rm -rf /` and such, intercept network requests, and scan variables for known API key patterns, e.g. `sk_****`.
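
A rough sketch of just the "scan variables for key patterns" part, in Python for brevity (a real hook would live inside the JS runtime). The `sk_` regex is a hypothetical stand-in, not any provider's documented key format, and `find_secret_like_values` is an invented name.

```python
import re

# Hypothetical pattern: "sk_" followed by at least 16 token characters.
# Real providers document their own key formats; this one is illustrative.
SECRET_PATTERN = re.compile(r"\bsk_[A-Za-z0-9_]{16,}\b")


def find_secret_like_values(variables: dict[str, object]) -> list[str]:
    """Return the names of variables whose string value looks like an API key."""
    return [
        name
        for name, value in variables.items()
        if isinstance(value, str) and SECRET_PATTERN.search(value)
    ]


# Made-up snapshot of variables a runtime hook might observe.
snapshot = {
    "greeting": "hello",
    "apiKey": "sk_a1B2c3D4e5F6g7H8",
    "retries": 3,
}

print(find_secret_like_values(snapshot))  # ['apiKey']
```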

contrahax 8 hours ago

This is called dynamic analysis!
- bpavuk 6 hours ago
  
  I know, but I'm saying that this specific implementation of JS dynamic analysis would be interesting, especially given that there are crates such as `deno_core`

wozoot 8 hours ago

This seems very nice! But is there a way to use it without a Google account?

ComCatOP 3 hours ago

Yes, we’re working on adding non-Google auth. API key + email/password support is coming soon.
cgsmith 7 hours ago

100% should have an alternative

cxcorp 9 hours ago

How is it different from the established player in the game, Socket.dev?

ComCatOP 3 hours ago

Socket and I are solving the same problem, behavioral analysis of npm packages before install, but with different approaches.
Socket uses static analysis plus LLM-based threat assessment. Dependency Guardian is fully deterministic: 26 regex- and AST-based detectors plus a correlator with 53 cross-signal amplifiers. No LLM in the loop. Scans are reproducible, run in ~38ms, and avoid hallucination or prompt injection issues. The tradeoff is I may miss novel patterns an LLM could generalize to.
Socket had to introduce three alert tiers because of noise. I handle that at the detection layer by correlating signals like ci_secret_access plus network_exfil into higher-confidence amplifiers, which lets me hard-block PRs at 99.95% precision across 11,356 real packages.
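
As a toy illustration of that correlation idea (not the actual correlator): only the signal names `ci_secret_access` and `network_exfil` come from the paragraph above. The amplifier names, the other signals, the scores, and the threshold in this Python sketch are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amplifier:
    name: str
    requires: frozenset[str]  # every one of these signals must be present
    score: int                # hypothetical weight


# Hypothetical rules; only the first rule's signal names appear in the thread.
AMPLIFIERS = [
    Amplifier("secret_exfil", frozenset({"ci_secret_access", "network_exfil"}), 90),
    Amplifier("install_time_shell", frozenset({"install_script", "shell_exec"}), 60),
]

BLOCK_THRESHOLD = 80  # hypothetical cut-off for hard-blocking a PR


def correlate(signals: set[str]) -> list[Amplifier]:
    """Return every amplifier whose required signals all fired."""
    return [amp for amp in AMPLIFIERS if amp.requires <= signals]


def should_block(signals: set[str]) -> bool:
    """Block only when a correlated finding is strong enough on its own."""
    return any(amp.score >= BLOCK_THRESHOLD for amp in correlate(signals))


print(should_block({"ci_secret_access", "network_exfil"}))  # True
print(should_block({"network_exfil"}))                      # False
print(should_block({"install_script", "shell_exec"}))       # False
```

The design choice this shows is that a single noisy signal (a network call, say) never blocks by itself; only a combination that matches a known attack shape does.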
Shai-Hulud exploited Bun runtime APIs and legitimate GitHub API traffic to evade Node-focused scanners. I built dedicated detectors for those gaps, normalize string escapes before matching, and track import aliases per file.
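
For readers wondering what normalizing string escapes buys you, a small Python sketch under stated assumptions: it decodes only JavaScript's `\xHH`, `\uHHHH`, and `\u{...}` escapes in source text, and `normalize_js_escapes` is an illustrative name, not the tool's API. It does not handle an escaped backslash in front of an escape (`\\x63`), which a real implementation would need to.

```python
import re

# \xHH, \u{H...} and \uHHHH escape sequences as they appear in JS source text.
_ESCAPE = re.compile(
    r"\\x([0-9A-Fa-f]{2})|\\u\{([0-9A-Fa-f]{1,6})\}|\\u([0-9A-Fa-f]{4})"
)


def normalize_js_escapes(source: str) -> str:
    """Replace hex and Unicode escapes with the characters they encode."""
    def decode(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2) or match.group(3)
        code_point = int(hex_digits, 16)
        # Leave out-of-range escapes untouched rather than raising.
        return chr(code_point) if code_point <= 0x10FFFF else match.group(0)

    return _ESCAPE.sub(decode, source)


obfuscated = r'require("\x63hild_\u0070rocess")'
print(normalize_js_escapes(obfuscated))  # require("child_process")
```

Without this step, a detector that searches for the literal text `child_process` would miss the obfuscated form entirely.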
There is a free tier at 200 scans per month, an open source thin client, a self-hosted option, and support for GitHub Actions or any CI via CLI. Socket validated the category and raised $65M. My bet is that a tighter deterministic engine with lower noise wins for teams that want a true CI gate, not just an advisory dashboard.
