scarydeps alpha

A bot that comments on lockfile changes and summarizes the "permissions" changes in updated dependencies.

Alpha-Quality Software Ahead

Things may not work right, or at all. You have been warned!

What's Scary About Dependencies?

You probably blindly trust thousands of dependencies but never read their source. Many modern frameworks install hundreds to thousands of dependencies before you even start writing code. In theory, you should carefully investigate the source code of each dependency, both when you add it and every time you update it.

It gets worse. Even if you read the source code for a dependency and decide it is ok, what if someone changes the source code and re-publishes it under the same version? This is the problem that lockfiles like yarn.lock or Pipfile.lock fix: they specify a list of accepted cryptographic hashes for every dependency. So if you depend on a package that can be installed on Windows and Linux, it will have two hashes: one for the Windows package, and one for the Linux package. When you install that dependency, the downloaded file is hashed, and it will only install if the result matches one of the specified hashes.
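The hash check described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular package manager or scarydeps itself is implemented; the function name `matches_accepted_hash` and the `(algorithm, hex_digest)` pair layout are assumptions made for the example.

```python
import hashlib


def matches_accepted_hash(data: bytes, accepted: list[tuple[str, str]]) -> bool:
    """Return True if the digest of `data` equals any accepted hash.

    `accepted` is a hypothetical list of (algorithm, hex_digest) pairs,
    standing in for the hashes a lockfile records for one dependency.
    """
    for algorithm, expected_hex in accepted:
        # hashlib.new accepts algorithm names such as "sha1", "sha256", "sha512".
        actual_hex = hashlib.new(algorithm, data).hexdigest()
        if actual_hex == expected_hex.lower():
            return True
    return False


# Usage: a file passes only if its bytes are exactly what was locked.
locked = [("sha256", hashlib.sha256(b"original package bytes").hexdigest())]
print(matches_accepted_hash(b"original package bytes", locked))
print(matches_accepted_hash(b"tampered package bytes", locked))
```

A package with separate Windows and Linux builds would simply carry two entries in `accepted`, and either one is enough for the install to proceed.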

Lockfiles only help you know something changed, not whether it matters. If you look at the diff for your lockfile update, you will see hundreds of hashes changing. What you should do is go through and read the source code for every updated dependency and make sure that there's no new vulnerability or malicious behavior. But who has time for that?

What Scarydeps Does

Who has the time to check the hashes and read the source code of every distributed file? Scarydeps fetches and checks every hash in your lockfile. Then it runs a static analysis tool (Semgrep) over each package to find "permissions". A permission answers the question "in the worst case, what could this package do?" (see below for the list). These permissions are designed to be easy to reason about: the is-odd package, which tells us if a number is odd (why is that a package?), shouldn't be talking to the network, shouldn't be executing arbitrary code, and shouldn't be reading from or writing to the file system.

Imagine your JavaScript project depends on the is-odd package. Like the paranoid developer you are, you dutifully download the package and read the source code:

Ok, seems fine. Months later, you are updating your dependencies... yarn upgrade ...

Are you going to go fetch the file and read it again? Ok, here's what you'd see when you update with scarydeps (a theoretical example: is-odd has not been hacked, it is just conveniently small for illustrative purposes):

So scarydeps comments on any PR that has lockfile (e.g., yarn.lock) changes. It uses Semgrep to summarize the “permissions” in the modules and whether they changed. Permissions are coarse but roughly fall into:

In the (fake) is-odd example, we detected the usage of child_process.exec creating a reverse shell at the behest of some unknown malicious entity.

Have There Actually Been Malicious Dependencies?


How Should I Use It?

By far the most interesting transitions are those from zero to one or more permissions. For instance:

In this case, you should probably take a look at the source package (or click those reference links) to make sure nothing weird is going on, like a reverse shell being created.
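To make the "zero to one or more" rule concrete, here is a small sketch of how such transitions could be picked out of two permission summaries. The dictionary layout, the function name `newly_gained_permissions`, and the permission labels are all hypothetical and chosen only for illustration; they are not scarydeps' actual data model.

```python
def newly_gained_permissions(before: dict[str, int], after: dict[str, int]) -> list[str]:
    """List permissions whose finding count went from zero to non-zero.

    `before` and `after` are hypothetical maps from a permission label to the
    number of static-analysis findings for the old and new package versions.
    A label missing from `before` is treated as a count of zero.
    """
    return sorted(
        name
        for name, count in after.items()
        if count > 0 and before.get(name, 0) == 0
    )


# Usage: only "exec" is new here; "network" was already in use before the update.
old_version = {"network": 2, "exec": 0}
new_version = {"network": 3, "exec": 1, "filesystem": 0}
print(newly_gained_permissions(old_version, new_version))  # ['exec']
```

A change from 2 to 3 findings is deliberately ignored here: the package could already do that kind of thing, so the update does not widen its worst case.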

In a sampled assessment of 938,381 npm JavaScript packages, looking at 0-to-nonzero permission changes for dynamic code execution, this tool had a 4.6% success rate at finding new or previously reported malicious or vulnerable packages. So in the best case, roughly 19 out of 20 times the comment will lead you to a “this is fine” conclusion, in exchange for a “hmm, malicious?” the other time.
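The arithmetic behind that trade-off is simple enough to check directly. The snippet below only restates the 4.6% figure quoted above; the variable names are illustrative.

```python
# Reported rate at which a flagged 0-to-nonzero change turned out to be a real problem.
success_rate = 0.046

# On average, how many flagged changes you review per real finding.
reviews_per_finding = 1 / success_rate

# Share of flagged changes that end in a "this is fine" conclusion.
benign_share = 1 - success_rate

print(f"about 1 real finding per {reviews_per_finding:.0f} flagged changes")
print(f"about {benign_share:.0%} of flagged changes are benign")
```

At 4.6%, a real finding turns up about once in every 22 flagged changes, which is in line with the rough 19-in-20 figure for benign results.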

What Package Managers Are Supported?

Only npm and PyPI lockfiles are supported. Packages are always looked up by hash value. Every hash value defined in the lockfile is checked. SHA-1, SHA-256, and SHA-512 are accepted. If any errors occur during the analysis, they are noted in the output.
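As an illustration of what "looked up by hash value" involves, the sketch below normalizes the two hash notations commonly found in these lockfiles: the `sha512-<base64>` integrity strings used in npm-style lockfiles and the `sha256:<hex>` entries used in Pipfile.lock. The function name `parse_lock_hash` and the normalized `(algorithm, hex_digest)` result are assumptions for this example, not scarydeps' real parser.

```python
import base64

# The algorithms the text above says are accepted.
SUPPORTED_ALGORITHMS = {"sha1", "sha256", "sha512"}


def parse_lock_hash(value: str) -> tuple[str, str]:
    """Normalize a lockfile hash string to (algorithm, lowercase hex digest).

    Handles "<algorithm>:<hex>" (Pipfile.lock style) and
    "<algorithm>-<base64>" (npm integrity style). Raises ValueError
    for unsupported algorithms or malformed digests.
    """
    if ":" in value:
        algorithm, _, digest = value.partition(":")
        hex_digest = digest.lower()
        bytes.fromhex(hex_digest)  # raises ValueError if not valid hex
    else:
        algorithm, _, digest = value.partition("-")
        # binascii.Error, raised on bad base64, is a subclass of ValueError.
        hex_digest = base64.b64decode(digest, validate=True).hex()
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported hash algorithm: {algorithm!r}")
    if not hex_digest:
        raise ValueError("empty digest")
    return algorithm, hex_digest
```

Normalizing to one representation means the same lookup and comparison code can serve both ecosystems, and anything that fails to parse can be reported as an analysis error rather than silently skipped.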

Does Anyone Besides You Think This Is a Good Idea?

I like this, a lot! I use Dependabot on some projects, but it doesn’t really do anything to check against supply-chain attacks, and I’m convinced it’s only a matter of time before we see a really high-profile supply chain attack against open source. - Jacob Kaplan-Moss, Django co-creator

Threats to Correctness