Posted May 31, 2017 by Sebastian McKenzie
One of the claims that Yarn makes is that it makes your package management “deterministic”. But what exactly does this mean? This blog post highlights how both Yarn and npm 5 are deterministic, but differ in the exact guarantees they provide and the tradeoffs they have chosen.
What is determinism?
Determinism in the context of JavaScript package management is defined as always getting the exact same node_modules folder given a package.json and companion lock file.
What factors does Yarn’s determinism guarantee?
Lockfile
Yarn is fully deterministic as long as all your teammates are using the same Yarn version. In both Yarn and npm 5, determinism is ensured by lockfiles that contain information about the whole tree. However, the lockfile formats differ between the two projects. We haven’t publicly talked about why we chose this format, so we want to walk you through it:
If you’ve ever seen a yarn.lock then you should be pretty familiar with the following structure:
has-flag@^1.0.0:
  version "1.0.0"
  resolved "https://registry.yarnpkg.com/has-flag/-/has-flag-1.0.0.tgz#9d9e793165ce017a00f00418c43f942a7b1d11fa"

supports-color@^3.2.3:
  version "3.2.3"
  resolved "https://registry.yarnpkg.com/supports-color/-/supports-color-3.2.3.tgz#65ac0504b3954171d8a64946b2ae3cbb8a5f54f6"
  dependencies:
    has-flag "^1.0.0"
This is the yarn.lock file generated by running the command yarn add supports-color. This file contains the version of supports-color that our project is using as well as the exact version of all its sub-dependencies.
With this lock file we can ensure that the version of has-flag that supports-color relies on is always the same version.
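To make the lookup concrete, here is a minimal sketch of how the excerpt above pins has-flag for supports-color. It is not Yarn’s real parser: parseLockfile, splitPair and unquote are hypothetical helpers that only understand the simple two-space-indented layout shown in this post.

```typescript
// Minimal sketch, not Yarn's real parser: it only understands the simple
// two-space-indented layout of the excerpt above.
interface LockEntry {
  version: string;
  resolved: string;
  dependencies: Record<string, string>;
}

const excerpt = `
has-flag@^1.0.0:
  version "1.0.0"
  resolved "https://registry.yarnpkg.com/has-flag/-/has-flag-1.0.0.tgz#9d9e793165ce017a00f00418c43f942a7b1d11fa"

supports-color@^3.2.3:
  version "3.2.3"
  resolved "https://registry.yarnpkg.com/supports-color/-/supports-color-3.2.3.tgz#65ac0504b3954171d8a64946b2ae3cbb8a5f54f6"
  dependencies:
    has-flag "^1.0.0"
`;

const unquote = (s: string): string => s.replace(/^"|"$/g, "");

// Split a `key "value"` line into its two halves.
function splitPair(content: string): [string, string] {
  const space = content.indexOf(" ");
  return [content.slice(0, space), unquote(content.slice(space + 1))];
}

function parseLockfile(text: string): Record<string, LockEntry> {
  const entries: Record<string, LockEntry> = {};
  let current: LockEntry | undefined;
  let inDependencies = false;

  for (const line of text.split("\n")) {
    const content = line.trim();
    if (content === "" || content.charAt(0) === "#") continue;
    const indent = line.search(/\S/);

    if (indent === 0) {
      // A line such as `has-flag@^1.0.0:` opens an entry keyed by name@range.
      current = { version: "", resolved: "", dependencies: {} };
      inDependencies = false;
      entries[unquote(content.replace(/:$/, ""))] = current;
    } else if (current && indent === 2) {
      inDependencies = content === "dependencies:";
      if (inDependencies) continue;
      const [field, value] = splitPair(content);
      if (field === "version") current.version = value;
      if (field === "resolved") current.resolved = value;
    } else if (current && inDependencies && indent === 4) {
      const [name, range] = splitPair(content);
      current.dependencies[name] = range;
    }
  }
  return entries;
}

const lock = parseLockfile(excerpt);
// supports-color asks for has-flag "^1.0.0"; the lockfile pins what that means.
const hasFlagRange = lock["supports-color@^3.2.3"].dependencies["has-flag"];
const hasFlagVersion = lock[`has-flag@${hasFlagRange}`].version; // "1.0.0"
```

The range requested by supports-color is used as a key back into the same flat file, which is why everyone installing from this lockfile gets the same has-flag.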
But there’s one key piece of information that the Yarn lockfile doesn’t contain, and that’s the hoisting and location of each dependency in the tree. For example, given a yarn.lock it’s impossible to determine what the top-level dependencies are unless you have its accompanying package.json. Even knowing the top-level packages, we still cannot infer what hoisting position packages should be in.
In practice this means that the position of packages in node_modules is computed internally in Yarn, which causes Yarn to be non-deterministic between people using different versions.
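As a rough illustration of why package.json is needed, the sketch below seeds a walk over the flat lockfile entries with the project’s direct dependencies. The lockEntries and manifestDependencies objects are hand-written stand-ins for the parsed files, and resolve is a hypothetical helper, not a Yarn API.

```typescript
// Sketch: the flat lockfile only becomes a tree once package.json seeds it.
interface LockEntry {
  version: string;
  dependencies?: Record<string, string>;
}

// Hand-written stand-in for the parsed yarn.lock excerpt (assumed shape).
const lockEntries: Record<string, LockEntry> = {
  "has-flag@^1.0.0": { version: "1.0.0" },
  "supports-color@^3.2.3": {
    version: "3.2.3",
    dependencies: { "has-flag": "^1.0.0" },
  },
};

// The part of package.json that the lockfile does not repeat.
const manifestDependencies: Record<string, string> = {
  "supports-color": "^3.2.3",
};

interface ResolvedNode {
  name: string;
  version: string;
  children: ResolvedNode[];
}

function resolve(name: string, range: string, seen: string[] = []): ResolvedNode {
  const key = `${name}@${range}`;
  const entry = lockEntries[key];
  if (!entry) throw new Error(`No lockfile entry for ${key}`);
  // Stop at a key we are already inside of, so cyclic dependencies terminate.
  if (seen.indexOf(key) !== -1) {
    return { name, version: entry.version, children: [] };
  }
  const deps = entry.dependencies ?? {};
  const children = Object.keys(deps).map((dep) =>
    resolve(dep, deps[dep], seen.concat(key))
  );
  return { name, version: entry.version, children };
}

// Only package.json tells us that supports-color is the root here.
const tree = Object.keys(manifestDependencies).map((name) =>
  resolve(name, manifestDependencies[name])
);
```

The result is only the logical dependency tree. Where each package ends up inside node_modules is a separate hoisting step that the lockfile does not record.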
We do this because this lockfile format is great for diffing: changes to the lockfile can easily be reviewed by a human. We use a custom format instead of JSON, with everything at the top level, so that it’s easy to read and review. Merge conflicts are usually handled automatically by version control, which reduces thrash.
Hoisting guarantees
Even though Yarn hoisting may differ between versions, we still make very strong guarantees around hoisting when the same version of Yarn is used. The most significant of these guarantees is that omitted environmental dependencies like optionalDependencies and devDependencies still influence the position of normal dependencies.
Woah there, that’s a lot of dependency mumbo jumbo. What does that actually mean?
There are several types of dependencies that you can declare in your package.json. Two of these are plain dependencies, which are installed all the time, and devDependencies, which are only installed when you run npm install or yarn within the directory where the package.json file is present.
Due to these features it’s possible to have different layouts of node_modules with omitted dependencies. But Yarn still takes all dependencies into account when determining the position that they should be at in node_modules, so even if they aren’t installed, they still influence the hoisting position of others. This is important, as variance in the hoisting position of packages between production and development can cause really weird, obscure bugs.
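The sketch below illustrates the guarantee with a deliberately tiny hoisting rule. It is a toy, not Yarn’s actual algorithm: the first version seen for a name takes the top-level slot, and a conflicting version nests under its dependent. The packages a-dev, b and c are made up, and the dev flag is set by hand rather than computed.

```typescript
// Toy hoisting sketch (NOT Yarn's real algorithm).
interface Pkg {
  name: string;
  version: string;
  dev: boolean; // true when only reachable through devDependencies
  deps: Pkg[];
}

interface Placement {
  path: string;
  version: string;
  dev: boolean;
}

function place(roots: Pkg[]): Placement[] {
  const topLevel: Record<string, string | undefined> = {};
  const placements: Placement[] = [];

  const visitDeps = (parent: Pkg, parentPath: string): void => {
    for (const dep of parent.deps) {
      const hoisted = topLevel[dep.name];
      if (hoisted === dep.version) continue; // satisfied by the hoisted copy
      // Free slot: hoist to the top. Taken by another version: nest.
      const path =
        hoisted === undefined
          ? `node_modules/${dep.name}`
          : `${parentPath}/node_modules/${dep.name}`;
      if (hoisted === undefined) topLevel[dep.name] = dep.version;
      placements.push({ path, version: dep.version, dev: dep.dev });
      visitDeps(dep, path);
    }
  };

  // Direct dependencies always claim their top-level slot first.
  for (const root of roots) {
    topLevel[root.name] = root.version;
    placements.push({ path: `node_modules/${root.name}`, version: root.version, dev: root.dev });
  }
  for (const root of roots) visitDeps(root, `node_modules/${root.name}`);
  return placements;
}

// Hypothetical packages: a devDependency and a dependency want different c versions.
const cForDev: Pkg = { name: "c", version: "2.0.0", dev: true, deps: [] };
const cForProd: Pkg = { name: "c", version: "1.0.0", dev: false, deps: [] };
const aDev: Pkg = { name: "a-dev", version: "1.0.0", dev: true, deps: [cForDev] };
const b: Pkg = { name: "b", version: "1.0.0", dev: false, deps: [cForProd] };

// Positions are computed from ALL dependencies, then dev-only ones are omitted.
const productionPaths = place([aDev, b])
  .filter((p) => !p.dev)
  .map((p) => p.path);

// Computing positions from production dependencies alone gives another layout.
const naivePaths = place([b]).map((p) => p.path);
```

With positions computed from the full tree, c@1.0.0 stays at node_modules/b/node_modules/c in production, exactly where it sits in development. The naive layout would move it to node_modules/c, which is the kind of production-only difference the guarantee rules out.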
NOTE: This guarantee isn’t unique to Yarn; npm 5 also does this.
How does this compare to npm 5?
npm 5 introduces a rework of the shrinkwrap feature called package-lock. This file includes all the information required to create node_modules as well as all hoisting information. The npm version of the previous yarn.lock would be:
{
  "name": "react-example",
  "version": "1.0.0",
  "lockfileVersion": 1,
  "dependencies": {
    "has-flag": {
      "version": "1.0.0",
      "resolved": "https://registry.npmjs.org/has-flag/-/has-flag-1.0.0.tgz",
      "integrity": "sha1-nZ55MWXOAXoA8AQYxD+UKnsdEfo="
    },
    "supports-color": {
      "version": "3.2.3",
      "resolved": "https://registry.npmjs.org/supports-color/-/supports-color-3.2.3.tgz",
      "integrity": "sha1-ZawFBLOVQXHYpklGsq48u4pfVPY="
    }
  }
}
Note that here all the packages listed in the first dependencies object are hoisted. This means that npm can use only the package-lock as the source of truth in order to build the final dependency graph, whereas Yarn needs the accompanying package.json to seed it.
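As a sketch of what “source of truth” means here, the function below reads install positions straight out of a lockfileVersion 1 structure. It assumes that a nested dependencies object inside an entry describes that package’s own node_modules folder; listInstallPaths is a hypothetical helper, not part of npm, and the lock object is trimmed to the fields the walk needs.

```typescript
// Sketch: derive node_modules positions from a lockfileVersion 1 package-lock.
interface LockDependency {
  version: string;
  // Assumed to mirror the top level: packages nested in this package's node_modules.
  dependencies?: Record<string, LockDependency>;
}

interface PackageLock {
  lockfileVersion: number;
  dependencies?: Record<string, LockDependency>;
}

// The package-lock from above, without the resolved and integrity fields.
const packageLock: PackageLock = {
  lockfileVersion: 1,
  dependencies: {
    "has-flag": { version: "1.0.0" },
    "supports-color": { version: "3.2.3" },
  },
};

function listInstallPaths(
  deps: Record<string, LockDependency> = {},
  prefix = ""
): string[] {
  const paths: string[] = [];
  for (const name of Object.keys(deps)) {
    const path = `${prefix}node_modules/${name}`;
    paths.push(`${path}@${deps[name].version}`);
    // Recurse into packages that could not be hoisted out of this one.
    paths.push(...listInstallPaths(deps[name].dependencies, `${path}/`));
  }
  return paths;
}

const installPaths = listInstallPaths(packageLock.dependencies);
// ["node_modules/has-flag@1.0.0", "node_modules/supports-color@3.2.3"]
```

No package.json and no hoisting computation are involved: the nesting of the JSON is the layout.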
This means that npm has better assurances around hoisting position across npm versions, at the cost of a denser lockfile. There’s currently no plan for supporting package-lock.json in Yarn, as the story around lockfile interoperability is unclear. You could, however, imagine a future where Yarn supports both and updates them in tandem. We’re very interested in community feedback and encourage proposals for how this would work to be submitted as an RFC.
Closing remarks
Each lock file format has different tradeoffs and there doesn’t appear to be a perfect format without disadvantages. It’s important to evaluate what sort of guarantees you’re looking for when deciding what package manager to use.
npm 5 has stronger guarantees across versions and a more strongly deterministic lockfile, whereas Yarn only has those guarantees when you’re on the same version, in exchange for a lighter lockfile that is better for review. It’s possible that there’s a lockfile solution that has the best of both worlds, but for now this is the current state of the ecosystem, and convergence could happen in the future.
Hopefully this post has highlighted the determinism guarantees that differ between Yarn and npm and helped you decide what works better for your company or project.