Should “node_modules” folder be included in the gi

Posted 2019-01-08 05:57

Question:

I'm wondering if we should be tracking node_modules in our repo or doing an npm install when checking out the code?

Answer 1:

The answer is not as easy as Alberto Zaccagni suggests. If you develop applications (especially enterprise applications), including node_modules in your git repo is a viable choice and which alternative you choose depends on your project.

Because he argued very well against node_modules, I will concentrate on the arguments for it.

Imagine that you have just finished an enterprise app and will have to support it for 3-5 years. You definitely don't want to depend on someone's npm module that could disappear tomorrow, leaving you unable to update your app.

Or you have private modules that are not accessible from the Internet, so your app can't be built from the Internet. Or maybe you don't want your final build to depend on the npm service for some reason.

You can find pros and cons in this Addy Osmani article (although it is about Bower, the situation is almost the same). And I will end with a quote from the Bower homepage and Addy's article:

“If you aren’t authoring a package that is intended to be consumed by others (e.g., you’re building a web app), you should always check installed packages into source control.”



Answer 2:

Module details are stored in package.json; that is enough. There's no need to check in node_modules.

People used to store node_modules in version control to lock the dependencies of modules, but with npm shrinkwrap that is no longer needed.

Another justification for this point, as @ChrisCM wrote in the comment:

Also worth noting, any modules that involve native extensions will not work architecture to architecture, and need to be rebuilt. Providing concrete justification for NOT including them in the repo.



Answer 3:

I would recommend against checking in node_modules because of packages like PhantomJS and node-sass, for example, which install the appropriate binary for the current system.

This means that if one developer runs npm install on Linux and checks in node_modules, it won't work for another developer who clones the repo on Windows.

It's better to check in the tarballs which npm install downloads and point npm-shrinkwrap.json at them. You can automate this process using shrinkpack.



Answer 4:

Not tracking node_modules with source control is the right choice because some Node.js modules, like the MongoDB Node.js driver, use Node.js C++ add-ons. These add-ons are compiled when running the npm install command. So when you track the node_modules directory, you may accidentally commit an OS-specific binary file.



Answer 5:

This topic is pretty old, I see. But I think the arguments given here need an update because the situation in npm's ecosystem has changed.

I'd always advise against putting node_modules under version control. Nearly all the benefits of doing so, as listed in the accepted answer, are pretty outdated by now.

  1. Published packages can't be revoked from the npm registry that easily anymore. So you don't have to fear losing dependencies your project relies on.

  2. Putting the package-lock.json file in VCS helps with frequently updated dependencies, which would otherwise probably result in different setups even though they rely on the same package.json file.
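
To illustrate point 2: a lockfile entry records, among other things, the exact version and an integrity string for each package, which lets a later install check that it received the same bytes. The following is a minimal Python sketch of how such a Subresource Integrity style string can be computed and checked. The function names and the sample bytes are made up for illustration, and this is not npm's own implementation.

```python
import base64
import hashlib


def sri_sha512(data: bytes) -> str:
    """Return an SRI-style string: 'sha512-' plus the base64 SHA-512 digest."""
    digest = hashlib.sha512(data).digest()
    return "sha512-" + base64.b64encode(digest).decode("ascii")


def matches_integrity(data: bytes, integrity: str) -> bool:
    """Check downloaded bytes against a previously recorded integrity string."""
    return sri_sha512(data) == integrity


# Hypothetical tarball contents standing in for a real package download.
tarball = b"pretend these are the bytes of some-package-1.0.1.tgz"
recorded = sri_sha512(tarball)  # the value a lockfile would keep

print(matches_integrity(tarball, recorded))         # same bytes as recorded
print(matches_integrity(tarball + b"!", recorded))  # different bytes
```

The design point is that the lockfile pins content, not just a version number, so two machines installing from the same lockfile can detect when they were handed different files.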

So, having offline build tools might be considered the only legitimate use case left for putting node_modules into VCS. However, node_modules usually grows pretty fast. Any update will change a lot of files, and this affects repositories in different ways. If you consider the long-term effects, that might be an impediment as well.

Centralized VCSs like svn require transferring committed and checked-out files over the network, which is going to be slow as hell when it comes to checking out or updating a node_modules folder.

When it comes to git, this high number of additional files will instantly pollute the repository. Keep in mind that git doesn't track differences between versions of a file; it stores a copy of each version as soon as a single character has changed. Every update to any dependency will result in another large changeset. Your git repository will quickly grow huge because of this, affecting backups and remote synchronization. If you decide to remove node_modules from the git repository later, it is still part of the history. If you have distributed your git repository to some remote server (e.g. for backup), cleaning it up is another painful and error-prone task you'd run into.
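
The point about git storing whole versions can be made concrete. Git names each file version (a blob) by hashing its full content, so a one-character change produces a completely new object. Below is a minimal Python sketch of that naming scheme, assuming the default SHA-1 object format; the sample file contents are invented. Note that git may later pack objects and delta-compress them, so the on-disk growth is usually smaller than the sum of all copies, but many changed files per dependency update still add up.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Compute the object id git assigns to a file's content (SHA-1 format)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two versions of a hypothetical dependency file differing by one character.
old_version = b"module.exports = 1;\n"
new_version = b"module.exports = 2;\n"

print(git_blob_id(old_version))
print(git_blob_id(new_version))
print(git_blob_id(old_version) != git_blob_id(new_version))  # True: two separate blobs
```

The ids printed here should match what `git hash-object` reports for files with the same contents.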

Thus, if you care about efficient processes and like to keep things "small", I'd rather use a separate artifact repository such as Nexus Repository (or just some HTTP server with ZIP archives) that provides a previously fetched set of dependencies for download.



Answer 6:

One more thing to consider: checking in node_modules makes it harder, if not impossible, to make use of the distinction between dependencies and devDependencies.

On the other hand, one could say it's reassuring to push to production the exact same code that went through tests, including devDependencies.
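
As a rough illustration of that distinction: package.json keeps runtime packages under dependencies and build or test tooling under devDependencies, and a production install (for example `npm install --production`) skips the latter. A committed node_modules folder contains both groups mixed together. The sketch below, in Python with an invented manifest, only separates the two groups; the package names and versions are examples, not recommendations.

```python
import json

# Invented package.json content; the package names are only examples.
PACKAGE_JSON = """
{
  "name": "example-app",
  "dependencies": {"express": "^4.17.1"},
  "devDependencies": {"mocha": "^8.0.0", "eslint": "^7.0.0"}
}
"""


def split_dependencies(manifest_text: str):
    """Return (runtime, dev_only) package name lists from a package.json string."""
    manifest = json.loads(manifest_text)
    runtime = sorted(manifest.get("dependencies", {}))
    dev_only = sorted(manifest.get("devDependencies", {}))
    return runtime, dev_only


runtime, dev_only = split_dependencies(PACKAGE_JSON)
print("needed in production:", runtime)
print("needed only for development:", dev_only)
```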



Answer 7:

node_modules does not need to be checked in if the dependencies are listed in package.json. Any other programmer can simply get them by running npm install, and npm is smart enough to create node_modules in your working directory for the project.



Answer 8:

I agree with ivoszz that it's sometimes useful to check in the node_modules folder, but...


Scenario 1:

One scenario: you use a package that gets removed from npm. If you have all the modules in the node_modules folder, it won't be a problem for you. If you only have the package name in package.json, you can't get it anymore. If a package is less than 24 hours old, its author can easily remove it from npm. If it's older than 24 hours, the author needs to contact npm support. But:

If you contact support, they will check to see if removing that version of your package would break any other installs. If so, we will not remove it.


So the chances for this are low, but there is scenario 2...


Scenario 2:

Another scenario where this is the case: you develop an enterprise version of your software, or some very important software, and write in your package.json:

"dependencies": {
    "studpid-package": "^1.0.1"
}

You use the method function1(x) of that package.

Now the developers of studpid-package rename the method function1(x) to function2(x), and they make a mistake: they change the version of their package from 1.0.1 to 1.1.0 instead of releasing a new major version. That's a problem because the next time you call npm install, you will accept version 1.1.0, since you used the caret ("studpid-package": "^1.0.1"). A tilde range ("~1.0.1") would only accept patch releases such as 1.0.2.

Calling function1(x) can cause errors and problems now.
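
How far a version range reaches depends on the operator. As documented for npm's semver ranges, a tilde with a full version allows only patch-level updates, while a caret allows minor and patch updates for versions 1.0.0 and above. The Python sketch below is a simplified model of those two rules (full X.Y.Z versions only, no pre-release tags). It is not the real semver package, and the function names are made up.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse(version: str) -> Version:
    """Parse a plain 'X.Y.Z' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def satisfies(version: str, spec: str) -> bool:
    """Check a version against a '~X.Y.Z' or '^X.Y.Z' range (simplified)."""
    operator, base = spec[0], parse(spec[1:])
    major, minor, patch = base
    if operator == "~":
        upper = (major, minor + 1, 0)      # patch-level changes only
    elif operator == "^":
        if major > 0:
            upper = (major + 1, 0, 0)      # minor and patch changes
        elif minor > 0:
            upper = (0, minor + 1, 0)      # 0.Y.Z: patch changes only
        else:
            upper = (0, 0, patch + 1)      # 0.0.Z: exact version only
    else:
        raise ValueError("only ~ and ^ ranges are modelled here")
    return base <= parse(version) < upper


for spec in ("~1.0.1", "^1.0.1"):
    for candidate in ("1.0.2", "1.1.0", "2.0.0"):
        print(spec, candidate, satisfies(candidate, spec))
```

In either case, a lockfile or a committed node_modules folder is what keeps an unexpected release inside the range from reaching your build.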


But:

Pushing the whole node_modules folder (often more than 100 MB) to your repository will cost you disk space. A few KB (package.json only) compared with hundreds of MB (package.json & node_modules)... Think about it.

You could do it / should think about it if:

  • the software is very important.

  • it costs you money when something fails.

  • you don't trust the npm registry. npm is centralized and could theoretically be shut down.

You don't need to publish the node_modules folder in 99.9% of the cases if:

  • you develop software just for yourself.

  • you've programmed something and just want to publish the result on GitHub because someone else could maybe be interested in it.


If you don't want the node_modules to be in your repository, just create a .gitignore file and add the line node_modules.