I've been searching for npm packages, but they all seem unmaintained and rely on outdated user-agent databases. Is there a reliable, up-to-date package that helps me detect crawlers (mostly from Google, Facebook, etc., for SEO)? Or, if there is no such package, can I write one myself (probably based on an up-to-date user-agent database)?
To be clearer: I'm building an isomorphic/universal React website, and I want it to be indexed by search engines and have its title/metadata fetched by Facebook. However, I don't want to pre-render on all normal requests, so as not to overload the server. The solution I'm thinking of is to pre-render only for requests that come from crawlers, roughly as sketched below.
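For example, assuming an Express server, this is roughly the flow I have in mind; the regex here is only a stand-in for whatever detection a proper package would provide:

var express = require('express');
var app = express();

// Stand-in check; this is what I hope a maintained package can replace.
var CRAWLER_RE = /googlebot|bingbot|facebookexternalhit/i;

app.get('*', function (req, res) {
  if (CRAWLER_RE.test(req.headers['user-agent'] || '')) {
    // crawlers get fully pre-rendered HTML
    res.send('<!-- server-rendered markup for bots -->');
  } else {
    // normal visitors get the client-side shell
    res.send('<div id="root"></div><script src="/bundle.js"></script>');
  }
});

app.listen(3000);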
I have nothing to add to your search for npm packages. But as for an up-to-date user-agent database on which to build your own package, I would recommend ua.theafh.net.
At the moment it has data up to Nov 2014, and with more than 5.4 million agents it is, as far as I know, also the largest search engine for user agents.
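If you do roll your own package on top of such a database, the core could be a simple pattern match. A minimal sketch, where the list is a tiny illustrative subset rather than the database itself:

// Tiny illustrative subset; a real package would generate this list
// from an up-to-date user-agent database such as the one above.
var crawlerPatterns = [
  /googlebot/i,           // Google search and news
  /bingbot/i,             // Bing
  /facebookexternalhit/i, // Facebook link previews
  /twitterbot/i           // Twitter cards
];

function isCrawler(userAgent) {
  if (!userAgent) return false;
  return crawlerPatterns.some(function (re) {
    return re.test(userAgent);
  });
}

// will log true, then false
console.log(isCrawler('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'));
console.log(isCrawler('Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36'));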
The best solution I've found is the useragent library, which allows you to do this:
var useragent = require('useragent');
// for an actual request use: useragent.parse(req.headers['user-agent']);
var agent = useragent.parse('Googlebot-News');
// will log true
console.log(agent.device.toJSON().family === 'Spider');
It is fast and kept up to date fairly well, so it seems like the best approach. You can run the script above in your browser on RunKit.
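To tie this back to your pre-rendering question, here is a minimal sketch of how the check could gate server-side rendering in Express; renderForCrawler and the client shell are illustrative placeholders, not part of the library:

var express = require('express');
var useragent = require('useragent');

var app = express();

// Illustrative placeholder: a real app would call something like
// ReactDOMServer.renderToString(...) here.
function renderForCrawler(url) {
  return '<html><head><title>Pre-rendered ' + url + '</title></head><body></body></html>';
}

// Plain shell that lets normal visitors render client-side.
var clientShell = '<html><body><div id="root"></div>' +
  '<script src="/bundle.js"></script></body></html>';

app.get('*', function (req, res) {
  var agent = useragent.parse(req.headers['user-agent']);
  var isCrawler = agent.device.toJSON().family === 'Spider';
  res.send(isCrawler ? renderForCrawler(req.url) : clientShell);
});

app.listen(3000);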