So I built an app using MEAN.js, and I made some updates to the Articles (blog) section for better SEO, readability, design, etc. One problem I can't seem to figure out, though, is how to share the Articles using Facebook, Google+, Twitter, etc. and have them populate the right data using og meta tags.
WHAT I WANT
All I want is to be able to share Articles (blog posts) from my MEAN.js application, and have the article content show up when I post the link in Social sites (e.g. Facebook).
WHAT I HAVE TRIED
I've tried creating a separate server layout specifically for blog posts, but this breaks so many other things that I realized the amount of work probably wasn't worth it - there has to be a smarter way.
I've also tried updating og meta tag data with Angular on the client side, but these values must not get updated before Social sites grab those tags...in other words, it didn't actually do what I wanted it to.
I've tried grabbing the Angular route URL when the index is rendering so I can update those og meta values before the index is rendered, but I can't find these values anywhere in the req data.
WHAT I THINK THE PROBLEM IS
Conceptually, this is what I believe is happening:
The request hits my server, but since it's a single page application using Angular's routing, the req.url value is simply the root page ('/').
The index file gets loaded, which uses the standard server template layout.
Angular gets loaded and makes an AJAX call to get the Article data, then binds that data to the variables on the page.
So basically the layout is getting rendered (with the og meta values) before Angular even figures out what article information to grab.
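The reason the server only ever sees '/' is that the fragment of a URL (everything from the # onward) stays in the browser and is not sent with the HTTP request. A minimal sketch using Node's built-in URL class illustrates the split; the article URL and ID below are made up for illustration:

```javascript
// Hypothetical hashbang URL for an article; the ID is made up for illustration.
const shared = new URL('http://example.com/#!/articles/123');

// The part that is sent to the server (this is what ends up in req.url).
const requestPath = shared.pathname + shared.search;

// The part that stays in the browser and is only visible to Angular's router.
const clientOnlyRoute = shared.hash;

console.log(requestPath);     // '/'
console.log(clientOnlyRoute); // '#!/articles/123'
```

So a scraper that fetches a hashbang link requests exactly the same path as one that fetches the home page.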
WHAT I'M GUESSING THE IDEAL SOLUTION IS
In my express.js file, the app's local variables are set as follows:
// Setting application local variables
app.locals.siteName = config.app.siteName;
app.locals.title = config.app.title;
app.locals.description = config.app.description;
app.locals.keywords = config.app.keywords;
app.locals.imageUrl = config.app.imageUrl;
app.locals.facebookAppId = config.facebook.clientID;
app.locals.jsFiles = config.getJavaScriptAssets();
app.locals.cssFiles = config.getCSSAssets();
These local variables are then rendered by Swig in the layout.server.view.html file as follows:
<!-- Note the {{keywords}}, {{description}}, etc. values. -->
<!-- Semantic META -->
<meta id="keywords" name="keywords" content="{{keywords}}">
<meta id="desc" name="description" content="{{description}}">
<!-- Facebook META -->
<meta id="fb-app-id" property="fb:app_id" content="{{facebookAppId}}">
<meta id="fb-site-name" property="og:site_name" content="{{siteName}}">
<meta id="fb-title" property="og:title" content="{{title}}">
<meta id="fb-description" property="og:description" content="{{description}}">
<meta id="fb-url" property="og:url" content="{{url}}">
<meta id="fb-image" property="og:image" content="{{imageUrl}}">
<meta id="fb-type" property="og:type" content="website">
<!-- Twitter META -->
<meta id="twitter-title" name="twitter:title" content="{{title}}">
<meta id="twitter-description" name="twitter:description" content="{{description}}">
<meta id="twitter-url" name="twitter:url" content="{{url}}">
<meta id="twitter-image" name="twitter:image" content="{{imageUrl}}">
So ideally I think we want to update these values with article-specific information before rendering the page. The problem is, if the layout gets rendered before Angular even figures out which article data to populate, how can I do this? Again, the Angular route doesn't appear to be available anywhere in the req object, so I'm completely stumped on how to do this.
So I go back to my original desire - how can I share my articles on social media in a "pretty" way using MEAN.js? Am I on the right track? Is it possible with the current Article setup? Do I need to build a complete blogging module that doesn't use Angular at all?
I finally got this working for my application without Nginx or anything else outside of the MEANJS framework. Your mileage may vary, but I thought I'd share the results anyway. It works for me, but may not for you.
Basically what I already had set up was a way to grab non-hashed URLs and redirect to the hashed URLs. So a user could share their profile, e.g. example.com/myprofile, and it would redirect to example.com/#!/profile/myprofile.
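The non-hashed to hashed mapping can be sketched as a small helper. This is only an illustration of the idea: the function name is hypothetical, and the '/profile/' prefix simply mirrors the example above.

```javascript
// Hypothetical helper: maps a non-hashed path to its hashbang equivalent.
// The '/profile/' prefix mirrors the example above; adjust it to your own routes.
function toHashbangUrl(path) {
  const slug = path.replace(/^\/+/, ''); // drop the leading slash(es)
  return '/#!/profile/' + encodeURIComponent(slug);
}

console.log(toHashbangUrl('/myprofile')); // '/#!/profile/myprofile'
```

In an Express handler this would typically be used as res.redirect(toHashbangUrl(req.path)), once the path has been confirmed to match a known profile.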
I then created a separate layout strictly for social bots (though in retrospect I'm not sure this was entirely necessary) and served the separate layout when the site is scraped. This I do thusly:
social-layout.server.view.html
...some stuff here...
<!-- Note the variable names, e.g. {{siteName}} -->
<meta id="fb-app-id" property="fb:app_id" content="{{facebookAppId}}">
<meta id="fb-site-name" property="og:site_name" content="{{siteName}}">
<meta id="fb-title" property="og:title" content="{{socialTitle}}">
<meta id="fb-description" property="og:description" content="{{socialDescription}}">
<meta id="fb-url" property="og:url" content="{{socialUrl}}">
<meta id="fb-image" property="og:image" content="{{socialImageUrl}}">
<meta id="fb-type" property="og:type" content="website">
...other stuff here...
Then in my Express file, I explicitly check user-agents to determine if a new layout is necessary. If I find a bot, I fetch some key data related to the URL from my DB, then populate the variables, like so:
express.js
// This code happens just after app.locals variables are set.
// Passing the request url to environment locals
app.use(function(req, res, next) {
    // Exact user-agent strings of the social bots that should get the social layout,
    // so the og data is populated and the link looks pretty when shared.
    var socialBotAgents = [
        'facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)',
        'facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)',
        'facebookexternalhit/1.1 (+https://www.facebook.com/externalhit_uatext.php)',
        'facebookexternalhit/1.0 (+https://www.facebook.com/externalhit_uatext.php)',
        'visionutils/0.2',
        'Twitterbot/1.0',
        'LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/3.1 +http://www.linkedin.com)',
        'Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0 Google (+https://developers.google.com/+/web/snippet/)',
        'Mozilla/5.0 (Windows NT 5.1; rv:11.0) Gecko Firefox/11.0 (via ggpht.com GoogleImageProxy)'
    ];
    var baseUrl = req.protocol + '://' + req.headers.host;

    // Regular visitors get the standard layout.
    if (socialBotAgents.indexOf(req.headers['user-agent']) === -1) {
        res.locals.url = baseUrl;
        return next();
    }

    // Strip the leading slash to get the profile link, e.g. '/myprofile' -> 'myprofile'.
    var urlAttempt = req.url.slice(1);
    Users.findOne({ link: urlAttempt }, function(err, results) {
        if (err || results === null) {
            // Lookup failed or nothing matched: fall back to the standard layout.
            res.locals.url = baseUrl;
            return next();
        }
        // Found link. Now we update layout variables with DB info.
        res.status(200).render('social-index', {
            socialUrl: baseUrl + req.url,
            socialTitle: results.orgName,
            socialDescription: results.shortDesc,
            socialImageUrl: baseUrl + '/profile/img/' + results.imgName
        });
    });
});
Again, your mileage may vary, but this worked for me (partially). I'm still working on social sharing the whole URL (including the hash). Hope it helps in some way.
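One fragile spot in the check above is that it compares full user-agent strings, so a bot that bumps its version number would slip through. A possible alternative, sketched here under the assumption that the bots keep their identifying tokens, is to match on substrings instead. The tokens are taken from the user-agent strings listed above; the function name is hypothetical.

```javascript
// Substrings taken from the user-agent strings listed above. Matching on these
// instead of the full string is an assumption about how future bot versions
// will identify themselves.
const SOCIAL_BOT_PATTERN = /facebookexternalhit|visionutils|twitterbot|linkedinbot|developers\.google\.com\/\+\/web\/snippet|GoogleImageProxy/i;

// Hypothetical helper: true when the user-agent looks like a social scraper.
function isSocialBot(userAgent) {
  return typeof userAgent === 'string' && SOCIAL_BOT_PATTERN.test(userAgent);
}

console.log(isSocialBot('Twitterbot/1.0'));
console.log(isSocialBot('Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0'));
```

The trade-off is a slightly higher chance of false positives in exchange for not having to update the list on every bot release.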
I've faced the same problem. First I installed the Mean-Seo module, which you can find on the official MEAN.js GitHub repo. The module essentially does this: crawlers (Google etc.) add an _escaped_fragment_ part to their requests when they encounter a SPA URL. Mean-Seo intercepts requests that include _escaped_fragment_. Then, using PhantomJS, it renders a static HTML page out of the dynamic one, saves it, and serves this static version to crawlers.
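To make the _escaped_fragment_ convention concrete: a crawler that sees example.com/#!/articles/123 requests example.com/?_escaped_fragment_=/articles/123 instead, and the server has to map that back to the client-side route. The sketch below shows only that URL mapping; it is not Mean-Seo's actual code, and the function name and article path are hypothetical.

```javascript
// Hypothetical helper: turns a crawler request URL carrying _escaped_fragment_
// back into the hashbang URL a browser would have used. Returns null when the
// parameter is absent, i.e. for ordinary requests.
function escapedFragmentToHashbang(requestUrl) {
  // The base is only there so relative request URLs can be parsed.
  const url = new URL(requestUrl, 'http://example.com');
  if (!url.searchParams.has('_escaped_fragment_')) {
    return null;
  }
  const fragment = url.searchParams.get('_escaped_fragment_'); // already percent-decoded
  url.searchParams.delete('_escaped_fragment_');
  return url.pathname + url.search + '#!' + fragment;
}

console.log(escapedFragmentToHashbang('/?_escaped_fragment_=/articles/123')); // '/#!/articles/123'
```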
For Facebook & Twitter, I changed the mean-seo.js file of Mean-Seo so that, before it caches and saves the static file, it replaces the meta tags accordingly. Since PhantomJS has already rendered the whole article page, you do not need to make another API call. Just parse the HTML. I also used cheerio.js to parse the HTML conveniently.
This kind of solved my problem, but not perfectly. I'm still struggling with the hashbang & HTML5 mode. When my URL is like https://example.com/post/1, Twitter and Facebook do not request _escaped_fragment_.
Update: After a while, I abandoned this approach. PhantomJS seemed unreliable, and I didn't like wasting system resources (CPU time, RAM, disk space) on such a job. Unnecessary file creation is also silly. My current approach is like this:
I've added a new Express route for Twitter and Facebook. In the server controller, I implemented a new function for crawlers. I created a simple server template without Angular, Bootstrap and all those shiny things: only meta tags and simple text. Using Swig (already included in MEAN.js), I render this view template accordingly. I'm also using Nginx as a proxy, so I defined a new Nginx rewrite based on user agent. Something like this:
if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator") {
    rewrite ^/posts/(.*)$ /api/posts/crawl/$1 last;
}
When a crawler requests a post's URL, Nginx triggers my new simple crawler route and the crawler gets the generated tiny page.
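As a rough idea of what such a tiny page might contain, here is a sketch that builds it with plain string concatenation. The answer above uses a Swig template for this; string building is swapped in here only so the example has no dependencies. The post shape ({ title, summary, url, imageUrl }) and both function names are assumptions.

```javascript
// Escape text so it is safe inside HTML attribute values and element content.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical post shape: { title, summary, url, imageUrl }.
// Returns a page with only meta tags and simple text: no Angular, no Bootstrap.
function renderCrawlerPage(post) {
  const title = escapeHtml(post.title);
  const summary = escapeHtml(post.summary);
  const url = escapeHtml(post.url);
  const imageUrl = escapeHtml(post.imageUrl);
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head>',
    '<title>' + title + '</title>',
    '<meta property="og:type" content="article">',
    '<meta property="og:title" content="' + title + '">',
    '<meta property="og:description" content="' + summary + '">',
    '<meta property="og:url" content="' + url + '">',
    '<meta property="og:image" content="' + imageUrl + '">',
    '<meta name="twitter:title" content="' + title + '">',
    '<meta name="twitter:description" content="' + summary + '">',
    '<meta name="twitter:image" content="' + imageUrl + '">',
    '</head>',
    '<body><h1>' + title + '</h1><p>' + summary + '</p></body>',
    '</html>'
  ].join('\n');
}

const page = renderCrawlerPage({
  title: 'Hello & "welcome"',
  summary: 'A made-up post used only for this example.',
  url: 'https://example.com/posts/1',
  imageUrl: 'https://example.com/img/1.png'
});
console.log(page);
```

In the route that Nginx rewrites to, the handler would look the post up by the ID in the URL and send this markup (or the equivalent rendered Swig view) as the response.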