Site not valid - but it is

Posted 2020-03-24 04:25

So, I'm building a website called "dagbok.nu", which is Swedish for "diary now" :)

Anyway, when creating the Facebook application, it claims that both the site URL and the app domain are invalid. For the site URL I used "http://dagbok.nu", and for the app domain I used "dagbok.nu". Please don't reply (as I've seen others do on similar issues) that I should type the site URL with the scheme and the domain without it; that's exactly what I'm doing.

Right, so according to another question here, one can troubleshoot this using FB's own URL scraper, so I did just that:

http://developers.facebook.com/tools/debug/og/object?q=http%3A%2F%2Fdagbok.nu

And the reply:

Error Parsing URL: Error parsing input URL, no data was scraped

Right, so now I can assume that the URL is considered invalid because FB is unable to scrape it. But why?

According to this question, one of the reasons seems to be that FB has deemed the URL insecure or "spammy". I acquired this domain from a previous owner, so that isn't impossible. But when doing the same thing as Matthew in that post, i.e. trying to post on my timeline using the domain "http://dagbok.nu", I didn't get any information. The status box expanded as if to include a thumbnail and information about the link, but it only contained the text "(No title)" and nothing more.

So now I don't know what to do. I've checked the DNS and NS records with dig from multiple servers around the web, and they all seem to resolve correctly, and I've had friends double-check the URL from the States as well. I can't understand what's wrong, and I have no idea how to ask someone at FB to resolve this. Does anyone here have any good advice? Thanks in advance! :)

EDIT When changing the domain to another domain that points to the exact same web server and document_root, it works! So this is definitely a problem with the domain "dagbok.nu" and not with the code on that page.

EDIT When using the debug function above, I see no activity in the server log whatsoever. Facebook doesn't even contact the server. When using the alternate URL (the one from the last edit), the request shows up in the logs as it should.

EDIT I filed a bug report with Facebook, and their first response was that they were going to follow up. Now, a month later, I got an email that said "We are prioritizing bugs based on impact to the developer community. As this bug report has not received much attention from other developers, we are closing it so as to better focus on the top issues", and then they told me to come here to Stack Overflow to try to solve my issue. But the issue is WITH THEM, and of course no one else has reported that my site doesn't work: it affects only me, and I haven't launched the site yet because of this bug!

EDIT I wanted to file a new bug report, but I can't even do that now, since they are blocking bug reports containing this URL as well!

I had to edit the URL - here is the new bug report

7 answers
ゆ 、 Hurt°
#2 · 2020-03-24 04:27

This issue may also happen when Cloudflare is used. This is because Cloudflare protects the page from Facebook, which is then unable to collect the data, which in turn makes Facebook think the page is invalid.

My fix was:

  1. Turn off Cloudflare for the page.
  2. Scrape the page using Facebook's Dev Tools: https://developers.facebook.com/tools/debug/og/object
  3. Click the "Fetch new scrape information" button and let it run.
  4. Re-enable Cloudflare protection for the page.

You should then be able to go on and add the page wherever you need it.

我想做一个坏孩纸
#3 · 2020-03-24 04:30

I had the same problem and discovered it was caused by an incorrect IPv6 address in the AAAA records for my domain. The IPv4 record was correct, so the site worked in a browser, but FB obviously checks the IPv6 records!
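
A quick way to see both record types side by side is to ask the local resolver for every address it knows for the host. The following is only a sketch using Python's standard `socket` module; the function names `split_by_family` and `resolve` are made up for illustration, and the result depends on the resolver of the machine you run it on, which may differ from what Facebook's servers see.

```python
import socket

def split_by_family(infos):
    """Group getaddrinfo()-style tuples into IPv4 and IPv6 address lists."""
    v4, v6 = [], []
    for family, _type, _proto, _canon, sockaddr in infos:
        addr = sockaddr[0]  # first element is the address for both families
        if family == socket.AF_INET and addr not in v4:
            v4.append(addr)
        elif family == socket.AF_INET6 and addr not in v6:
            v6.append(addr)
    return v4, v6

def resolve(host, port=80):
    """Look up A and AAAA addresses for host (needs network access)."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return [], []  # name did not resolve at all
    return split_by_family(infos)
```

Calling `resolve("dagbok.nu")` returns a pair of lists (IPv4, IPv6). If the IPv6 list contains an address that does not belong to your web server, the AAAA record is the likely culprit.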

劫难
#4 · 2020-03-24 04:33

When Facebook tries to scrape your site for information, it sends a request to your server with a specific user agent called "facebookexternalhit"...

Facebook needs to scrape your page to know how to display it around the site.

Facebook scrapes your page every 24 hours to ensure the properties are up to date. The page is also scraped when an admin for the Open Graph page clicks the Like button and when the URL is entered into the Facebook URL Linter. Facebook observes cache headers on your URLs - it will look at "Expires" and "Cache-Control" in order of preference. However, even if you specify a longer time, Facebook will scrape your page every 24 hours.

The user agent of the scraper is: "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"

  1. Make sure it is not blocked by your server firewall
  2. Look in your server log if it even tried to access your site
  3. If you think this is a firewall issue look at this link
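
Step 2 can be sketched in a few lines of Python. This assumes a plain-text access log in which the user agent appears somewhere on each request line (as in Apache's "combined" format); the function name `scraper_hits` and the optional `host` filter are illustrative, not part of any Facebook or Apache tooling.

```python
SCRAPER_UA = "facebookexternalhit"  # substring of the scraper's user agent

def scraper_hits(log_lines, host=None):
    """Return the log lines that mention Facebook's scraper user agent.

    If host is given, keep only lines that also mention that host, which
    helps when the log format records the virtual host name.
    """
    hits = []
    for line in log_lines:
        lowered = line.lower()
        if SCRAPER_UA not in lowered:
            continue
        if host is not None and host.lower() not in lowered:
            continue
        hits.append(line.rstrip("\n"))
    return hits
```

For example, `scraper_hits(open(path), host="dagbok.nu")`, where `path` is wherever your server writes its access log. An empty result after running the debugger means the scraper never reached the server, which points at DNS, a firewall, or a block on Facebook's side rather than at the page itself.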
我命由我不由天
#5 · 2020-03-24 04:33

This doesn't seem to be a Facebook problem if you take a look at what I've discovered.

Testing it with the W3C online validation tool gives one of two results.

Tested using dagbok.nu; note that http://dagbok.nu gives the same test results. Remove the trailing forward slash between tests.


Test: 1
Results: 72 Errors 0 Warning
Note: Shown here is a fragment of the source Frameset DOCTYPE webpage.

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">
<NOSCRIPT><IMG SRC="http://svs.bystorm.se/rv?java=off"></NOSCRIPT><SCRIPT SRC="http://svs.bystorm.se/rvj"></SCRIPT>
<HTML STYLE="height:100%;">
<HEAD>
<META HTTP-EQUIV="content-type" CONTENT="text/html;charset=iso-8859-1">



Test: 2
Results: 4 Errors 1 Warning
Note: Shown here is a fragment of the source Transitional DOCTYPE webpage.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html >
<head>
<title>Dagbok: Framsida</title>
<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
<meta name="author" content="Jonas Eklundh Communication (http://jonas.eklundh.com)">
<meta name="author-email" content="jonas@eklundh.com">
<meta name="copyright" content="Jonas Eklundh Communication @2012">
<meta name="keywords" content="Atlas,Inneh&aring;llssystem,Jonas Eklundh">
<meta name="description" content="">
<meta name="creation-time" content="0,079s">
<meta name="kort" content="DGB">


Repeated tests a couple of seconds apart alternate between these two results, which suggests a page redirect is occurring.

Security warnings appear in Firefox and Chrome when visiting your site using these secure URLs:
https://dagbok.nu
https://www.dagbok.nu

The browser indicates the site should not be trusted because it is impersonating another site, using an invalid security certificate issued for *.loopiasecure.com.

Recommendation: Check your .htaccess file, CMS settings, page redirection, and security settings. Use the source fragments above to identify which file locations and file names are being served, and from there work out what is set incorrectly.

Once that's done, I think Facebook will be happy to then debug your webpage and provide additional recommendations.

男人必须洒脱
#6 · 2020-03-24 04:37

Your problem appears to be with your character encoding string. Your Apache server is currently sending the unsupported string latin1 in its HTTP header, while your content-type meta tag declares iso-8859-1. See the W3C validator.

From what I've seen, the Facebook parser will stop immediately if it encounters either an unrecognized character encoding string or a mismatch in character encoding strings between your header and meta tags.
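
To spot this kind of mismatch yourself, compare the charset label in the `Content-Type` response header with the one in the page's meta tag. The sketch below works on values you have already fetched; the helper names are invented for illustration, and whether Facebook's parser really stops on such a mismatch is only my observation above, not something this code can confirm.

```python
import codecs
import re

# Charset label inside a <meta ...> tag (HTML 4 or HTML5 style).
META_CHARSET = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)
# Charset label inside a Content-Type header value.
HEADER_CHARSET = re.compile(r"charset\s*=\s*[\"']?([\w.:-]+)", re.IGNORECASE)

def header_charset(content_type):
    match = HEADER_CHARSET.search(content_type or "")
    return match.group(1) if match else None

def meta_charset(html_bytes):
    match = META_CHARSET.search(html_bytes)
    return match.group(1).decode("ascii") if match else None

def compare_charsets(content_type, html_bytes):
    """Report both labels, whether they are spelled alike, and whether
    Python resolves them to the same codec."""
    header, meta = header_charset(content_type), meta_charset(html_bytes)
    same_label = None not in (header, meta) and header.lower() == meta.lower()
    try:
        same_codec = codecs.lookup(header).name == codecs.lookup(meta).name
    except (LookupError, TypeError):  # unknown or missing label
        same_codec = False
    return {"header": header, "meta": meta,
            "same_label": same_label, "same_codec": same_codec}
```

For this site, the header label `latin1` and the meta label `iso-8859-1` name the same encoding as far as Python is concerned, but the strings differ, and a strict parser that compares labels literally would treat that as a mismatch.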

The problem could be originating from either your httpd.conf or php.ini files. Change these to match your meta and restart Apache. Since the problem seems to be domain-specific, I'd check httpd.conf first.

孤傲高冷的网名
#7 · 2020-03-24 04:41

If you don't provide certain minimum Facebook markup on your page, it will respond with "Error Parsing URL: Error parsing input URL, no data was scraped." I only looked at the homepage, but it appears that dagbok.nu contains no Facebook markup. I'm not sure what things must be present at minimum, but in my implementation, I assume the fb:app_id meta tag and the JavaScript SDK script must be there. You may want to take a look at http://developers.facebook.com/docs/guides/web/#plugins , particularly the Authentication section.

I discovered your question because I had this same error today for an unknown reason. I found the cause: the content of my og:image meta tag pointed to an incorrect URL for the image I was trying to use. So as you add Facebook markup to your page, make sure your values are correct, or you may continue to receive this message.
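
A small script can list the Facebook-related meta tags on a page and flag an og:image value that is not a full http(s) URL. This is a rough sketch using only Python's standard library; the class and function names are made up for illustration, and the absolute-URL check is just one plausible sanity test, not Facebook's actual validation.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> and <meta property="fb:..."> tags."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith(("og:", "fb:")) and attr.get("content") is not None:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, attr["content"])

def open_graph_properties(html_text):
    parser = OpenGraphParser()
    parser.feed(html_text)
    return parser.properties

def is_absolute_http_url(value):
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Feed it the page source, then check for example `is_absolute_http_url(props.get("og:image", ""))`. An empty dictionary means the page carries no Open Graph or fb: markup at all.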
