Prevent image hotlinking in Google Image Search

Posted 2020-02-17 07:16

Question:

Google recently introduced a new interface for its Image Search. Since January 25, 2013, full-size images are shown directly inside Google, without sending visitors to the source site. I came across a site that has apparently developed a sophisticated approach to prevent users from grabbing images from Google by dynamically adding some sort of watermark. To see this, search the new Google Image Search interface for images from "fansshare.com". This link should work: Google Image Search. If not, simply enter "site:fansshare.com" in the Google search input field. Be sure to use the new search interface, though.

How does fansshare.com achieve this? I couldn't figure it out ...

Update:

fansshare.com adds a GET param to all of their image URLs, like ?rnd=69. Example image URL: http://fansshare.com/media/content/570_Jessica-Biel-talks-Kate-Beckinsale-Total-Recall-fight-5423.jpg?rnd=62

This image URL works for a few calls or seconds, after which a redirect takes place to a cached, watermarked image: http://fansshare.com/cached/?version=media/content/570_Jessica-Biel-talks-Kate-Beckinsale-Total-Recall-fight-5423.jpg&rnd=5810

Edit:

We have finally managed to fully mimic FansShare's hotlink protection and we've published our findings in the following, extensive blog post:

http://pixabay.com/en/blog/posts/hotlinking-protection-and-watermarking-for-google-32/

Answer 1:

There is a solution, but as with other solutions, Google may interpret it as cloaking and ban the site at will. This is a long answer and will probably need further tinkering to work for your case. (Sorry in advance for the length.)

Setup

For the sake of the example, let's just say that:

  • site: www.thesite.com and
  • ImageURL base: images.thesite.com

(but ImageURL base could just as easily be www.thesite.com/wp-content/uploads)

Target

Our target is (1) to show the full-size image only with a watermark/overlay when it is requested from Google image search, and (2) not to break anything that previously worked.

Solution

So the theoretical solution is the following.

1) Check the User-Agent and, if it contains Googlebot, serve the "trap" URL. The trap URL is your current image URL, slightly changed so that you can treat it differently. So instead of the current normal URL:

http://images.thesite.com/wallpapers/awesome.jpg

you should print for Googlebots:

http://cacheimages.thesite.com/wallpapers/awesome.jpg

(where cacheimages is anything you want)
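Step 1 could be sketched roughly like this in Python. This is only an illustration, not the tested implementation: the function name image_url_for and the two host constants are made up for the example, and the bot check is the plain User-Agent substring test described above.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical host names taken from the example setup above.
NORMAL_HOST = "images.thesite.com"
TRAP_HOST = "cacheimages.thesite.com"


def image_url_for(user_agent: str, image_url: str) -> str:
    """Return the trap URL for Googlebot and the unchanged URL for everyone else."""
    if "googlebot" not in user_agent.lower():
        return image_url
    parts = urlsplit(image_url)
    if parts.netloc != NORMAL_HOST:
        # Not one of our image URLs, so leave it alone.
        return image_url
    # Swap only the host; path and query string stay as they were.
    return urlunsplit(parts._replace(netloc=TRAP_HOST))
```

You would call this wherever your templates print image URLs, passing in the User-Agent header of the current request.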

2) Now the main dish: route requests for http://cacheimages.thesite.com/ to a script that acts as follows:

 If the request comes from a bot (check user-agent headers)
     Then serve the normal image without watermark
 Else (if the request seems to be from a normal user)
     Then check the referer: If it's from google (but NOT http://www.google.com/blank.html)
          Redirect to the Post of the image (Note 1.)
     Else if the referer is your site
          Show the raw normal image
     Else (any other referer, including http://www.google.com/blank.html)
          Show watermarked image (Note 2.)

Note 1: This happens when people click "View original image" or the image itself.

Note 2: This happens when people try to see the full-size image from the Google image search results (and if they somehow arrive at the trap URL of an image).
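The decision logic of step 2 might look something like the following Python sketch. It only decides what to do and returns a label; actually serving, redirecting or watermarking is left out. The function name decide, the label strings, the OWN_SITE_HOSTS set and the simple "bot" substring check are all assumptions made for the example.

```python
from urllib.parse import urlsplit

# Hypothetical: hosts that count as "your site" in the example setup.
OWN_SITE_HOSTS = {"www.thesite.com", "thesite.com"}


def is_google_host(host: str) -> bool:
    # Only google.com and its subdomains here; other country domains are ignored.
    return host == "google.com" or host.endswith(".google.com")


def decide(user_agent: str, referer: str) -> str:
    """Return 'raw', 'redirect_to_post' or 'watermarked' for a trap-URL request."""
    if "bot" in user_agent.lower():
        return "raw"  # bots get the normal image without watermark
    ref = urlsplit(referer or "")
    host = (ref.hostname or "").lower()
    if is_google_host(host) and ref.path != "/blank.html":
        return "redirect_to_post"  # Note 1
    if host in OWN_SITE_HOSTS:
        return "raw"
    return "watermarked"  # Note 2, includes blank.html and empty referers
```

Keep in mind that both headers are trivially spoofable, so this is a deterrent and not real protection.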

3) You could HTTP-redirect the old image URLs to the new ImageURL base when the user agent is Googlebot, so that the overlay/watermark trick starts working on old images sooner (or even use Google Webmaster Tools if you use subdomains for images) and so that you preserve the SEO juice.

Further actions

You could make further changes if you want to be serious.

  1. Instead of showing the watermarked image directly, redirect to a more dynamic URL such as http://cacheimages.thesite.com/preview?p=/wallpapers/awesome.jpg&r=23535, or use the more modern approach of an HTTP header that prevents indexing: X-Robots-Tag: noindex
  2. Of course, cache the watermarked images.
  3. Check the Accept HTTP headers for cases I haven't thought of, and serve the image or redirect to the image's post accordingly.

Note

You may also have to think about international traffic, so instead of matching google.com literally you want to check for a pattern such as google\.[a-z.-]+/
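A somewhat stricter variant of that check could look like this in Python. The pattern is my own sketch and not an exhaustive list of Google domains: it accepts google.com, google.de, google.co.uk, google.com.au and similar, with optional subdomains, but it does not validate against a real TLD list. The name is_google_referer is made up for the example.

```python
import re

# Sketch: optional subdomains, then "google.", then a 2-3 letter TLD with an
# optional 2-letter country suffix (co.uk, com.au, ...), then the first slash.
GOOGLE_REFERER = re.compile(
    r"^https?://(?:[a-z0-9-]+\.)*google\.[a-z]{2,3}(?:\.[a-z]{2})?/",
    re.IGNORECASE,
)


def is_google_referer(referer: str) -> bool:
    """True if the referer looks like it comes from a Google domain."""
    return GOOGLE_REFERER.match(referer or "") is not None
```

Anchoring at the start of the URL means a path like http://www.thesite.com/google.com/ is not mistaken for Google.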

Conclusion

This could be adapted to any system. I made it for one that serves images from a subdomain, so it probably won't work exactly the same way for other systems such as WordPress. Also, I am sure Google will change their image search in the next couple of months to fix this issue.

An untested sample implementation of the idea can be found on GitHub.

Disclaimers

This hasn't been tested thoroughly and you could get banned; it's provided merely for research and educational purposes. I cannot be held responsible for any damages, etc.



Answer 2:

A couple of new WordPress plugins are available to address image hotlinking by Google and Bing:

http://wordpress.org/extend/plugins/imaguard/

http://wordpress.org/extend/plugins/google-break-dance/



Answer 3:

Hi there, here's a new plugin that addresses this issue on WordPress:

https://github.com/mompracem/direct-images-redirect

Instead of using watermarked images, it simply redirects users who try to access an image directly to the post or page to which that image was originally attached.

It's a new plugin, so it might have some bugs. Please test it and report issues on GitHub, thank you.



Answer 4:

Hm ... sending a different image or URL to Googlebot than to regular users is not OK! Images should be silently redirected.

For WordPress blogs, I think WP-PICShield is one of the best options!

  • Caching support
  • Pass-through image requests
  • Anti-IFRAME protection
  • Custom image transparency
  • Custom PNG watermark
  • Hostname over images as URL and/or as QR barcode
  • Redirect direct links to: attachment, single/gallery, or home
  • Protection against unauthorized requests
  • Avoids memory errors for big files
  • Allows online translators
  • Allows share buttons for social sites: Facebook, Pinterest, Tumblr, Twitter, Google Plus
  • Allows WordPress via RPC and Twitter via OAuth
  • Manual clear-cache script that avoids the PHP execution time limit
  • Allowed remote IP list
  • +++ CDN tools and help

and more...



Answer 5:

I finally found a way to stop Google Image Search from hotlinking my photos without the use of a plugin. I hope this helps anyone who is still dealing with the aftermath of this completely evil decision by Google.