Crawlable AJAX with _escaped_fragment_ in htaccess

Posted 2019-02-10 03:33

Question:

Hello fellow developers!

We are almost finished developing the first phase of our AJAX web app. In our app we use hash fragments like:

http://ourdomain.com/#!list=last_ads&order=date

I understand Google will fetch this URL and make a request to the server in this form:

http://ourdomain.com/?_escaped_fragment_=list=last_ads?order=date&direction=desc

Everything is perfect, except that I would like to route this kind of request to another script, like so:

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /webroot/crawler.php$1 [L]

The problem is that when I call print_r($_REQUEST) in crawler.php, I get only:

Array
(
    [_escaped_fragment_] => list=last_ads?order=date
    [direction] => desc
)

What I'd like to get is:

Array
(
    [list] => last_ads
    [order] => date
    [direction] => desc
)

I know I could use PHP to break up the first argument further, but I'd rather not ;)

Please advise.

EDIT: some corrections in text and logic.

Answer 1:

You forgot the QSA flag (everyone missed the point =D)

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /webroot/crawler.php$1 [QSA,L]

By the way, your $1 is, well, useless: the pattern ^$ has no capture group, so $1 refers to nothing. The rule should therefore be:

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /webroot/crawler.php [QSA,L]

Tell me if this works.



Answer 2:

If I'm not mistaken:

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /webroot/crawler.php?%1 [L]
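
To see what crawler.php would receive under this rule, PHP's query-string parsing can be simulated with parse_str. This is only a sketch: it assumes the crawler sends the query string exactly as written in the question, and the variable names are illustrative.

```php
<?php
// Query string as shown in the question (assumed to arrive unescaped).
$original = '_escaped_fragment_=list=last_ads?order=date&direction=desc';

// What the RewriteCond captures as %1: everything after "_escaped_fragment_=".
preg_match('/^_escaped_fragment_=(.*)$/', $original, $m);
$rewritten = $m[1];

// parse_str() splits on "&" and on the first "=" of each pair,
// which is how PHP also fills $_GET from the query string.
parse_str($rewritten, $params);
print_r($params);
// [list] => last_ads?order=date, [direction] => desc
```

The "?" inside the fragment is not treated as a separator, so order stays glued to the value of list; only the "&" splits parameters.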


Answer 3:

Maybe this is obvious to you, but the documentation talks about escaped characters (under "Set up your server to handle requests for URLs that contain"):

The crawler escapes certain characters in the fragment during the transformation. To retrieve the original fragment, make sure to unescape all %XX characters in the fragment. More specifically, %26 should become &, %20 should become a space, %23 should become #, and %25 should become %, and so on.
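
In PHP, the unescaping described above could look roughly like the following. It is a sketch that assumes the fragment reaches the script still percent-encoded (for example, when it was captured from the raw query string); parse_escaped_fragment is a made-up name, not part of any library.

```php
<?php
// Hypothetical helper: turn a still-escaped fragment into an array of parameters.
function parse_escaped_fragment(string $escaped): array
{
    // rawurldecode() reverses every %XX escape and leaves "+" untouched.
    $fragment = rawurldecode($escaped);

    // Split by hand so values are not URL-decoded a second time.
    $params = [];
    foreach (explode('&', $fragment) as $pair) {
        if ($pair === '') {
            continue;
        }
        $parts = explode('=', $pair, 2);
        $params[$parts[0]] = $parts[1] ?? '';
    }
    return $params;
}

print_r(parse_escaped_fragment('list=last_ads%26order=date%26direction=desc'));
// [list] => last_ads, [order] => date, [direction] => desc
```

Splitting manually instead of calling parse_str() on the decoded string is a deliberate choice here: parse_str() decodes keys and values again, which would mangle a fragment that contains a literal "%".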



Answer 4:

Here is a solution that provides a routable URL and query parameters correctly set for processing in the server side script.

Example: If you want http://yoursite.com/#!/product/20 to become http://yoursite.com/crawler/product/20

First, in .htaccess:

RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
RewriteRule ^$ /crawler/index.php?_frag=%1  [L]

We need to get rid of _escaped_fragment_ in the URL and replace it with a different name, for example _frag, so that the (Apache) web server does not get into circular rewrites.

Second, in crawler/index.php:

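<?php

if (array_key_exists('_frag', $_REQUEST)) {
    // Original hash fragment, e.g. "/product/20" or "/product/20?x=1"
    $frag = $_REQUEST['_frag'];

    // Let the usual router see the fragment as the request URI
    $_SERVER['REQUEST_URI'] = $frag;

    // Rebuild $_REQUEST from the fragment's own query part, if it has one
    $params = array();
    $pos = strpos($frag, '?');
    if ($pos !== false) {
        parse_str(substr($frag, $pos + 1), $params);
    }
    $_REQUEST = $params;
    $_SERVER['QUERY_STRING'] = http_build_query($params);
}

// Continue with your usual script of routing
// $_REQUEST now contains the original query parameters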
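
The fragment-splitting step can also be pulled into a pure function, which makes it easy to try outside a web request. This is a sketch only: split_fragment is a hypothetical name, and it assumes the fragment has the path-then-"?"-then-parameters shape of the example.

```php
<?php
// Hypothetical helper: split "/product/20?color=red" into a route and its parameters.
function split_fragment(string $frag): array
{
    $pos = strpos($frag, '?');
    if ($pos === false) {
        // No query part: the whole fragment is the route.
        return ['route' => $frag, 'params' => []];
    }
    parse_str(substr($frag, $pos + 1), $params);
    return ['route' => substr($frag, 0, $pos), 'params' => $params];
}

$result = split_fragment('/product/20?color=red&size=m');
echo $result['route'], "\n"; // /product/20
print_r($result['params']);  // [color] => red, [size] => m
```

A fragment without a "?" such as /product/20 yields an empty parameter list rather than a stray key.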


Answer 5:

With a virtual host, the rules in .htaccess did not work for me, so I added them inside a "Directory" block instead:

<Directory "X:/DIR">
    RewriteEngine On
    RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
    RewriteRule ^$ /crawler/index.php?_frag=%1  [L]
</Directory>