Use .htaccess to fix misspelled URLs

Posted 2019-07-05 07:11

So I have a pretty simple problem (at least I think so) with my website. I need to be able to redirect any misspelled URLs to the correct ones. It's easier to show you an example than to describe it.

For example, let's take this URL:

http://www.tomshardware.com/reviews/radeon-r9-290x-hawaii-review,3650.html

That URL takes you to the correct page for the article regardless of how the URL is spelled. Say you accidentally insert a letter, a number or a word into it, so that it looks something like this:

http://www.tomshardware.com/reviews/radeon-r9-290x-TEST-TEST-hawaii-review,3650.html

That URL still takes you to the correct article and then corrects itself in the address bar. You can add anything to it and still land on the right article, regardless of what you accidentally type.

So my question is: how do I do this in .htaccess? This is my current .htaccess file:

# Secure htaccess file
<files .htaccess>
order allow,deny
deny from all
</files>

AddHandler application/x-httpd-php5 .html .htm
AddType application/x-httpd-php .html .htm .php
AddHandler cgi-script .pl .cgi
Options ALL -Indexes -Multiviews +ExecCGI +FollowSymLinks

# Do not remove this line, otherwise mod_rewrite rules will stop working
RewriteBase /

RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.html -f
RewriteRule ^(.*)$ $1.html

#Redirect Non-WWW to WWW
RewriteCond %{HTTP_HOST} !^www\.
RewriteRule ^(.*)$ http://www.%{HTTP_HOST}/$1 [R=301,L]

RewriteCond %{REQUEST_URI} /index\.html?$ [NC]
RewriteRule ^(.*)index\.html?$ "/$1" [NC,R=301,NE,L]

2 answers
一纸荒年 Trace。
Reply #2 · 2019-07-05 07:31

You probably can't do it that way.

As you can observe, the text in the URL is irrelevant to finding the article and is only there to make the URL readable and index-friendly (SEO). Those words are called "slugs"; see http://en.wikipedia.org/wiki/Clean_URL#Slug. If you modify the last part, the 3650, the URL will break, because that is the only real identifier and it typically corresponds to a unique ID in the database.

An assumption about how and why the site you mention does this: it uses either a standalone routing component (e.g. Routing from the Symfony PHP framework: http://symfony.com/components/Routing), an entire web framework, or code written by hand. Depending on the language, that might be Zend or Symfony for PHP, ASP.NET MVC, or something else.

In all cases the URL is filtered in some way before the content is served. The router parses the URL, extracts the unique ID, fetches the data set and generates the absolute URL again from it. It then compares the freshly generated URL with the one you entered. If they don't match, the framework issues a 30x HTTP status and redirects you to the generated URL.

The purpose is to keep links sane when the slugs, or the SEO-friendly URL layout, have changed for whatever reason. The redirect ensures that old URLs are replaced the next time a search engine visits the page and updates its index. Imagine you have a typo somewhere in the slug, or you forgot to mention Radeon, and you don't want it to stay broken or wrong in the DB forever. You need to fix it, but at the same time you want to avoid breaking the old URLs for search indexes that have not yet revisited your site and for users who have bookmarked them.

After the redirect the URLs are compared again, and since they now match, the content is served.
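
The check described above can be sketched in PHP roughly as follows. This is a minimal illustration, not what the site in the question actually runs: the function name canonicalRedirectTarget, the $articles array standing in for a database lookup, and the /reviews/<slug>,<id>.html layout (taken from the example URL) are all assumptions.

```php
<?php
// Hypothetical helper. Returns the canonical path to redirect to,
// null if the requested path is already canonical, or false if no
// article matches (the caller would then send a 404).
function canonicalRedirectTarget($requestPath, array $articles)
{
    // Only the numeric ID after the last comma identifies the article;
    // whatever slug precedes it is ignored.
    if (!preg_match('#^/reviews/[^/]*,(\d+)\.html$#', $requestPath, $m)) {
        return false;
    }
    $id = (int) $m[1];
    if (!isset($articles[$id])) {
        return false;
    }
    // Rebuild the URL from the stored slug and compare it with the request.
    $canonical = '/reviews/' . $articles[$id] . ',' . $id . '.html';
    return $canonical === $requestPath ? null : $canonical;
}

// Stand-in for a database table mapping ID => slug (assumed data).
$articles = array(3650 => 'radeon-r9-290x-hawaii-review');

// In a real handler the path would come from the request, e.g.
// parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH).
$target = canonicalRedirectTarget(
    '/reviews/radeon-r9-290x-TEST-TEST-hawaii-review,3650.html',
    $articles
);
if (is_string($target)) {
    // In a real handler: header('Location: ' . $target, true, 301); exit;
    echo $target, "\n";
}
```

Run from the command line, this prints the corrected path for the misspelled example URL. A request that already uses the canonical path yields null, so the second pass after the redirect serves the content instead of redirecting again.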

A DB lookup is very likely involved here, and you cannot do this properly with .htaccess alone, because .htaccess has no way of knowing which URL is the correct one.

啃猪蹄的小仙女
Reply #3 · 2019-07-05 07:41

You would internally rewrite all article pages to a PHP script, which then matches the parameters against the best possible page to show.

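-- .htaccess --
RewriteEngine on
# Internally rewrite /article/<anything>.html to article.php (no external redirect)
RewriteRule ^article/(.*)\.html$     /article.php?url=$1     [L]

-- php --
<?php
// Read the article selection criteria passed in by the rewrite rule
$article_url = isset($_GET['url']) ? $_GET['url'] : '';
// Search through the database or files and show the article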
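
One way article.php might pick the "best possible page" is to compare the requested slug with the known ones and take the closest. The sketch below uses PHP's built-in levenshtein() for that. The function name closestSlug, the $knownSlugs list (standing in for a database or file lookup) and the distance cutoff of 10 are assumptions for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper. Returns the known slug with the smallest edit
// distance to the requested one, or null if nothing is within $maxDistance.
function closestSlug($requested, array $knownSlugs, $maxDistance = 10)
{
    $best = null;
    $bestDistance = $maxDistance + 1;
    foreach ($knownSlugs as $slug) {
        // levenshtein() counts single-character inserts, deletes and replacements.
        $distance = levenshtein(strtolower($requested), strtolower($slug));
        // Before PHP 8.0, levenshtein() returns -1 for strings over 255 characters.
        if ($distance >= 0 && $distance < $bestDistance) {
            $best = $slug;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// Assumed sample data; a real site would load these from its database.
$knownSlugs = array('radeon-r9-290x-hawaii-review', 'geforce-gtx-780-ti-review');

// In article.php this value would be $_GET['url'], as set by the rewrite rule.
$match = closestSlug('radeon-r9-290x-TEST-TEST-hawaii-review', $knownSlugs);
echo $match === null ? "no match\n" : $match . "\n";
```

Comparing against every slug on each request gets slow as the number of articles grows, so this only suits small sites. Matching on a unique ID in the URL, as the other answer describes, avoids the problem.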