What is a good complete regular expression or some other process that would take the title:
How do you change a title to be part of the URL like Stack Overflow?
and turn it into
how-do-you-change-a-title-to-be-part-of-the-url-like-stack-overflow
that is used in the SEO-friendly URLs on Stack Overflow?
The development environment I am using is Ruby on Rails, but if there are some other platform-specific solutions (.NET, PHP, Django), I would love to see those too.
I am sure I (or another reader) will come across the same problem on a different platform down the line.
I am using custom routes, and I mainly want to know how to alter the string so that all special characters are removed, it's all lowercase, and all whitespace is replaced.
Brian's code, in Ruby:
downcase turns the string to lowercase, strip removes leading and trailing whitespace, the first gsub call globally substitutes spaces with dashes, and the second removes everything that isn't a letter or a dash.
Here's how we do it. Note that there are probably more edge conditions than you realize at first glance.
This is the second version, unrolled for 5x more performance (and yes, I benchmarked it). I figured I'd optimize it because this function can be called hundreds of times per page.
To see the previous version of the code this replaced (but is functionally equivalent to, and 5x faster), view revision history of this post (click the date link).
Also, the RemapInternationalCharToAscii method source code can be found here.
Here's my (slower, but fun to write) version of Jeff's code:
My test string:
" I love C#, F#, C++, and... Crème brûlée!!! They see me codin'... they hatin'... tryin' to catch me codin' dirty... "
Here is my version of Jeff's code. I've made the following changes:
The case conversion is now also optional.
For more details, the unit tests, and an explanation of why Facebook's URL scheme is a little smarter than Stack Overflow's, I've got an expanded version of this on my blog.
If you are using Rails edge, you can rely on Inflector.parameterize - here's the example from the documentation:
Also, if you need to handle more exotic characters such as accents (éphémère) in previous versions of Rails, you can use a mixture of PermalinkFu and DiacriticsFu:
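If installing plugins is not an option, accents can also be stripped with Ruby's standard library alone. The following is only a sketch of that alternative, not the PermalinkFu or DiacriticsFu API; the helper name strip_diacritics is made up for illustration, and it assumes Ruby 2.2 or later, where String#unicode_normalize is available.

```ruby
# Hypothetical helper (not part of PermalinkFu or DiacriticsFu).
# Assumes Ruby >= 2.2 for String#unicode_normalize.
def strip_diacritics(text)
  text
    .unicode_normalize(:nfd)  # decompose "é" into "e" plus a combining accent
    .gsub(/\p{Mn}/, "")       # drop the combining marks (Unicode category Mn)
end

puts strip_diacritics("éphémère")
# => ephemere
```

Characters that have no decomposition, such as "ø" or "ß", pass through unchanged, so a small manual mapping would still be needed for those.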
You will want to set up a custom route to point the URL to the controller that will handle it. Since you are using Ruby on Rails, here is an introduction to using their routing engine.
In Ruby, as you already know, you will need a regular expression; here is the one to use:
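A minimal sketch of such an expression chain, assuming the goal stated in the question (lowercase, whitespace replaced with dashes, special characters removed). The method name slugify is hypothetical, and the exact character class is one reasonable choice rather than the only one:

```ruby
# Hypothetical helper; the name and the exact patterns are assumptions.
def slugify(title)
  title
    .downcase                 # lowercase everything
    .strip                    # drop leading and trailing whitespace
    .gsub(/\s+/, "-")         # each run of whitespace becomes one dash
    .gsub(/[^a-z0-9\-]/, "")  # remove anything that is not a letter, digit, or dash
end

puts slugify("How do you change a title to be part of the URL like Stack Overflow?")
# => how-do-you-change-a-title-to-be-part-of-the-url-like-stack-overflow
```

This simple chain leaves accented letters out of the slug entirely and can produce doubled dashes when punctuation sits between spaces, so titles with such characters need extra handling.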