I typically use URL rewriting to pass content IDs to my website, so this
Foo.1.aspx
rewrites to
Foo.aspx?id=1
For a specific application I need to pass in multiple IDs to a single page, so I've rewritten things to accept this:
Foo.1,2,3,4,5.aspx
This works fine in Cassini (the built-in ad hoc web server for Visual Studio) but gives me "Internet Explorer cannot display the webpage" when I try it on a live server running IIS. Is this an IIS limitation? Should I just use dashes or underscores instead of commas?
I recall that URL routing by default first checks whether the file exists, and commas are not legal in filenames, which is perhaps why you are getting errors. IIS may have legacy code that aborts the request before it reaches ASP.NET for processing.
Scott Hanselman's blog post talks a bit about this and may be relevant for you.
As a general comment: URL rewriting is typically used to make a URL friendly and easy to remember.
~/page.aspx?id=1,2,3,4
is neither worse nor better than
~/page/1-2-3-4.aspx
Both are difficult to use, so why go through the extra effort? Avoid creating new URL forms just because you can. Users, help desk staff, and other developers will just be confused.
URL rewriting is best used to transform
~/products/view.aspx?id=1
~/products/category.aspx?type=beverage
into
~/products/view/1
~/products/category/beverage
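To make that mapping concrete, here is a minimal sketch in Python rather than in an actual IIS or ASP.NET rewrite module. The rule table and the function name `rewrite_friendly` are made up for illustration; only the four URLs come from the example above.

```python
import re

# Hypothetical rule table: (friendly-URL pattern, internal target template).
RULES = [
    (re.compile(r"^~/products/view/(\d+)$"), r"~/products/view.aspx?id=\1"),
    (re.compile(r"^~/products/category/([A-Za-z]+)$"), r"~/products/category.aspx?type=\1"),
]


def rewrite_friendly(url):
    """Return the internal URL for a friendly URL, or the URL unchanged if no rule matches."""
    for pattern, template in RULES:
        match = pattern.match(url)
        if match:
            # expand() substitutes the captured group into the template.
            return match.expand(template)
    return url
```

URLs that match no rule pass through untouched, which is usually what you want for static files.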
Commas are allowed in the filename part of a URL, but are reserved characters in the domain*, as far as I know.
What version of IE are you using? I've come across the odd report of IE5.5 truncating URLs on a comma, but I have tested URLs with commas in IE7 and they seem to be OK, so if there was an IE bug, it doesn't seem to be there any more. Could it be an IIS issue?
I'm wondering if the page error is due to a failing rewrite rule (mod_rewrite or its equivalent). Can you post the rule which is matching multiple IDs and passing them off to your Foo.aspx? Is there any chance that it's only matching Foo.N,N and failing on more commas?
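To illustrate the failure mode I have in mind, here is a small Python sketch contrasting a pattern that only accepts two IDs with one that accepts any number. Both patterns and the helper `extract_ids` are my own guesses for illustration, not the asker's actual rule.

```python
import re

# Too narrow: exactly two IDs separated by one comma (the suspected failure mode).
TWO_IDS = re.compile(r"^Foo\.(\d+,\d+)\.aspx$")

# Any number of comma-separated IDs, including a single one.
MANY_IDS = re.compile(r"^Foo\.(\d+(?:,\d+)*)\.aspx$")


def extract_ids(pattern, filename):
    """Return the integer IDs captured by pattern, or None if it does not match."""
    match = pattern.match(filename)
    if match is None:
        return None
    return [int(part) for part in match.group(1).split(",")]
```

With the narrow pattern, Foo.1,2.aspx matches but Foo.1,2,3,4,5.aspx does not, so the request would fall through to whatever the server does with an unknown file.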
* From the URI RFC:
2.2. Reserved Characters
Many URI include components consisting of or delimited by, certain
special characters. These characters are called "reserved", since
their usage within the URI component is limited to their reserved
purpose. If the data for a URI component would conflict with the
reserved purpose, then the conflicting data must be escaped before
forming the URI.
reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" |
"$" | ","
The "reserved" syntax class above refers to those characters that are
allowed within a URI, but which may not be allowed within a
particular component of the generic URI syntax
Try using %2c
in the URL to replace the commas.
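For reference, this is what the percent-encoded form looks like, sketched with Python's standard `urllib.parse`. The helper name `encode_commas` is made up, and whether IIS accepts the encoded URL is something you would still have to test.

```python
from urllib.parse import quote, unquote


def encode_commas(filename):
    """Percent-encode every character outside the unreserved set, commas included."""
    return quote(filename, safe="")


encoded = encode_commas("Foo.1,2,3,4,5.aspx")
decoded = unquote(encoded)  # round-trips back to the original filename
```

Python emits the hex digits in upper case (%2C); per the URI spec that is equivalent to %2c.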
The comma is allowed in the path, query string and fragment according to the spec. It wouldn't surprise me if IE doesn't conform to the spec, though. Try the percent-encoded form as Claudiu suggests, but I don't know why that would be necessary.
In addition to the answer by ConroyP, below is another citation, this one from RFC 1738. It notes a number of unsafe characters, but does not mention the comma (suggesting that the comma is safe):
Characters can be unsafe for a number of reasons. The space
character is unsafe because significant spaces may disappear and
insignificant spaces may be introduced when URLs are transcribed or
typeset or subjected to the treatment of word-processing programs.
The characters "<" and ">" are unsafe because they are used as the
delimiters around URLs in free text; the quote mark (""") is used to
delimit URLs in some systems. The character "#" is unsafe and should
always be encoded because it is used in World Wide Web and in other
systems to delimit a URL from a fragment/anchor identifier that might
follow it. The character "%" is unsafe because it is used for
encodings of other characters. Other characters are unsafe because
gateways and other transport agents are known to sometimes modify
such characters. These characters are "{", "}", "|", "\", "^", "~",
"[", "]", and "`".
All unsafe characters must always be encoded within a URL. For
example, the character "#" must be encoded within URLs even in
systems that do not normally deal with fragment or anchor
identifiers, so that if the URL is copied into another system that
does use them, it will not be necessary to change the URL encoding.
The right way to accept multiple ids is like this:
Foo.aspx?id=1;id=2;id=3;id=4;id=5
Note that's just what the target is. When re-writing urls, you can set your own rules to a certain extent for what you want the source to look like.
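As a rough sketch of how a target in that form could be read back into a list of IDs, here is a hand-rolled parser in Python. The function `parse_repeated` and its parameters are hypothetical; it does not show how ASP.NET itself handles repeated keys or semicolon separators, which you should check separately.

```python
from urllib.parse import unquote_plus


def parse_repeated(query, key="id", separator=";"):
    """Collect every integer value of a repeated key from a query string."""
    values = []
    for pair in query.split(separator):
        name, _, value = pair.partition("=")
        # Skip other keys and empty values; int() raises ValueError on non-numeric input.
        if unquote_plus(name) == key and value:
            values.append(int(unquote_plus(value)))
    return values
```

Passing separator="&" handles the more common id=1&id=2 form the same way.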
I had to learn this on StackOverflow, too. See this question:
Split out ints from string
Answer
The problem was the commas. I'm guessing that IIS was having an issue with them (not IE), since IE was able to display the page fine on localhost.
At any rate I just changed the URL format to this and it works fine:
Foo.1-2-3-4-5.aspx
If you put a front controller in place, you could do something like:
index.aspx?c=Foo/1/2/3/4
The Front Controller would pick up the method name and the parameters to pass to it. This is a pretty common technique nowadays.
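A bare-bones sketch of the idea, in Python for brevity. The handler `foo`, the `HANDLERS` table and `dispatch` are all hypothetical names; a real front controller in ASP.NET would look different, but the dispatch step is the same in spirit.

```python
def foo(*ids):
    """Hypothetical handler standing in for the Foo page."""
    return "Foo got ids " + ", ".join(str(i) for i in ids)


# Only handlers registered here can be reached from the URL.
HANDLERS = {"Foo": foo}


def dispatch(c):
    """Split 'Name/arg1/arg2/...' and call the registered handler with integer arguments."""
    name, *raw_args = c.strip("/").split("/")
    handler = HANDLERS.get(name)
    if handler is None:
        raise LookupError(f"no handler registered for {name!r}")
    return handler(*(int(arg) for arg in raw_args))
```

Looking handlers up in an explicit table, rather than calling whatever name appears in the URL, keeps visitors from invoking arbitrary methods.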