How to get Uri.EscapeDataString to comply with RFC 3986

Posted 2019-01-08 17:00

Question:

The Uri class defaults to RFC 2396. For OpenID and OAuth, I need Uri escaping consistent with RFC 3986.

From the System.Uri class documentation:

By default, any reserved characters in the URI are escaped in accordance with RFC 2396. This behavior changes if International Resource Identifiers or International Domain Name parsing is enabled in which case reserved characters in the URI are escaped in accordance with RFC 3986 and RFC 3987.

The documentation also states that enabling this IRI mode, and with it the RFC 3986 behavior, means adding a uri section element to machine.config and adding the following to your app.config or web.config file:

<configuration>
  <uri>
    <idn enabled="All" />
    <iriParsing enabled="true" />
  </uri>
</configuration>

But whether this is present in the .config file or not, I'm getting the same (non-3986) escaping behavior for a .NET 3.5 SP1 app. What else do I need to do to get Uri.EscapeDataString to use the RFC 3986 rules? (specifically, to escape the reserved characters as defined in that RFC)

Answer 1:

Having not been able to get Uri.EscapeDataString to take on RFC 3986 behavior, I wrote my own RFC 3986 compliant escaping method. It leverages Uri.EscapeDataString, and then 'upgrades' the escaping to RFC 3986 compliance.

/// <summary>
/// The set of characters that are unreserved in RFC 2396 but are NOT unreserved in RFC 3986.
/// </summary>
private static readonly string[] UriRfc3986CharsToEscape = new[] { "!", "*", "'", "(", ")" };

/// <summary>
/// Escapes a string according to the URI data string rules given in RFC 3986.
/// </summary>
/// <param name="value">The value to escape.</param>
/// <returns>The escaped value.</returns>
/// <remarks>
/// The <see cref="Uri.EscapeDataString"/> method is <i>supposed</i> to take on
/// RFC 3986 behavior if certain elements are present in a .config file.  Even if this
/// actually worked (which in my experiments it <i>doesn't</i>), we can't rely on every
/// host actually having this configuration element present.
/// </remarks>
internal static string EscapeUriDataStringRfc3986(string value) {
    // Start with RFC 2396 escaping by calling the .NET method to do the work.
    // This MAY sometimes exhibit RFC 3986 behavior (according to the documentation).
    // If it does, the escaping we do that follows it will be a no-op since the
    // characters we search for to replace can't possibly exist in the string.
    StringBuilder escaped = new StringBuilder(Uri.EscapeDataString(value));

    // Upgrade the escaping to RFC 3986, if necessary.
    for (int i = 0; i < UriRfc3986CharsToEscape.Length; i++) {
        escaped.Replace(UriRfc3986CharsToEscape[i], Uri.HexEscape(UriRfc3986CharsToEscape[i][0]));
    }

    // Return the fully-RFC3986-escaped string.
    return escaped.ToString();
}


Answer 2:

This has actually been fixed in .NET 4.5 to work by default, see here.
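To see which behavior a particular runtime exhibits, a small check along these lines may help. This is only a sketch: the class name EscapeCheck is made up for illustration, and the result depends on the framework version and configuration the program actually runs under.

```csharp
using System;

class EscapeCheck
{
    static void Main()
    {
        // The five characters that RFC 2396 leaves unescaped as "marks"
        // but that RFC 3986 treats as reserved sub-delimiters.
        const string input = "!*'()";

        string escaped = Uri.EscapeDataString(input);
        Console.WriteLine(escaped);

        // Under RFC 3986 rules every one of these characters is percent-encoded.
        bool rfc3986 = escaped == "%21%2A%27%28%29";
        Console.WriteLine(rfc3986
            ? "RFC 3986 escaping"
            : "RFC 2396 escaping (some characters left unescaped)");
    }
}
```

If the second line reports RFC 2396 escaping, a post-processing step such as the one in Answer 1 is still needed on that runtime.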

I just created a new library called PUrify (after running into this issue), which handles getting this to work for .NET versions before 4.5 (it works on 3.5) and for Mono, through a variation of the approach in this post. PUrify doesn't change EscapeDataString, but it does let you have Uris with reserved chars which will not be escaped.



Answer 3:

What version of the framework are you using? It looks like a lot of these changes were made in the (from MSDN) ".NET Framework 3.5, 3.0 SP1, and 2.0 SP1" timeframe.



Answer 4:

I could not find a better answer (either 100% framework or 100% reimplementation), so I've created this abomination. Seems to be working with OAuth.

using System.Text;

class al_RFC3986
{
    public static string Encode(string s)
    {
        StringBuilder sb = new StringBuilder(s.Length * 2); // VERY rough estimate
        byte[] arr = Encoding.UTF8.GetBytes(s);

        for (int i = 0; i < arr.Length; i++)
        {
            byte c = arr[i];

            if (c >= 0x41 && c <= 0x5A) // A-Z
                sb.Append((char)c);
            else if (c >= 0x61 && c <= 0x7A) // a-z
                sb.Append((char)c);
            else if (c >= 0x30 && c <= 0x39) // 0-9
                sb.Append((char)c);
            else if (c == '-' || c == '.' || c == '_' || c == '~') // other unreserved characters
                sb.Append((char)c);
            else
            {
                // Percent-encode as exactly two uppercase hex digits (e.g. 0x0A -> %0A).
                sb.Append('%');
                sb.Append(c.ToString("X2"));
            }
        }
        return sb.ToString();
    }
}
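One detail worth checking in any hand-rolled encoder of this kind is that each escaped byte is written as exactly two hex digits. The sketch below (the class name HexPadDemo is made up for illustration) compares two ways of formatting a byte below 0x10, here a line feed:

```csharp
using System;

class HexPadDemo
{
    static void Main()
    {
        byte lineFeed = 0x0A;

        // Convert.ToString(value, 16) does not pad, so this prints "%A",
        // which is not a valid percent-encoded octet.
        Console.WriteLine("%" + Convert.ToString(lineFeed, 16).ToUpper());

        // The "X2" format specifier always yields two uppercase hex digits: "%0A".
        Console.WriteLine("%" + lineFeed.ToString("X2"));
    }
}
```

This matters for OAuth in particular, because the signature base string is built from the encoded parameters, so any deviation in the encoding is likely to produce a signature mismatch.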


Answer 5:

I realize this question and its answers are a few years old, but I thought I would share what I found when I had trouble getting compliance under .NET 4.5.

If your code runs under ASP.NET, simply setting the project to target 4.5 and running on a machine with 4.5 or later is not enough; you may still get 4.0 behavior. You need to ensure that <httpRuntime targetFramework="4.5" /> is set in the web.config.

From this blog article on MSDN:

If there is no <httpRuntime targetFramework> attribute present in Web.config, we assume that the application wanted 4.0 quirks behavior.
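For reference, a minimal sketch of where that element sits in Web.config. The surrounding elements follow the usual ASP.NET layout, and a real file will normally contain other settings as well:

```xml
<configuration>
  <system.web>
    <!-- Opt in to 4.5 behavior instead of the 4.0 quirks mode -->
    <httpRuntime targetFramework="4.5" />
  </system.web>
</configuration>
```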