Is `addingPercentEncoding` broken in Xcode 9?

Posted 2019-04-09 07:07

Question:

In Swift 3.x with Xcode 9 beta 2, addingPercentEncoding gives unexpected results. CharacterSet.urlPathAllowed always contains ":", so by the definition of addingPercentEncoding, the ":" should never be escaped. Yet, with this code:

// always true
print(CharacterSet.urlPathAllowed.contains(":"))
let myString = "info:hello world"
let escapedString = myString.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed)!
print(escapedString)

I get these results:

Cases where I get undesirable behavior:

  • Xcode 9 beta 2, iOS 9.3
  • Xcode 9 beta 2, iOS 11.0

    true
    info%3Ahello%20world

Cases where I get the expected behavior:

  • Xcode 9 beta 2, iOS 10.3.1
  • Xcode 8.3.3, any iOS

    true
    info:hello%20world

Is there any workaround to get a working implementation of addingPercentEncoding that will correctly respect the given allowedCharacters?

Answer 1:

Apparently addingPercentEncoding does some undocumented magic when the CharacterSet passed to it is backed by an NSCharacterSet class.

So to work around this magic, you need to make your CharacterSet a pure Swift object. To do so, I'll create a copy (thanks Martin R!), so that the evil magic is gone:

let myString = "info:hello world"
let csCopy = CharacterSet(bitmapRepresentation: CharacterSet.urlPathAllowed.bitmapRepresentation)
let escapedString = myString.addingPercentEncoding(withAllowedCharacters: csCopy)!
// always "info:hello%20world"
print(escapedString)

As an extension:

extension String {
    func safeAddingPercentEncoding(withAllowedCharacters allowedCharacters: CharacterSet) -> String? {
        // use a copy to work around the magic: https://stackoverflow.com/q/44754996/1033581
        let allowedCharacters = CharacterSet(bitmapRepresentation: allowedCharacters.bitmapRepresentation)
        return addingPercentEncoding(withAllowedCharacters: allowedCharacters)
    }
}
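
One way to gain some confidence in the copy is to check that it allows the same characters as the original set. The snippet below is only a sketch: the names `original`, `copy`, `asciiScalars` and `mismatches` are illustrative, and it compares ASCII scalars only, on the assumption that .urlPathAllowed contains nothing outside ASCII.

```swift
import Foundation

let original = CharacterSet.urlPathAllowed
// Same copy technique as in the answer above.
let copy = CharacterSet(bitmapRepresentation: original.bitmapRepresentation)

// Every ASCII scalar (0...127); assumption: this range is enough for this set.
let asciiScalars = (0..<128).map { Unicode.Scalar(UInt8($0)) }

// Collect scalars on which the two sets disagree about membership.
let mismatches = asciiScalars.filter { original.contains($0) != copy.contains($0) }

print("copy matches original on ASCII:", mismatches.isEmpty)
```

If `mismatches` is empty, the copy should differ from the original only in how it is backed, not in which characters it allows.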


Answer 2:

It percent-escapes the : character, even though : is in the .urlPathAllowed character set, because it appears to be strictly enforcing section 3.3 of RFC 3986. That section says that : is permitted in relative paths (which is what we're dealing with here), but not in the first component.

Consider:

let string = "foo:bar/baz:qux"
print(string.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed)!)

That will, in conformance with RFC 3986, percent-encode the : in the first component, but leave it unencoded in subsequent components:

foo%3Abar/baz:qux
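
The first-component rule described here can be sketched in plain Swift. This is an illustration of the rule only, not Foundation's actual implementation; the function name `encodingColonsInFirstSegment` is hypothetical, and it handles only the colon, not the other characters that would be percent-encoded.

```swift
import Foundation

// Hypothetical helper: percent-encodes ":" only in the first path segment,
// mirroring the first-component rule described for relative paths.
// It does not encode any other character.
func encodingColonsInFirstSegment(_ path: String) -> String {
    // Split at the first "/" only; keep empty pieces so a leading "/" survives.
    let parts = path.split(separator: "/", maxSplits: 1, omittingEmptySubsequences: false)
    guard let first = parts.first else { return path }
    let encodedFirst = first.replacingOccurrences(of: ":", with: "%3A")
    if parts.count == 2 {
        // Everything after the first "/" is left untouched.
        return "\(encodedFirst)/\(parts[1])"
    }
    return encodedFirst
}

print(encodingColonsInFirstSegment("foo:bar/baz:qux"))
```

In this sketch a path that starts with "/" has an empty first segment, so none of its colons are touched.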

The method name and the documentation would lead one to conclude that it percent-encodes solely on the basis of which characters are allowed, but it looks like it actually checks whether the set is .urlPathAllowed and, if so, applies RFC 3986's relative-path logic.

As Cœur said, you can bypass this by building your own character set with the same allowed characters as .urlPathAllowed; the method then doesn't apply any of this logic.
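
Besides copying the bitmap, a set can also be built by hand. The sketch below assumes the RFC 3986 path characters (unreserved, sub-delims, ":", "@") plus "/"; that list is written out manually and is not a verified copy of .urlPathAllowed, so the two may differ on some characters. The names `unreserved`, `subDelims` and `customPathAllowed` are illustrative.

```swift
import Foundation

// Assumption: hand-written list based on RFC 3986 path characters plus "/".
// It may not match .urlPathAllowed exactly.
let unreserved = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
let subDelims = "!$&'()*+,;="
let customPathAllowed = CharacterSet(charactersIn: unreserved + subDelims + ":@/")

// With a custom set, only characters outside the set should be escaped.
let escaped = "info:hello world".addingPercentEncoding(withAllowedCharacters: customPathAllowed)
print(escaped ?? "nil")
```

Copying the bitmap, as in the first answer, avoids having to maintain such a list by hand.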