I'm trying to do a simple regex match using NSRegularExpression, but I'm having some problems matching the string when the source contains multibyte characters:
let string = "D 9"
// The following matches (any characters)(SPACE)(numbers)(any characters)
let pattern = "([\\s\\S]*) ([0-9]*)(.*)"
let slen : Int = string.lengthOfBytesUsingEncoding(NSUTF8StringEncoding)
var error: NSError? = nil
var regex = NSRegularExpression(pattern: pattern, options: NSRegularExpressionOptions.DotMatchesLineSeparators, error: &error)
var result = regex?.stringByReplacingMatchesInString(string, options: nil, range: NSRange(location:0,
length:slen), withTemplate: "First \"$1\" Second: \"$2\"")
The code above produces a result containing "D" and "9", as expected.
If I now change the first line to include a UK 'Pound' currency symbol as follows:
let string = "£ 9"
Then the match doesn't work, even though the ([\\s\\S]*) part of the expression should still match any leading characters.
I understand that the £ symbol takes two bytes in UTF-8, but the leading wildcard match should ignore that, shouldn't it?
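To make the byte counts concrete, here is a small check of the lengths involved. It is only a sketch: it is written in current Swift syntax rather than the older syntax used in my code above, and the constant names are just illustrative. It compares the UTF-8 byte count (what I pass as the range length) with the UTF-16 code unit count, which as far as I know is the unit an NSRange is measured in.

```swift
// Compare the two ways of measuring each test string.
let plain = "D 9"
let pound = "£ 9"

// UTF-8 byte counts: the value my code passes as the NSRange length.
let plainBytes = plain.utf8.count
let poundBytes = pound.utf8.count

// UTF-16 code unit counts: the unit NSString and NSRange use.
let plainUnits = plain.utf16.count
let poundUnits = pound.utf16.count

print("plain: \(plainBytes) bytes, \(plainUnits) UTF-16 units")
print("pound: \(poundBytes) bytes, \(poundUnits) UTF-16 units")
```

For the first string both counts agree, but for the second the byte count is one larger than the UTF-16 count, so the range I build may extend past the end of the string.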
Can anyone explain what is going on here please?