NSAttributedString and emojis: issue with position

Posted 2019-03-11 08:26

I'm coloring some parts of a text coming from an API (think "@mention" as on Twitter) using NSAttributedString.

The API gives me the text and an array of entities representing the parts of the text that are mentions (or links, tags, etc) which should be colored.

But sometimes, the coloration is offset because of emojis.


For example, with this text:

"@ericd Some text. @apero"

the API gives:

[ { "text" : "ericd", "len" : 6, "pos" : 0 }, { "text" : "apero", "len" : 6, "pos" : 18 } ]

which I successfully translate to an NSAttributedString using NSRange:

for m in entities.mentions {
    let r = NSMakeRange(m.pos, m.len)
    myAttributedString.addAttribute(NSForegroundColorAttributeName, value: someValue, range: r)
}

We see that "pos": 18 is correct: this is where "@apero" starts. The colored parts are "@ericd" and "@apero", as expected.
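As a quick sanity check of the emoji-free case, the sketch below decodes the entity array and slices the text with the raw pos/len values. It is written against current Swift (5.x) rather than the Swift 2 syntax used in the question, and the Mention struct is a hypothetical model whose field names simply mirror the JSON above.

```swift
import Foundation

// Hypothetical model for one entity; field names mirror the JSON in the question.
struct Mention: Decodable {
    let text: String
    let len: Int
    let pos: Int
}

let json = """
[ { "text" : "ericd", "len" : 6, "pos" : 0 }, { "text" : "apero", "len" : 6, "pos" : 18 } ]
"""
let mentions = try! JSONDecoder().decode([Mention].self, from: Data(json.utf8))

let text = "@ericd Some text. @apero"
let ns = NSString(string: text)

// With ASCII-only text, every view of the string uses the same offsets,
// so pos/len can be used directly as an NSRange.
let slices = mentions.map { ns.substring(with: NSRange(location: $0.pos, length: $0.len)) }
print(slices) // ["@ericd", "@apero"]
```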

But when certain combinations of emojis appear in the text, the positions given by the API no longer map correctly onto the NSAttributedString, and the coloration is offset:

"@ericd Some text.

1 answer
再贱就再见
Reply #2 · 2019-03-11 09:12

A Swift String provides different "views" on its contents. A good overview is given in "Strings in Swift 2" in the Swift Blog:

  • characters is a collection of Character values, or extended grapheme clusters.
  • unicodeScalars is a collection of Unicode scalar values.
  • utf8 is a collection of UTF-8 code units.
  • utf16 is a collection of UTF-16 code units.
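To make the difference between the four views concrete, here is a small sketch in current Swift (5.x), where the characters view is the String itself. The sample string is an assumption chosen for illustration: the letter "a", one emoji (U+1F600), and a flag built from two regional indicator scalars.

```swift
import Foundation

// "a" + U+1F600 (one emoji) + U+1F1EB U+1F1F7 (a flag made of two scalars).
// Unicode escapes are used so the literal survives copy and paste.
let sample = "a\u{1F600}\u{1F1EB}\u{1F1F7}"

print(sample.count)                // 3  Characters (extended grapheme clusters)
print(sample.unicodeScalars.count) // 4  Unicode scalars
print(sample.utf8.count)           // 13 UTF-8 code units (1 + 4 + 4 + 4)
print(sample.utf16.count)          // 7  UTF-16 code units (1 + 2 + 2 + 2)
```

The same text therefore has four different lengths, and an offset is only meaningful together with the view it was measured in.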

As it turned out in the discussion, pos and len from your API are indices into the Unicode scalars view.

On the other hand, the addAttribute() method of NSMutableAttributedString takes an NSRange, i.e. a range of indices into the UTF-16 code units of an NSString.
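The mismatch can be reproduced in a few lines. This sketch (current Swift, with a made-up sample text) passes a Unicode scalar offset straight to NSRange: every emoji outside the Basic Multilingual Plane is one scalar but two UTF-16 code units, so the range starts one unit too early for each such emoji before it.

```swift
import Foundation

// U+1F600 is a single Unicode scalar but a surrogate pair in UTF-16.
let text = "\u{1F600} @apero"
let ns = NSString(string: text)

print(text.unicodeScalars.count) // 8
print(ns.length)                 // 9 (UTF-16 code units)

// "@" is the third scalar (offset 2), but it sits at UTF-16 offset 3.
let scalarOffset = 2
let wrong = ns.substring(with: NSRange(location: scalarOffset, length: 6))
print(wrong) // " @aper" (shifted by one code unit)
```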

String provides methods to "translate" between indices of the different views (compare NSRange to Range<String.Index>):

let text = "@ericd Some text.                                                                     
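The following is a minimal sketch of such a translation in current Swift (5.x). It is not the code from the original answer. It assumes, as stated above, that pos and len count Unicode scalars. The helper name nsRange(in:scalarPos:scalarLen:) and the sample text are hypothetical.

```swift
import Foundation

/// Converts a range expressed in Unicode scalar offsets into an NSRange
/// (UTF-16 offsets). Returns nil if the scalar range is out of bounds.
func nsRange(in text: String, scalarPos: Int, scalarLen: Int) -> NSRange? {
    let scalars = text.unicodeScalars
    guard scalarPos >= 0, scalarLen >= 0,
          let start = scalars.index(scalars.startIndex, offsetBy: scalarPos,
                                    limitedBy: scalars.endIndex),
          let end = scalars.index(start, offsetBy: scalarLen,
                                  limitedBy: scalars.endIndex)
    else { return nil }
    // NSRange(_:in:) measures the String.Index range in UTF-16 code units.
    return NSRange(start..<end, in: text)
}

let sampleText = "\u{1F600} @apero"
let converted = nsRange(in: sampleText, scalarPos: 2, scalarLen: 6)
if let r = converted {
    print(r.location, r.length)                            // 3 6
    print(NSString(string: sampleText).substring(with: r)) // @apero
}
```

The resulting NSRange can then be passed to addAttribute() in place of NSMakeRange(m.pos, m.len). Returning nil for out-of-bounds input is a design choice of this sketch, so that a malformed entity is skipped instead of crashing.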