Tk text widget index expressions and Unicode

Posted 2019-07-28 00:16

(this question is based on that)

Let us consider the following code:

package require Tk 8.6

pack [text .t]
.t insert end "abcdefgh\nабвгґдеє\n一伊依医咿噫欹泆"

puts "[.t index 1.4+1l] [.t index 1.4+2l]"
puts "[.t index 3.4-1l] [.t index 3.4-2l]"

exit 0

Output:

2.2 3.2
2.6 1.8

I would expect +1l and -1l to preserve the column when the target line is long enough, that is, to print 2.4 3.4 and 2.4 1.4. Instead, the result seems to depend on the number of bytes needed to encode each character.

Should it be this way? Is it documented somewhere?

Tags: unicode tcl tk
1 answer
我命由我不由天
Reply #2 · 2019-07-28 00:57

What font are you using? What exact patch-version of Tk are you using? (It should be reported by doing puts [package require Tk].)
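A minimal sketch of how both pieces of information could be collected in one go. It assumes a text widget named .t, as in the question; tk_patchLevel is the global variable Tk sets to its full patch level, and font actual reports what the configured font resolves to on the current system.

```tcl
package require Tk 8.6

pack [text .t]

# Full patch level of Tk (more precise than the bare "8.6").
puts "Tk patch level: $tk_patchLevel"

# The -font option as configured on the widget...
puts "Font option:    [.t cget -font]"

# ...and the concrete attributes that font resolves to on this system.
puts "Resolved font:  [font actual [.t cget -font]]"
```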

I think the text widget currently uses character widths when working out the actual motions when doing index movement by lines. This has changed between past versions. The problem is that different bits of code want different things: sometimes you want visible motions (e.g., when handling users' cursor motion, especially with tabs set) and sometimes you want character-space motions (which is what you appear to be expecting).
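If character-space motion is what you need, one way to get it regardless of how the +Nl modifier behaves is to build the target index by hand from the line and column numbers. The following is only a sketch: lineMove is a hypothetical helper name, not part of Tk, and it relies solely on the line.char index form, which counts characters. It does not clamp the line number, so it assumes the target line exists.

```tcl
package require Tk 8.6

pack [text .t]
.t insert end "abcdefgh\nабвгґдеє\n一伊依医咿噫欹泆"

# Hypothetical helper (not a Tk command): move $delta logical lines from
# $index while keeping the same character column. If the target line is
# shorter than the column, Tk resolves the index to the end of that line.
proc lineMove {w index delta} {
    # Normalise to line.char form, then split into its two numbers.
    lassign [split [$w index $index] .] line col
    return [$w index "[expr {$line + $delta}].$col"]
}

puts "[lineMove .t 1.4 1] [lineMove .t 1.4 2]"
puts "[lineMove .t 3.4 -1] [lineMove .t 3.4 -2]"
```

This deliberately ignores wrapping, tabs and pixel positions, so it is suited to programmatic index arithmetic rather than to moving the user's insertion cursor.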

Tk shouldn't ever be doing anything (that you can see) with the byte widths of Unicode characters. It's really supposed to handle that transparently (at least for any character in the Basic Multilingual Plane; you might find bugs outside that).
