Ruby: How to break a potentially unicode string in

Posted 2019-07-29 03:27

Question:

I'm writing a game that takes user input and renders it on-screen. The engine I'm using for this is entirely Unicode-friendly, so I'd like to keep that if at all possible. The problem is that the rendering loop looks like this:

"string".each_byte do |c|
    render_this_letter(c)
end

I don't know a whole lot about i18n, but I know enough to know the above code is only ever going to work for me and people who speak my language. I'd prefer something like:

"unicode string".each_unicode_letter do |u|
    render_unicode_letter(u)
end

Does this exist in the core distribution? I'm somewhat averse to adding additional requirements to the install, but if it's the only way to do it, I'll live.

For extra fun, I have no way of knowing if the string is, in fact, a Unicode string.

EDIT: The library I'm using can indeed render entire strings; however, I'm letting the user edit what comes up on the fly. If they hit 'backspace', I need to know how many bytes to chop off the end.
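
For the backspace case, a minimal sketch, assuming the input is UTF-8 and the Ruby version has encoding-aware strings (1.9.3 or later, for `String#byteslice`): the number of bytes to drop is the byte size of the last character. `last_char_bytesize` and `backspace` are hypothetical helper names, not part of any library.

```ruby
# Hypothetical helpers; assumes UTF-8 input and Ruby 1.9.3+.

# Number of bytes occupied by the last character of str (0 if str is empty).
def last_char_bytesize(str)
  return 0 if str.empty?
  str[-1].bytesize
end

# Returns a copy of str with its last character removed.
def backspace(str)
  str.byteslice(0, str.bytesize - last_char_bytesize(str))
end

word = "caf\u00E9"                 # "café"; the final letter is 2 bytes in UTF-8
bytes_to_chop = last_char_bytesize(word)
shortened     = backspace(word)    # the string without its last character
```

This removes one code point. A user-perceived character built from a base letter plus combining marks spans several code points; Ruby 2.5 and later offer `String#each_grapheme_cluster` for that case.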

Answer 1:

Unfortunately, Ruby 1.8.x has poor Unicode support. It's being addressed in 1.9. In the meantime, libraries like this one (http://snippets.dzone.com/posts/show/4527) are a good solution. Using the linked library, your code would look something like this:

"unicode_string".each_utf8_char do |u| 
    render_unicode_letter(u)
end
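
If an extra library is unwelcome, core Ruby can do a similar split. A minimal sketch, assuming the input is meant to be UTF-8: `String#unpack('U*')` decodes code points and `scan(/./mu)` splits into characters, both of which also exist in 1.8; from 1.9 on, `String#each_char` is encoding-aware and `valid_encoding?` can check whether the bytes are well-formed. `render_unicode_letter` is the placeholder from the question, stubbed here so the sketch runs, and `utf8_chars` is a hypothetical helper name.

```ruby
# Stub standing in for the engine call from the question (hypothetical).
RENDERED = []
def render_unicode_letter(char)
  RENDERED << char
end

# Hypothetical helper: returns the characters of str, treating it as UTF-8.
# Falls back to one entry per byte when the bytes are not valid UTF-8.
def utf8_chars(str)
  utf8 = str.dup.force_encoding(Encoding::UTF_8)
  return utf8.chars.to_a if utf8.valid_encoding?
  str.bytes.map { |b| b.chr }
end

utf8_chars("h\u00E9llo").each { |u| render_unicode_letter(u) }

# The same split using only methods that also exist in Ruby 1.8:
code_points = "h\u00E9llo".unpack('U*')   # array of integer code points
characters  = "h\u00E9llo".scan(/./mu)    # array of one-character strings
```

The byte-wise fallback is one possible answer to not knowing whether the input is Unicode; rejecting or replacing malformed input would be equally reasonable.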


Answer 2:

You could try including the ActiveSupport::CoreExtensions::String::Unicode module from the Rails codebase.