Ruby 1.9 - Invalid multibyte character (utf-8)

Posted 2019-05-31 10:15

I have a ruby file with only these two lines:

# encoding: utf-8
puts "—"

When I run it with ruby test_enc.rb it fails with:

test_enc.rb:2: invalid multibyte char (UTF-8)
test_enc.rb:2: unterminated string meets end of file

I don't know how to properly specify the character code of the em dash, but vim tells me it is 151, Hex 97, Octal 227. It fails the same way with other characters like ã as well, so I doubt it is related specifically to that character. I am running on Windows XP and the version of ruby I'm using is:

ruby 1.9.1p430 (2010-08-16 revision 28998) [i386-mingw32]

I feel like there is something very obvious I am missing here. Any ideas?

EDIT: Learned a valuable lesson about assumptions today - specifically assuming your editor IS using UTF-8 without actually checking it. Oops!

Thanks for the quick and accurate replies all!

EDIT AGAIN: The 'setting up vim properly for utf-8' grew too big and wasn't really relevant to this question, so it is now a separate question.

2 Answers
欢心
Reply #2 · 2019-05-31 10:54

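Your file is saved in a single-byte encoding such as latin1 or Windows-1252 (where byte 0x97 is the em dash), not UTF-8. Ruby is right.

In UTF-8 an em dash (U+2014) is encoded as three bytes (E2 80 94), not one.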
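A quick way to check the byte counts is to let Ruby encode the character both ways. This is only a sketch: it assumes the single byte 0x97 reported by vim comes from Windows-1252, and it writes the em dash as a `\u2014` escape so the script itself stays pure ASCII regardless of how the editor saves it.

```ruby
# The em dash is written as a Unicode escape so this file contains only ASCII bytes.
em_dash = "\u2014"

# Encode the same character in UTF-8 and in Windows-1252 (assumed source encoding).
utf8_bytes   = em_dash.encode("UTF-8").bytes.to_a
cp1252_bytes = em_dash.encode("Windows-1252").bytes.to_a

puts utf8_bytes.map { |b| "%02X" % b }.join(" ")    # prints: E2 80 94
puts cp1252_bytes.map { |b| "%02X" % b }.join(" ")  # prints: 97
```

A lone 0x97 is a continuation byte in UTF-8 and cannot start a character, which is why Ruby reports an invalid multibyte char when the magic comment says utf-8.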

老娘就宠你
Reply #3 · 2019-05-31 11:08

Given that Ruby is explicitly calling your attention to UTF-8, I strongly suspect that you haven't actually written out a UTF-8 file to start with. Make sure that Vim (or whatever text editor you're using to create the file) is really set to write out UTF-8.

Note that in UTF-8, any non-ASCII character will be represented by multiple bytes, not a single byte as you've described from the Vim diagnostics. I'd recommend using a binary file editor (or dump, or whatever) to really show what's in the text file though. Something that doesn't already have some preconceived notion of the encoding - something that isn't even trying to think of it as a text file.
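If no hex editor is at hand, Ruby itself can show the raw bytes without interpreting them as text. The following is a minimal sketch; `hex_dump` and `valid_utf8_file?` are hypothetical helper names, not part of any library, and the file is opened in binary mode so no encoding is assumed while reading.

```ruby
# Hypothetical helper: return the file's raw bytes as space-separated hex.
def hex_dump(path)
  raw = File.open(path, "rb") { |f| f.read }  # "rb": no newline or encoding translation
  raw.unpack("C*").map { |b| "%02X" % b }.join(" ")
end

# Hypothetical helper: true if the file's bytes form a valid UTF-8 sequence.
def valid_utf8_file?(path)
  raw = File.open(path, "rb") { |f| f.read }
  raw.force_encoding("UTF-8").valid_encoding?
end

# Inspect every file named on the command line, e.g.: ruby dump.rb test_enc.rb
ARGV.each do |path|
  puts hex_dump(path)
  puts(valid_utf8_file?(path) ? "valid UTF-8" : "NOT valid UTF-8")
end
```

For the file in the question, a single `97` between the two `22` (double quote) bytes would confirm that the em dash was saved as one byte rather than as the UTF-8 sequence `E2 80 94`.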

Notepad lets you write out a file in UTF-8, so you might want to try that just to see what happens. (I don't have Ruby installed myself, otherwise I'd try it for you.)
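Another option is to re-encode the existing file with Ruby instead of re-saving it from an editor. This is a sketch under a stated assumption: the source encoding defaults to Windows-1252, which matches the 0x97 byte from the question but is a guess for any other file. `reencode_to_utf8` is a hypothetical helper name.

```ruby
# Hypothetical helper: rewrite a file's contents as UTF-8.
# source_encoding is an assumption; pass the real encoding if it is known.
def reencode_to_utf8(src_path, dst_path, source_encoding = "Windows-1252")
  raw = File.open(src_path, "rb") { |f| f.read }
  # Label the bytes with their assumed encoding, then transcode to UTF-8.
  text = raw.force_encoding(source_encoding).encode("UTF-8")
  File.open(dst_path, "wb") { |f| f.write(text) }
end

# Example: ruby reencode.rb test_enc.rb test_enc_utf8.rb
reencode_to_utf8(ARGV[0], ARGV[1]) if ARGV.size == 2
```

`encode` raises `Encoding::UndefinedConversionError` if the file contains a byte that has no mapping in the assumed source encoding, which is a useful signal that the guess was wrong.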
