OS X 10.6.8 - Cannot input non-ASCII/UTF-8 chars (

Posted 2019-08-30 09:51

Question:

Running OS X 10.6.8 Snow Leopard, I cannot input Scandinavian letters in interactive mode. The terminal bell sounds for every keystroke and nothing shows up. All letters show up as normal in the regular terminal environment. Inputting UTF-8 characters works fine in the Terminal, when running a Python script, in PyDev and in the REPL.

Is there a problem with the interactive mode settings and these special characters?

I have installed and am mainly running Python 2.7.3, but the OS-provided Pythons have this problem too (i.e. when running python2.5 or python2.6 I still experience it). I don't know whether installing Python 2.7 has changed some underlying library that it uses, maybe readline (I'm on thin ice here, basically guessing)?

Answer 1:

It sounds like the problem here is that the python.org Python is expecting real readline, and not being happy with the libedit substitute that Apple provides.

See the documentation for readline at PyPI for an explanation of the issue.
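Before installing anything, it can help to confirm which backend the interpreter's readline module actually uses. A minimal sketch, assuming the check suggested in the Python documentation (looking for the text "libedit" in readline.__doc__); the helper name uses_libedit is made up for illustration:

```python
def uses_libedit():
    """Return True for libedit, False for GNU readline, None if unavailable."""
    try:
        import readline
    except ImportError:
        # No readline module at all in this interpreter.
        return None
    # libedit-backed builds mention "libedit" in the module docstring.
    return "libedit" in (readline.__doc__ or "")


print(uses_libedit())
```

Run it with each interpreter in question (for example python2.6 and python2.7) to compare them.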

You can fix it as follows:

sudo /path/to/easy_install readline

Note that readline is one of the handful of things that cannot be installed properly by pip, so you have to use easy_install (or do it manually).

The python.org 2.x installers don't come with easy_install. Install it by following the directions on the setuptools page.

On top of that, keep in mind that in some cases you can end up with Apple's easy_install in /usr/local/bin as well as /usr/bin, which means you can't be sure /usr/local/bin/easy_install is the python.org version, so explicitly use easy_install-X.Y.

And even that doesn't help if you're using a python.org (or other) installation of an X.Y version that Apple already gave you. /usr/local/bin/easy_install-2.7 may well be Apple's (as it is on the 10.8.2 machine I'm sitting at right now). The only way to be safe is to check the shebang line and see which Python interpreter it uses.
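Checking the shebang line can be done by opening the script in an editor, or with a few lines of Python. A minimal sketch; the function name read_shebang is hypothetical, and the path you pass it is whatever easy_install script you want to inspect:

```python
import io


def read_shebang(path):
    """Return the interpreter named on the first line of a script, or None."""
    with io.open(path, "rb") as f:
        first = f.readline()
    if not first.startswith(b"#!"):
        # Not a script with a shebang line (could be a binary or a plain module).
        return None
    return first[2:].decode("utf-8", "replace").strip()
```

Calling read_shebang("/usr/local/bin/easy_install-2.7") then shows which Python that script will run under. An interpreter path under /System/Library/Frameworks would suggest Apple's build.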

Or, more simply, just don't install a python-X.Y if Apple's already given you one. Seriously, there are hundreds of questions all over SO from people who did this and have problems, and all of them could be avoided by just using the Apple build. Apple used to ship broken, incomplete, and/or woefully out-of-date Python, but since either 10.5 or 10.6, they've been shipping working, complete, reasonably-recent versions, with extras like easy_install and PyObjC included.



Answer 2:

Since you're using 2.7.3, which 10.6 doesn't come with, you've obviously installed some third-party Python.
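If you are not sure which Python you are actually running, the interpreter can tell you. A rough sketch, assuming the usual install locations (Apple's builds under /System/Library/Frameworks and python.org-style framework builds under /Library/Frameworks); the function name classify_prefix and its labels are made up for illustration:

```python
import sys

# Typical locations; other distributions may install elsewhere.
APPLE_PREFIX = "/System/Library/Frameworks/Python.framework"
FRAMEWORK_PREFIX = "/Library/Frameworks/Python.framework"


def classify_prefix(prefix):
    """Guess the origin of a Python install from its sys.prefix."""
    if prefix.startswith(APPLE_PREFIX):
        return "apple"
    if prefix.startswith(FRAMEWORK_PREFIX):
        return "framework install (python.org style)"
    return "other"


print(sys.executable)            # full path of the running interpreter
print(sys.version.split()[0])    # e.g. 2.7.3
print(classify_prefix(sys.prefix))
```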

If you look at the download page for Python 2.7.3 for the "Mac OS X 64-bit/32-bit x86-64/i386 Installer (2.7.3) for Mac OS X 10.6 and 10.7" installer, it says:

You may need an updated Tcl/Tk install to run IDLE or use Tkinter, see note 2 for instructions.

Under Note 2:

There is important information about IDLE, Tkinter, and Tcl/Tk on Mac OS X here. Also, on Mac OS X 10.6, if you need to build C extension modules with the 32-bit-only Python installed, you will need Apple Xcode 3, not 4. The 64-bit/32-bit Python can use either Xcode 3 or Xcode 4.

If you follow the link, it explains the problems with the version of Tcl/Tk that came with 10.6. Note that in the chart below, Apple 8.5.7 is specifically not recommended.

If you want to use IDLE with a non-Apple Python on 10.6, the chart recommends installing ActiveTcl 8.5.13.
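To see which Tcl/Tk version your Python is actually linked against, you can ask Tcl for its patch level. A minimal sketch that should work on both Python 2 and 3; the helper name tcl_patchlevel is hypothetical, and it returns None when Tkinter is not available:

```python
def tcl_patchlevel():
    """Return the Tcl patch level string (e.g. '8.5.7'), or None if unavailable."""
    try:
        try:
            import tkinter as tk   # Python 3 module name
        except ImportError:
            import Tkinter as tk   # Python 2 module name
        # Tcl() creates an interpreter without opening a window.
        return tk.Tcl().eval("info patchlevel")
    except Exception:
        return None


print(tcl_patchlevel())
```

If this prints 8.5.7 under a non-Apple Python on 10.6, you are on the Apple Tcl/Tk that the chart warns about.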

The page doesn't explain exactly what the problems are, but if I remember correctly, Apple's Tk used to crash whenever Tkinter received a non-ASCII character in certain circumstances, and the best workaround they could come up with for IDLE was to just reject those characters, exactly as you're seeing.

If you're using a different Python 2.7.3 (Enthought, ActiveState, Homebrew, MacPorts, hand-built, etc.), they mostly don't have thorough documentation on this problem, but the same fix will probably work.

I believe 10.6 is also when Apple started shipping reasonably modern Python versions and a working IDLE, so you might want to just use that instead of a third-party Python in the first place. (However, I might be misremembering, and that might only be true with 10.7 and later.)



Answer 3:

I have nearly the same problems, discussed in "How to define/declare utf-8 code points for Turkish special chars (non-ascii) to use them as standart utf-8 encoding?". As far as I have understood so far, the problem is due to inadequate definitions in Unicode and UTF-8. Declarations in Unicode and UTF-8 are based on displaying fonts for characters (extended, accented, non-standard ones). That might have been satisfactory in the old days, but today's programming requirements are far ahead of ASCII-based ANSI standards, and the current character code declarations for many languages (those whose alphabets extend the ASCII character set) cause problems in encoding, transformation and testing. You can find more details in the notes under my question. UTF-8 was designed to be versionless, but I am afraid non-English Latin alphabets need to be redeclared in a new version of UTF-8, giving a full range to each alphabet while keeping the same fonts for the same chars. In my theory every alphabet will have its own A with different char codes and code points, but all A(s) will point to the same font code. So while displaying A in any language uses the font code, every A will be less than any other char in its alphabet, but ASCII z will never be less than ŞşÇçÖö or any accented char....
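The ordering complaint can be seen directly in Python: comparing strings by code point puts ASCII z before ş and ç. The sketch below shows that, and then one possible workaround that leaves the encoding alone, sorting with an explicit alphabet table. The names TURKISH_LOWER, RANK and turkish_key are made up for illustration, and the table covers lowercase letters only:

```python
# -*- coding: utf-8 -*-
# Lowercase Turkish alphabet in dictionary order (written out by hand).
TURKISH_LOWER = u"abcçdefgğhıijklmnoöprsştuüvyz"
RANK = {ch: i for i, ch in enumerate(TURKISH_LOWER)}


def turkish_key(word):
    """Map each character to its alphabet position; unknown characters sort last."""
    return [RANK.get(ch, len(TURKISH_LOWER)) for ch in word]


words = [u"şeker", u"zil", u"çay", u"su", u"cam"]

# Default sort compares code points, so z (U+007A) precedes ç (U+00E7) and ş (U+015F).
by_code_point = sorted(words)
# -> cam, su, zil, çay, şeker

# Sorting with the alphabet table gives dictionary order instead.
by_alphabet = sorted(words, key=turkish_key)
# -> cam, çay, su, şeker, zil
```

For real applications, locale-aware collation (for example via the locale module) is the more usual route. The table above only illustrates that ordering can be separated from code point values.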