I am looking for a (simple) text editor that can handle text in different encodings in the same document.
I need to develop some sites with mixed Japanese and English text, and the editors I have now (on an English Windows system) are unable to display the Japanese text. jEdit doesn't display the Japanese text I have input, but when I look at the file in a browser it shows up correctly. gVim shows all Japanese text as question marks, both in the editor and in the browser. In gVim, inputting kanji works (you type the pronunciation and then press the space bar to get the kanji), but when you confirm the kanji you want, it is replaced with question marks (one question mark for every kanji).
Can someone recommend a text editor for editing HTML and PHP files that can display UTF-8 encoded text and also save files as UTF-8?
Thank you.
After reading about Emacs I installed it; see below.
Thanks, everybody, for the hints. If you don't have a Unicode font yet, you have to find one online or buy one. Here are the instructions for installing a font on a Windows system: http://support.microsoft.com/kb/314960
jEdit: I changed my font in jEdit to a Unicode font and now the Japanese shows up normally. Inputting Japanese is still problematic, as you don't see what you are typing. (To change the font, go to Utilities -> Global Options -> Text Area and select a Unicode font; you'll then be able to see the Japanese characters.)
gVim: I am still trying to figure out how to add a font in gVim. Once I know how to do that, I'll update this.
Emacs: Emacs does not show the kanji correctly (they are displayed as ???), but at least I can see what I type in Japanese and select the right word.
So at this point I have to say that in jEdit I can see Japanese text but can't input it. In gVim I can input Japanese text, but inside the text area it is displayed as ???, and the same goes for Emacs. Adding a font in Emacs and gVim is, sadly, not a trivial task. At the moment I use Notepad with the Arial Unicode MS font, saving as a UTF-8 file, as my Japanese editor. Not ideal, but at least it works.
For Japanese, Sakura Editor is exceptional. It can display UTF-8, EUC-JP, SJIS, and so on.
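To illustrate what handling several Japanese encodings involves, here is a minimal Python sketch that tries a few candidate encodings in turn, so that legacy files can be re-saved as UTF-8. The helper name `decode_japanese` and the order of candidates are assumptions made for illustration; trial decoding is only a heuristic and is not a description of how Sakura Editor detects encodings.

```python
# Hypothetical helper: guess the encoding of Japanese text by trial decoding.
CANDIDATES = ("utf-8", "euc_jp", "shift_jis")

def decode_japanese(data: bytes):
    """Return (text, codec) for the first candidate codec that decodes cleanly."""
    for codec in CANDIDATES:
        try:
            return data.decode(codec), codec
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")

sample = "Hello 日本語"
for codec in CANDIDATES:
    text, guessed = decode_japanese(sample.encode(codec))
    # Once decoded, the text can be written back out as UTF-8.
    utf8_bytes = text.encode("utf-8")
    print(codec, "->", guessed, len(utf8_bytes), "bytes as UTF-8")
```

Short or ambiguous byte sequences can decode cleanly under more than one codec, so it is worth checking the result by eye before overwriting a file.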
You can just use Notepad.exe with the "Arial Unicode MS" font (if all of your text is left-to-right, given the English Windows version). Just choose Save As and select UTF-8.
In general, use your favourite editor with a font like "Arial Unicode MS". I mention this one because it is the font with the greatest Unicode coverage I have seen.
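One thing to watch if the files are PHP: Notepad has traditionally written a byte order mark (BOM) at the start of files saved as UTF-8, and those bytes can end up as stray output ahead of the page. The following Python sketch removes a leading BOM if one is present. The function name `strip_bom` and the file name `index.php` are hypothetical, used only for the demonstration.

```python
import codecs
import os
import tempfile

def strip_bom(path: str) -> bool:
    """Rewrite `path` without a leading UTF-8 BOM; return True if one was removed."""
    with open(path, "rb") as f:
        raw = f.read()
    if not raw.startswith(codecs.BOM_UTF8):
        return False
    with open(path, "wb") as f:
        f.write(raw[len(codecs.BOM_UTF8):])
    return True

# Demo on a temporary file (hypothetical name), saved with a BOM via "utf-8-sig".
demo_path = os.path.join(tempfile.mkdtemp(), "index.php")
with open(demo_path, "w", encoding="utf-8-sig") as f:
    f.write("<?php echo 'こんにちは'; ?>\n")

removed = strip_bom(demo_path)
with open(demo_path, encoding="utf-8") as f:
    content = f.read()
print(removed, content.startswith("<?php"))
```

The file is handled as raw bytes so that nothing other than the three BOM bytes changes.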
Try BabelPad. Editing-wise, it's simple. Unicode-support-wise, it's awesome!
TextPad is a good utility too. It's trialware, but it does the job fine. See how to set the character encoding in TextPad.
I would still recommend Vim. The problem you were seeing with question marks is probably an issue with the font you were using. When displaying text that contains characters not covered by the current font or language settings, applications typically display them as empty boxes or question marks. See here for UTF-8 support in Vim.
This section of the Vim manual is also helpful, especially for setting up UTF-8 in Windows.
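One way to tell a font problem from actual data loss is to look at the bytes of the saved file. If the file contains literal `?` bytes where the kanji should be, the text was converted to an encoding that cannot represent Japanese, and no font will bring it back. The question mentions that the question marks also appear in the browser, which hints at this second case, though that is only an inference. A minimal Python sketch, assuming `cp1252` as a stand-in for a Western Windows code page:

```python
text = "日本語 text"

# A Western code page has no kanji, so a lossy conversion writes one "?" per character.
lossy = text.encode("cp1252", errors="replace")
# UTF-8 keeps each of these kanji as a three-byte sequence.
utf8 = text.encode("utf-8")

def has_non_ascii(data: bytes) -> bool:
    """True if the bytes contain anything beyond plain ASCII."""
    return any(b > 0x7F for b in data)

print(lossy)                 # b'??? text'
print(has_non_ascii(lossy))  # False: the kanji are gone from the data
print(has_non_ascii(utf8))   # True: the kanji are still there
```

If a file that should contain Japanese has no non-ASCII bytes at all, the encoding settings need fixing before the font does.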
There is an issue with most Unicode-aware text editors: when you select a font, they stick to it. If the font does not include a glyph for a character, then the default substitution character (I believe U+FFFD, REPLACEMENT CHARACTER) is used.
In contrast, web browsers typically try to find a glyph for the characters they have to display among all the fonts provided by the system.
So, what you need, if you don't have the font "Arial Unicode MS" or a similar one that includes Japanese glyphs, is an editor that tries to find glyphs in fonts other than the selected one.
Until someone provides a link for such an editor, I'll suggest a (somewhat extreme :) editor:
The IDLE editor is typically used to edit Python code (and test it interactively in the Python shell). However, it can be used as a plain, fully Unicode-aware text editor, and when saving text that includes non-ASCII characters, it defaults to UTF-8 encoding.
Now, IDLE is based on Tkinter, which is an interface to Tk, a GUI library for Tcl. Like web browsers, Tcl/Tk searches other fonts too when asked to display a character for which no glyph is present in the widget font.
However far-fetched this may seem, I really believe it would help; if no other solution helps you, give it a try.
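The Tk behaviour described above can also be tried without the full editor. The following is a minimal sketch of a Tkinter viewer for a UTF-8 file; the function names `read_utf8` and `show_text` are made up for this example, and whether the kanji actually render depends on the fonts installed on the system.

```python
def read_utf8(path: str) -> str:
    """Read a whole file as UTF-8 text."""
    with open(path, encoding="utf-8") as f:
        return f.read()

def show_text(text: str) -> None:
    """Open a window with the text in a Tk Text widget."""
    import tkinter as tk  # imported here so read_utf8 works without a display

    root = tk.Tk()
    root.title("UTF-8 viewer")
    widget = tk.Text(root, wrap="word")
    widget.insert("1.0", text)
    widget.pack(fill="both", expand=True)
    root.mainloop()
```

To try it, call `show_text(read_utf8("page.html"))`, where `page.html` is a placeholder for one of your own files, or pass a string such as `"Hello 日本語"` directly.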