We have a project in Team Foundation Server (TFS) whose name contains a non-English character (š). While scripting a few build-related tasks we ran into a problem: we can't pass the letter š to the command-line tools. The command prompt (or something else along the way) mangles it, and the tf.exe utility can't find the specified project.
I've tried different encodings for the .bat file (ANSI, UTF-8 with and without BOM) as well as scripting it in JavaScript (which is inherently Unicode), but no luck. How do I execute a program and pass it a Unicode command line?
I got around a similar issue deleting Unicode-named files by referring to them in the batch file by their short (8 dot 3) names.
The short names can be viewed by running
dir /x
Obviously, this only works with Unicode file names that are already known.
This problem is quite annoying. I usually have Chinese characters in my file names and file contents. Please note that I am using Windows 10; here is my solution:
To display file names correctly with commands such as
dir
or
ls
(if you installed Ubuntu bash on Windows 10), set the region to one that supports your non-UTF-8 characters.
After that, the console's font changes to the font of that locale, and the console's encoding changes as well.
After you have done the previous steps, to display the content of a UTF-8 file with a command-line tool, run
chcp 65001
and then use the
type
command to peek at the file content, or
cat
if you installed Ubuntu bash on Windows 10.
The laziest solution: just use a console emulator such as http://cmder.net/
Check the "language for non-Unicode programs" setting. If you have problems with Russian in the Windows console, set that language to Russian.
Try:
chcp 65001
which will change the code page to UTF-8. Also, you need to use the Lucida Console font.
A better, cleaner thing to do: just install the free Microsoft Japanese language pack. (Other East Asian language packs will also work, but the Japanese one is the one I have tested.)
This gives you fonts with larger sets of glyphs, makes them the default, and applies to the various Windows tools such as cmd and WordPad.
I see several answers here, but they don't seem to address the question - the user wants to get Unicode input from the command line.
Windows uses UTF-16 for its wide-character strings (two-byte code units), so your program needs to get its arguments from the OS in that form. There are two ways to do this:
1) Microsoft has an extension that allows main to take a wide character array: int wmain(int argc, wchar_t *argv[]); https://msdn.microsoft.com/en-us/library/6wd819wh.aspx
2) Call the Windows API to get the Unicode version of the command line: wchar_t **win_argv = (wchar_t **)CommandLineToArgvW(GetCommandLineW(), &nargs); https://docs.microsoft.com/en-us/windows/desktop/api/shellapi/nf-shellapi-commandlinetoargvw
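For what it's worth, here is a minimal sketch of option 2 in C. It fetches the UTF-16 arguments with CommandLineToArgvW, converts each one to UTF-8 and prints the bytes in hex, so the result does not depend on the console code page. The helper name utf16_to_utf8, the 1024-byte buffer and the hex output are my own illustrative choices, not part of any API. On Windows you would more likely call WideCharToMultiByte with CP_UTF8 for the conversion; it is written out by hand here only so that the conversion step can be read (and compiled) without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper (not a Windows API): encode a NUL-terminated UTF-16
 * string as UTF-8. Returns the number of bytes written, not counting the
 * terminating NUL, or (size_t)-1 if the input contains an unpaired
 * surrogate or the output buffer is too small. */
size_t utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);
    if (dst_size == 0)
        return (size_t)-1;
    while (*src != 0) {
        uint32_t cp = *src++;
        if (cp >= 0xD800 && cp <= 0xDBFF) {            /* high surrogate */
            if (*src < 0xDC00 || *src > 0xDFFF)
                return (size_t)-1;                     /* low surrogate missing */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                         /* stray low surrogate */
        }
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need + 1 > dst_size)
            return (size_t)-1;                         /* keep room for the NUL */
        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}

#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>

/* Option 2: ask the OS for the UTF-16 command line and split it.
 * Link with shell32 (for example: cl demo.c shell32.lib).
 * With option 1 you would declare wmain(int argc, wchar_t *argv[])
 * instead and receive the wide argument array directly. */
int main(void)
{
    int nargs = 0;
    wchar_t **wargv = CommandLineToArgvW(GetCommandLineW(), &nargs);
    if (wargv == NULL)
        return 1;
    for (int i = 0; i < nargs; i++) {
        char utf8[1024];
        /* wchar_t is a 16-bit type on Windows, so this cast is fine there. */
        size_t n = utf16_to_utf8((const uint16_t *)wargv[i], utf8, sizeof utf8);
        if (n == (size_t)-1) {
            printf("argv[%d]: <conversion failed>\n", i);
            continue;
        }
        printf("argv[%d]: %u UTF-8 bytes:", i, (unsigned)n);
        for (size_t k = 0; k < n; k++)
            printf(" %02X", (unsigned char)utf8[k]);
        printf("\n");
    }
    LocalFree(wargv);   /* CommandLineToArgvW allocates one block; free it once */
    return 0;
}
#endif
```

If the program is started with š as an argument and the caller really passed it as Unicode, that argument should show up as the two bytes C5 A1. Anything else (for example a single 3F, which is "?") suggests the character was already mangled before it reached the program.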
Read this: http://utf8everywhere.org for detailed info, particularly if you are supporting other operating systems.