In Windows PowerShell:
echo "string" > file.txt
In Cygwin:
$ cat file.txt
:::s t r i n g
$ dos2unix file.txt
dos2unix: Skipping binary file file.txt
I want just "string" in the file. How do I do that? That is, when I run cat file.txt, I need only "string" as output. I am echoing from Windows PowerShell, and that cannot be changed.
By default, the redirection operator (>) in Windows PowerShell creates UTF-16 little-endian files with a byte order mark (BOM).
Dos2unix 6.0 and higher can read UTF-16 files and convert them to UTF-8 (the default Cygwin encoding) and remove the BOM. Versions prior to 6.0 will see UTF-16 files as binary and skip them, as in your example.
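If upgrading dos2unix is not an option, the same conversion can be approximated on the Cygwin side with iconv and tr. The following is a minimal sketch, assuming the file was written by the PowerShell redirection (UTF-16 LE with a BOM and CRLF line endings); the function name utf16_to_utf8 is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert a UTF-16 file (with BOM) to UTF-8 with LF line endings.
# When the source encoding is given as "UTF-16", iconv uses the BOM to pick the byte order;
# tr then strips the carriage returns left over from the Windows CRLF line endings.
utf16_to_utf8() {
    iconv -f UTF-16 -t UTF-8 "$1" | tr -d '\r'
}

# Demo: build the bytes the redirection would write for "string"
# (FF FE, then each character followed by 00, then CR LF).
demo=$(mktemp)
printf '\377\376s\000t\000r\000i\000n\000g\000\r\000\n\000' > "$demo"
utf16_to_utf8 "$demo"
rm -f "$demo"
```

With a real file, something like utf16_to_utf8 file.txt > file.utf8.txt should leave a file that cat shows as plain "string". Note that tr removes every carriage return, which is fine for ordinary text but not for binary data.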
The redirection operator (>) and Out-File are equivalent in that they both use UTF-16 encoding by default in Windows PowerShell.
You can add an explicit -Encoding parameter to Out-File (as indicated by jon Z) to produce plain ASCII.
Alternatively, you could use Set-Content, which writes a single-byte, ASCII-compatible encoding by default.
Corollary 1:
Want to convert a Unicode file to ASCII in one line?
Just use this:
which can be abbreviated to:
Corollary 2:
Want to get a hex dump so you can really see what is Unicode and what is ASCII?
Use the clean and simple Get-HexDump function available on PowerShell.com. With that in place you can examine your generated files with just:
For anything non-trivial, you can specify how many columns wide you want the output and how many bytes of the file to process with something like this:
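On the Cygwin side, od gives a comparable dump without installing anything. This is a substitute for Get-HexDump, not that function: a small sketch assuming GNU od (which Cygwin ships), with a made-up wrapper name hexdump_head that takes the row width and the number of bytes to process.

```shell
#!/bin/sh
# Hypothetical helper: hex-dump the first COUNT bytes of FILE, WIDTH bytes per row.
# Usage: hexdump_head FILE WIDTH COUNT
# Assumes GNU od: -w (row width) and the z suffix (printable characters) are GNU extensions.
hexdump_head() {
    od -A x -t x1z -w"$2" -N "$3" "$1"
}

# Demo: dump the first 8 bytes of a UTF-16 LE file, 4 bytes per row.
demo=$(mktemp)
printf '\377\376s\000t\000r\000i\000n\000g\000\r\000\n\000' > "$demo"
hexdump_head "$demo" 4 8
rm -f "$demo"
```

Each row starts with the hexadecimal offset, followed by the bytes and their printable form, so the leading FF FE and the interleaved 00 bytes of a UTF-16 file are easy to spot.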
Try
echo "string" | out-file -encoding ASCII file.txt
to get a simple ASCII-encoded text file.
Comparison of the files produced:
will produce a file with the following contents:
however
will produce a file with the following contents:
(The byte order mark FF FE indicates that the file is UTF-16 LE. The signature is the two bytes 0xFF 0xFE, followed by the text as two-byte pairs: xx 00 xx 00 xx 00 for ordinary ASCII characters in the range 0-127.)
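The signature described above can be checked from Cygwin before deciding whether a file needs converting. Here is a minimal sketch; the helper name has_utf16le_bom is an assumption for illustration, and the check is only a heuristic because a UTF-32 LE file also begins with FF FE.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FILE starts with the bytes FF FE.
# Heuristic only: a UTF-32 LE file also begins with FF FE (followed by 00 00).
has_utf16le_bom() {
    # Dump only the first two bytes as hex and strip od's padding.
    sig=$(od -A n -t x1 -N 2 "$1" | tr -d ' \n')
    [ "$sig" = "fffe" ]
}

# Demo: one file as the redirection would write it, one as plain ASCII.
dir=$(mktemp -d)
printf '\377\376s\000t\000r\000i\000n\000g\000\r\000\n\000' > "$dir/utf16.txt"
printf 'string\r\n' > "$dir/ascii.txt"

for f in "$dir/utf16.txt" "$dir/ascii.txt"; do
    if has_utf16le_bom "$f"; then
        echo "$(basename "$f"): UTF-16 LE BOM found"
    else
        echo "$(basename "$f"): no BOM"
    fi
done
rm -rf "$dir"
```

The demo should report a BOM for utf16.txt and none for ascii.txt, matching the two byte layouts compared above.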