I have an application that runs a VBScript file and captures its output.
    private static string processVB(string command, string arguments)
    {
        Process Proc = new Process();
        Proc.StartInfo.UseShellExecute = false;
        Proc.StartInfo.RedirectStandardOutput = true;
        Proc.StartInfo.RedirectStandardError = true;
        Proc.StartInfo.RedirectStandardInput = true;
        Proc.StartInfo.StandardOutputEncoding = Encoding.UTF8;
        Proc.StartInfo.StandardErrorEncoding = Encoding.UTF8;
        Proc.StartInfo.FileName = command;
        Proc.StartInfo.Arguments = arguments;
        Proc.StartInfo.WindowStyle = ProcessWindowStyle.Hidden; //prevent console window from popping up
        Proc.Start();
        string output = Proc.StandardOutput.ReadToEnd();
        string error = Proc.StandardError.ReadToEnd();
        if (String.IsNullOrEmpty(output) && !String.IsNullOrEmpty(error))
        {
            output = error;
        }
        //Console.Write(ping_output);
        Proc.WaitForExit();
        Proc.Close();
        return output;
    }
I think I have set all the encoding-related properties correctly. The processVB method receives the VBScript file as the command, along with its arguments.
The C# method processVB, which runs that VBScript file, currently produces the following output:
"����?"
But I should get the original text:
"äåéö€"
I believe I have set the encoding correctly, but I still cannot get the right output.
What am I doing wrong?
The other process (VBScript) generates its output in some encoding. By setting StandardOutputEncoding you tell the system how to read that stream. It does not change the encoding that the other process writes in.
So you need to figure out the exact encoding used by the other process (VBScript). To do that, I would run the script directly from the shell, redirect the output to a file, and open that file in a tool that shows the encoding (e.g. Notepad2). If I'm right, it will be something other than UTF-8.
Then set Proc.StartInfo.StandardOutputEncoding to that encoding in your code, and everything should work.
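To illustrate the point that StandardOutputEncoding only controls how the bytes are read, here is a minimal sketch. It assumes, purely for the sake of the example, that the child process wrote "äåéö€" as Windows-1252 bytes; the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class EncodingMismatchDemo
{
    // Decodes the same raw bytes with whichever encoding the reader assumes.
    public static string Decode(byte[] raw, Encoding assumed)
    {
        return assumed.GetString(raw);
    }

    static void Main()
    {
        // "äåéö€" encoded as Windows-1252 (an assumption about the child's output).
        byte[] raw = { 0xE4, 0xE5, 0xE9, 0xF6, 0x80 };

        // Wrong guess: these bytes are not valid UTF-8, so the decoder
        // substitutes U+FFFD replacement characters.
        string wrong = Decode(raw, Encoding.UTF8);
        Console.WriteLine("Contains U+FFFD: " + wrong.Contains("\uFFFD"));

        // Matching guess. On .NET Core / .NET 5+, code page 1252 is only available
        // after Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        // on .NET Framework it is built in.
        string right = Decode(raw, Encoding.GetEncoding(1252));
        Console.WriteLine("Round-trips: " + (right == "\u00E4\u00E5\u00E9\u00F6\u20AC"));
    }
}
```

The bytes never change; only the interpretation does, which is why the fix has to be made on the reading side or in the script itself.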
This does not answer the question directly, but I noticed a potential deadlock in your code and thought it was worth posting anyway.
The deadlock potential exists because your code reads the redirected output synchronously, and does so for both StdOut and StdErr, i.e. the two consecutive ReadToEnd() calls on Proc.StandardOutput and Proc.StandardError.
What can happen is that the child process writes a lot of data to StdErr and fills up the buffer. Once the buffer is full, the child process blocks on the write to StdErr (without yet having signaled the end of the StdOut stream). So the child is blocked and doing nothing, while your process is blocked in ReadToEnd() on StdOut, waiting for the child to finish. Deadlock.
To fix this, at least one (or better both) streams should be switched to asynchronous mode.
See the second example in the MSDN documentation, which covers exactly this scenario and shows how to switch to asynchronous mode.
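As a rough sketch of that fix (not the MSDN sample itself), one option is to drain StdErr in the background with ReadToEndAsync while StdOut is read synchronously. The class and method names below are hypothetical, and the encoding settings are deliberately left out to keep the focus on the deadlock.

```csharp
using System.Diagnostics;
using System.Threading.Tasks;

static class ProcessRunner
{
    // Hypothetical replacement for processVB with the same return convention:
    // StdOut if there is any, otherwise StdErr.
    public static string Run(string command, string arguments)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = command,
            Arguments = arguments,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            CreateNoWindow = true // keeps a console window from popping up
        };

        using (Process proc = Process.Start(startInfo))
        {
            // Start draining StdErr first, so the child can never block
            // on a full StdErr buffer while we are busy reading StdOut.
            Task<string> errorTask = proc.StandardError.ReadToEndAsync();
            string output = proc.StandardOutput.ReadToEnd();
            string error = errorTask.Result;
            proc.WaitForExit();

            return string.IsNullOrEmpty(output) && !string.IsNullOrEmpty(error)
                ? error
                : output;
        }
    }
}
```

An event-based alternative is BeginOutputReadLine/BeginErrorReadLine with the OutputDataReceived and ErrorDataReceived events; the task-based version is simply shorter for this case.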
As for the UTF-8 issue, are you sure that your child process is outputting in this encoding and not, say, in UTF-16 or some other one? You may want to examine the bytes to work out which encoding the stream is supplied in, so you can set the proper encoding for interpreting the redirected stream.
EDIT
Here is how I think you can resolve the encoding issue. The basic idea is based on something I once needed to do: I had Russian text in an unknown encoding and needed to figure out how to convert it so that it showed the proper characters. Take the bytes captured from StdOut and try to decode them using every code page available on the system. The one that looks right is likely (but not necessarily) the encoding that StdOut is encoded with. It is not guaranteed to be the right one, even if it looks correct with your data, because many encodings overlap over some ranges of bytes and therefore decode them identically. E.g. ASCII and UTF-8 produce the same bytes when encoding basic Latin characters. So to get an exact match, you may need to get creative and test with some atypical text.
Here is the basic code to do it - adjustments may be necessary:
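The snippet from the original post is not reproduced here. The following is a minimal sketch of the idea under a few assumptions: the raw bytes are taken from StandardOutput.BaseStream, every encoding reported by Encoding.GetEncodings() is tried, and the results go to a UTF-8 file (the name candidates.txt is arbitrary) so that the console does not mangle them a second time. All class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Text;

static class EncodingProbe
{
    // Captures the child's StdOut as raw bytes, bypassing any decoding.
    public static byte[] CaptureStdOutBytes(string command, string arguments)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = command,
            Arguments = arguments,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
        using (Process proc = Process.Start(startInfo))
        using (var buffer = new MemoryStream())
        {
            proc.StandardOutput.BaseStream.CopyTo(buffer);
            proc.WaitForExit();
            return buffer.ToArray();
        }
    }

    // Decodes the bytes with every encoding the runtime knows about.
    public static List<string> DecodeWithAllEncodings(byte[] raw)
    {
        var lines = new List<string>();
        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            try
            {
                string decoded = info.GetEncoding().GetString(raw);
                lines.Add(info.CodePage + " (" + info.Name + "): " + decoded);
            }
            catch (Exception ex)
            {
                // Skip encodings the runtime lists but cannot instantiate.
                lines.Add(info.CodePage + " (" + info.Name + "): <" + ex.GetType().Name + ">");
            }
        }
        return lines;
    }

    static void Main(string[] args)
    {
        // Example: EncodingProbe.exe cscript.exe "//nologo yourScript.vbs"
        byte[] raw = CaptureStdOutBytes(args[0], args.Length > 1 ? args[1] : "");
        File.WriteAllLines("candidates.txt", DecodeWithAllEncodings(raw), Encoding.UTF8);
    }
}
```

On .NET Core / .NET 5+ the list is short unless CodePagesEncodingProvider has been registered; on .NET Framework it covers the code pages installed on the system.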
Run the code and manually examine the output. All those that match the expected text are candidates for being the encoding used in StdOut.
I am using your function like this:
And my vbs file is
My vbs file is encoded as UTF-8 without BOM.
And it works as expected: I see äåéö€ on my form.
Maybe you should change how you use your function, the encoding of your vbs file, and how you output data to stdout.
The problem is that the console isn't UTF-8 by default. It runs in the same code page as your locale settings in Windows. A simple way to solve this is by using the chcp console command. Example:
This will cause the output to be in UTF-8 and ensure that you can read it properly from your .NET application.
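One way to apply the chcp idea from C# is to launch the script through cmd.exe, so the code page is switched to 65001 (UTF-8) before the script runs. This is only a sketch: the helper name is hypothetical, cscript //nologo is an assumed way of invoking the script, and whether the VBScript host actually honors the console code page is not something this example establishes.

```csharp
using System.Diagnostics;
using System.Text;

static class Utf8ConsoleLauncher
{
    // Builds start info that runs "chcp 65001" and then the script in one cmd.exe session.
    public static ProcessStartInfo Build(string scriptPath)
    {
        // cmd.exe strips the outermost pair of quotes after /c, so the inner
        // quotes around the script path survive.
        string commandLine =
            "/c \"chcp 65001 > nul && cscript //nologo \"" + scriptPath + "\"\"";

        return new ProcessStartInfo
        {
            FileName = "cmd.exe",
            Arguments = commandLine,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            CreateNoWindow = true,
            // Read the stream as UTF-8 to match the code page selected above.
            StandardOutputEncoding = Encoding.UTF8
        };
    }
}
```

Usage would be along the lines of Process.Start(Utf8ConsoleLauncher.Build("yourScript.vbs")), followed by reading StandardOutput as before.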
Note that I've tested this with a bat script instead of VBScript, but if VBScript does support UTF-8, it should work just fine. Also, you may have to explicitly call the VBScript execution engine instead of just yourScript.vbs. But you should be able to resolve this easily on your own :)
That's the assumption that is getting you in trouble here: it just isn't UTF-8. Nor can it be; the scripting engine doesn't support setting it. Something you can try for yourself: use this statement in a sample .vbs file:
Kaboom: it only accepts LCID values, and they don't cover the UTF encodings. Instead, the cscript.exe scripting engine already changes the default code page itself. Rather than using the default OEM code page (the OEMCP value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage), it switches to the default Windows code page, the ACP value in that same registry key. Its value depends on your location; it will be 1252, for example, in the Americas and Western Europe.
Here is some VBScript code to play with. Be sure to save the file with the default encoding appropriate for your locale, or the script interpreter itself will misinterpret the strings in the source code, which in itself could explain your problem as well:
Output on my machine:
So the proper line of code in your program should be:
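The line from the original post is not reproduced here. Based on the explanation above, a plausible sketch is to read the redirected streams with the Windows (ANSI) code page. Code page 1252 is used below as an assumption for a Western locale, and the helper name is hypothetical.

```csharp
using System.Diagnostics;
using System.Text;

static class AnsiOutputExample
{
    // Builds start info that decodes the child's output with the ANSI code page.
    public static ProcessStartInfo Create(string command, string arguments)
    {
        // Assumption: the machine's ACP is 1252. On .NET Framework,
        // Encoding.Default returns the ANSI code page of the current machine.
        // On .NET Core / .NET 5+, register CodePagesEncodingProvider first.
        Encoding ansi = Encoding.GetEncoding(1252);

        return new ProcessStartInfo
        {
            FileName = command,
            Arguments = arguments,
            UseShellExecute = false,
            RedirectStandardOutput = true,
            RedirectStandardError = true,
            StandardOutputEncoding = ansi,
            StandardErrorEncoding = ansi
        };
    }
}
```

In the processVB method from the question, the equivalent change would be to assign that encoding to Proc.StartInfo.StandardOutputEncoding and Proc.StartInfo.StandardErrorEncoding in place of Encoding.UTF8.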
Do note that this is not the default that the Process class uses; it assumes that a console mode program uses the OEM code page, such as 437 on a machine in North America and Western Europe. You can pick another LCID in your .vbs program and change your C# code to match, but that should not be necessary.
And do keep in mind the failure mode of having the .vbs source code file encoded wrong. The scripting engine doesn't support UTF-8 with a BOM either, unfortunately.