I receive a base64-encoded string that represents an RTF file.
If I look at the original text representation (before base64 encoding), I see the character sequence F¸r. This should stand for Für when displayed in a viewer. The header of the RTF file contains ansicpg1252, so this should be the encoding unless it is changed elsewhere (escape sequences, font definitions, ...).
My problem is that I can't decode the base64 string back to its original representation. I never get F¸r again. Instead I get Für or even F\'fcr. As a result, the umlaut is displayed incorrectly when the decoded RTF is opened in a viewer.
So what is the original encoding of the rtf-file? Or what is going wrong here?
You can have a look at a sample file here. This is the base64-encoded string I get.
Edit:
I don't have the code that does the encoding, but I think I can reconstruct it. This is my code:
string path = "/some/path/ltxt1 Kopie.rtf";
byte[] document = File.ReadAllBytes(path);
string base64string = Convert.ToBase64String(document);
var isoBytes = Convert.FromBase64String(base64string);
File.WriteAllText ("/some/path/sketch.rtf", System.Text.Encoding.GetEncoding("iso-8859-1").GetString(isoBytes));
I tried to change the encoding, but with windows-1252 I get an error (sketch: encoding name not supported; real project: array not null).
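To narrow this down, here is a minimal sketch that compares the bytes before and after the text step. It assumes the ü is stored as the single byte 0xFC (its Windows-1252 value); the class name RtfRoundTrip and the helper ReencodeAsUtf8 are made up for illustration. It relies on File.WriteAllText(path, text) writing UTF-8 when no encoding is passed, so the helper mimics that step in memory instead of touching the disk.

```csharp
using System;
using System.Linq;
using System.Text;

public static class RtfRoundTrip
{
    // Hypothetical helper: mimics decoding the bytes as iso-8859-1 and then
    // writing the string with File.WriteAllText(path, text), which encodes
    // as UTF-8 (without BOM) when no encoding argument is given.
    public static byte[] ReencodeAsUtf8(byte[] raw)
    {
        string text = Encoding.GetEncoding("iso-8859-1").GetString(raw);
        return new UTF8Encoding(false).GetBytes(text);
    }

    public static void Main()
    {
        // Assumption: "Für" with ü stored as the single byte 0xFC.
        byte[] original = { 0x46, 0xFC, 0x72 };

        // The base64 round trip on its own works on bytes only.
        string base64 = Convert.ToBase64String(original);
        byte[] decoded = Convert.FromBase64String(base64);
        Console.WriteLine(original.SequenceEqual(decoded)); // True

        // The text step turns 0xFC into the two bytes C3 BC.
        Console.WriteLine(BitConverter.ToString(ReencodeAsUtf8(decoded))); // 46-C3-BC-72

        // Writing the decoded bytes directly, for example with
        // File.WriteAllBytes(path, decoded), would skip the text step entirely.
    }
}
```

If that assumption holds for my file, the base64 step is not what changes the umlaut; the bytes only differ after the string is written back as text.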