Reading iso-8859-1 rss feed C# WP7

Posted 2019-05-09 03:32

Question:

I'm trying to read an RSS feed that uses the ISO-8859-1 encoding.

I can get all the elements fine; the problem is that when I put the text in a TextBlock, not all characters are shown. I'm not sure what I'm doing wrong. I've tried a few solutions I found on Google, but they didn't work for me, so I must be missing something. It's also the first time I've really worked with anything other than UTF-16, so I've never had to convert anything before.

The app works as follows: I call DownloadStringAsync on a WebClient, and when it completes I get a string containing the complete RSS feed.

I have also tried getting the bytes and then using Encoding.Convert, but I must be missing something there too.

Here is a sample:

        WebClient RSS = new WebClient();
        RSS.Encoding = Encoding.GetEncoding("ISO-8859-1");
        RSS.DownloadStringCompleted += new DownloadStringCompletedEventHandler(RSS_DSC);
        RSS.DownloadStringAsync(new Uri("some rss feed"));


    public void RSS_DSC(object sender, DownloadStringCompletedEventArgs args)
    {
        _xml = XElement.Parse(args.Result);
        foreach (XElement item in _xml.Elements("channel").Elements("item"))
        {
            feeditem.title = item.Element("title").Value;
            // + all other items
        }
    }

I've tried this as well:

    private void RSS_ORC(object sender, OpenReadCompletedEventArgs args)
    {
        Encoding e = Encoding.GetEncoding("ISO-8859-1");

        Stream ez = args.Result;

        StreamReader rdr = new StreamReader(ez, e);
        XElement _xml = XElement.Parse(rdr.ReadToEnd());
        feedlist = new List<Code.NewsItem>();

        XNamespace dc = "http://purl.org/dc/elements/1.1/";
        foreach (XElement item in _xml.Elements("channel").Elements("item"))
        {

            Code.NewsItem feeditem = new Code.NewsItem();
            feeditem.title = item.Element("title").Value;
            feeditem.description = item.Element("description").Value;
            feeditem.pubdate = item.Element("pubDate").Value;
            feeditem.author = item.Element(dc + "creator").Value;

            feedlist.Add(feeditem);
        }
        listBox1.ItemsSource = feedlist;
    }

The titles still contain characters that are not displayed correctly, though. I can only get the encoding to work partially: instead of the expected characters I see a square with a question mark, a plain question mark, or a single square.

Don't get me wrong, I'm a total beginner at this. But the solutions that have been posted on the web do not solve it for me.

Note that I removed the encoding part because it wasn't working :/ If someone could help me, that would be amazing.

Answer 1:

You can specify an encoding by setting the Encoding property before calling client.DownloadStringAsync:

    webClient.Encoding = Encoding.GetEncoding("iso-8859-1");

In your code sample you do not create the XML document anywhere. Is some code missing? Since args.Result already holds the XML text (XDocument.Load would treat a string as a URI), you should initialize it with something like:

    var xml = XDocument.Parse(args.Result);


Answer 2:

If it helps, you can use:

    var myString = HttpUtility.HtmlDecode(feeditem.description);

This way every special character will be decoded, and you can then display myString correctly.
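
To illustrate what HTML decoding does and does not fix, here is a small sketch. It uses `System.Net.WebUtility.HtmlDecode` instead of `HttpUtility.HtmlDecode`, which is an assumption made only so the snippet runs on a desktop .NET runtime; the class name `EntityDemo` and the sample string are hypothetical.

```csharp
using System;
using System.Net;

public static class EntityDemo
{
    // Turns HTML entities such as &#233; or &amp; back into characters.
    public static string Decode(string html)
    {
        return WebUtility.HtmlDecode(html);
    }
}
```

Calling `EntityDemo.Decode("caf&#233; &amp; th&#233;")` should give back the text with the accented letters and the ampersand restored. Note that this only helps when the feed escapes its special characters as entities; it does not repair text that was read with the wrong byte encoding.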



Answer 3:

Windows Phone 7 and Silverlight do not support other encodings such as ISO-8859-1; they only support ASCII and the Unicode encodings. For anything else you will need to use OpenReadAsync to get a stream of bytes and then apply your own implementation of an encoding.

This blog might be helpful to you in creating one.
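
For ISO-8859-1 specifically, a hand-rolled decoder can be very small, because each byte value maps to the Unicode code point with the same number. The following is only a sketch under that assumption, not the implementation from the blog; the names `Latin1`, `Decode` and `ReadAll` are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class Latin1
{
    // ISO-8859-1 maps each byte 0x00-0xFF to the Unicode code point
    // with the same value, so decoding is a plain widening cast.
    public static string Decode(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length);
        foreach (byte b in bytes)
        {
            sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Reads the stream to its end and decodes all bytes as ISO-8859-1.
    public static string ReadAll(Stream stream)
    {
        using (var buffer = new MemoryStream())
        {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
            {
                buffer.Write(chunk, 0, read);
            }
            return Decode(buffer.ToArray());
        }
    }
}
```

In an OpenReadCompleted handler you could then pass `Latin1.ReadAll(args.Result)` to `XElement.Parse` in place of the `StreamReader` call.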



Answer 4:

ISO-8859-1 most definitely is supported in WP7. It is the only one of the ISO-8859-* encodings that is. I use an XmlReader to deserialize RSS streams, and UTF-* and ISO-8859-1 are the only encodings supported by that class (windows-* and ISO-8859-2 and above throw exceptions in the XmlReader constructor).

Try using an XmlReader like this (without specifying the encoding):

 using (XmlReader reader = XmlReader.Create(stream))
 {
     ...
 }

The XmlReader will get the encoding from the xml declaration in the stream.
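
As a rough, self-contained illustration of that behaviour, the sketch below builds a tiny ISO-8859-1 feed in memory and reads its first title through an XmlReader without naming an encoding anywhere in the reading code. The class `FeedTitles`, its methods and the sample feed are all hypothetical, and the snippet assumes a desktop .NET runtime rather than WP7.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class FeedTitles
{
    // Returns the title of the first <item>; the reader picks the
    // encoding from the XML declaration at the start of the stream.
    public static string FirstTitle(Stream stream)
    {
        using (XmlReader reader = XmlReader.Create(stream))
        {
            XDocument doc = XDocument.Load(reader);
            return doc.Root.Element("channel").Element("item").Element("title").Value;
        }
    }

    // Builds a minimal feed as raw ISO-8859-1 bytes (one byte per char).
    public static byte[] SampleFeed()
    {
        string xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" +
                     "<rss version=\"2.0\"><channel><item>" +
                     "<title>caf\u00e9</title>" +
                     "</item></channel></rss>";
        byte[] bytes = new byte[xml.Length];
        for (int i = 0; i < xml.Length; i++)
        {
            bytes[i] = (byte)xml[i]; // every char in this sample is below 0x100
        }
        return bytes;
    }
}
```

`FeedTitles.FirstTitle(new MemoryStream(FeedTitles.SampleFeed()))` should return the title with its accented character intact, even though the byte 0xE9 on its own is not valid UTF-8.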

You may still have problems displaying the upper half of the characters (above 0x80). I had this problem in feed me (my WP7 app) and used this little hack to fix things up:

    public static string EncodeHtml(string text)
    {
        if (text == null) return string.Empty;

        StringBuilder decodedText = new StringBuilder();
        foreach (char value in text)
        {
            int i = (int)value;
            if (i > 127)
            {
                decodedText.Append(string.Format("&#{0};", i));
            }
            else
            {
                decodedText.Append(value);
            }
        }
        return decodedText.ToString();
    }

It only works in a WebBrowser control of course, but that is the only place that I ever saw an incorrect display.

Hope this helps, Calum



Answer 5:

This worked for me when I needed to decode the RSS XML. It's generic enough that it will support all encodings supported by .NET:

        WebClient wcRSSFeeds = new WebClient();
        String rssContent;

        // Support for international chars
        Encoding encoding = wcRSSFeeds.Encoding;
        if (encoding != null)
        {
            encoding = Encoding.GetEncoding(encoding.BodyName);
        }
        else
        {
            encoding = Encoding.UTF8;  // set to standard if none given 
        }
        Stream stRSSFeeds = wcRSSFeeds.OpenRead(feedURL); // feedURL is a string eg, "http://blah.com"

        using (StreamReader srRSSFeeds = new StreamReader(stRSSFeeds, encoding, false))
        {
            rssContent = srRSSFeeds.ReadToEnd();
        }