
Uncompressing xml feed

Posted 2019-09-05 03:33

Question:

My application downloads a zipped XML file from the web and tries to create an XML reader from it:

var fullReportUrl = "http://..."; // valid url here
//client below is an instance of HttpClient
var fullReportResponse = client.GetAsync(fullReportUrl).Result;

var zippedXmlStream = fullReportResponse.Content.ReadAsStreamAsync().Result;

XmlReader xmlReader = null;
using(var gZipStream = new GZipStream(zippedXmlStream, CompressionMode.Decompress)) 
{
    try 
    {
        xmlReader = XmlReader.Create(gZipStream, settings);
    } 
    catch (Exception xmlEx) 
    {

    }
}

When I try to create the XML reader, I get an error:

"The magic number in GZip header is not correct. Make sure you are passing in a GZip stream."

When I open the URL in a browser, I successfully download a zip file with a well-formed XML file in it. My OS is able to unzip it without any issues. I examined the first two characters of the downloaded file and they appear to be 'PK', which is consistent with the ZIP format.
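The check described above can be sketched in code. This is only a minimal illustration, assuming the first bytes of the download are available as a byte array; the class and method names (ArchiveSniffer, DetectFormat) are made up for this example. It relies on two well-known signatures: gzip data starts with the bytes 0x1F 0x8B, while ZIP archives start with the ASCII letters 'P' 'K'.

```csharp
using System;

// Hypothetical helper, not part of the original question.
public static class ArchiveSniffer
{
    // Returns "gzip", "zip", or "unknown" based on the first two bytes.
    public static string DetectFormat(byte[] header)
    {
        if (header == null || header.Length < 2)
            return "unknown";

        // gzip streams begin with the magic number 0x1F 0x8B.
        if (header[0] == 0x1F && header[1] == 0x8B)
            return "gzip";

        // ZIP archives begin with the ASCII letters 'P' and 'K'.
        if (header[0] == (byte)'P' && header[1] == (byte)'K')
            return "zip";

        return "unknown";
    }
}
```

For a file that begins with 'PK', DetectFormat returns "zip", which would be consistent with GZipStream rejecting the magic number.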

I might be missing a step in stream transformations. What am I doing wrong?

Answer 1:

You don't need to use GZipStream to decompress a gzip- or deflate-encoded HTTP response with HttpClient. You can set HttpClientHandler.AutomaticDecompression to make HttpClient decompress the response automatically for you.

HttpClientHandler handler = new HttpClientHandler()
{
    // both gzip and deflate
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
};

using (var client = new HttpClient(handler))
{
    var fullReportResponse = client.GetAsync(fullReportUrl).Result;
}

Edit 1:

Web servers don't compress every response. They first check the request's Accept-Encoding header. If the header is set to something like Accept-Encoding: deflate, gzip;q=1.0, *;q=0.5, the web server knows the client supports gzip or deflate, so it may (depending on the application logic or server configuration) compress the output with gzip or deflate. In your scenario I don't think you have set the Accept-Encoding header, so the response will be returned uncompressed. I still recommend trying the code above.
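To illustrate the header negotiation described above, here is a rough sketch of setting Accept-Encoding by hand on a request. The class name EncodingRequest and the example URL are assumptions made for this illustration, not something from the question. As far as I know, setting AutomaticDecompression on the handler normally sends this header for you, so this is only meant to show what the server sees.

```csharp
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper for illustration only.
public static class EncodingRequest
{
    // Builds a GET request that advertises gzip and deflate support.
    public static HttpRequestMessage Build(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);

        // Preferred encoding: gzip, with no explicit quality value.
        request.Headers.AcceptEncoding.Add(new StringWithQualityHeaderValue("gzip"));

        // Lower-priority alternative: deflate with quality 0.5.
        request.Headers.AcceptEncoding.Add(new StringWithQualityHeaderValue("deflate", 0.5));

        return request;
    }
}
```

If the server does compress the body, the response's Content.Headers.ContentEncoding collection should name the encoding it applied. Without AutomaticDecompression, the body would then still have to be decompressed manually.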

Read more about accept-encoding on MDN