Why chunk is returning some code while reading the

Posted 2019-08-29 23:45

Question:

Further to my earlier question, we are generating XML using the following code:

download_xml('GET', []) ->
    Xml = generateXML(123445),
    %% Xml is the generated XML as a string; it does not contain the values 400, etc.
    Filename = export_xml:get_file_name(?SESSION_ID1, ?SESSION_ID2),
    Filepath = "./priv/static/" ++ Filename,
    TotalSize = filelib:file_size(Filepath),
    {ok, FP} = file:open(Filepath, [read]),
    Generator = fun(FH) ->
                    case file:read(FH, 1024) of %% This is the line that produces the unwanted output.
                        eof ->
                            file:close(FH),
                            done;
                        {ok, Data} ->
                            {output, Data, FH}
                    end
                end,
    {stream, Generator, FP, [
        {"Content-Type", "application/force-download"},
        {"Content-Disposition", "attachment; filename=" ++ Filename},
        {"Content-length", TotalSize}
    ]}.

We read the file in chunks with file:read(FH, 1024). However, this also puts values such as 400, 400, 3b2 in front of each chunk. We have observed that these values are the hexadecimal sizes of the chunks. Here is the sample XML:

sample.xml

400
<?xml version="1.0" encoding="UTF-8"?>.....</info><inf
400
tel>4444</tel>...<address></address>
3b2
<name> Abc</name><surname>EFg</surname><city>XYZ</city>....
</DATA>
0
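The values in the sample can be checked with a little arithmetic: 400 in hexadecimal is 1024, and 3b2 is 946. Below is a minimal Erlang sketch of that check. The module name chunk_sizes and the function hex_sizes/2 are made up for illustration, and the 2994-byte total is only inferred from the sample (1024 + 1024 + 946), not a figure measured on the real file.

```erlang
-module(chunk_sizes).
-export([hex_sizes/2]).

%% Returns the size, as a lowercase hexadecimal string, of each chunk
%% produced when TotalSize bytes are read ChunkSize bytes at a time.
%% Hypothetical helper, not part of the original download_xml/2 code.
hex_sizes(TotalSize, ChunkSize) when TotalSize > 0, ChunkSize > 0 ->
    Size = min(TotalSize, ChunkSize),
    Hex = string:lowercase(integer_to_list(Size, 16)),
    [Hex | hex_sizes(TotalSize - Size, ChunkSize)];
hex_sizes(0, _ChunkSize) ->
    [].
```

Calling chunk_sizes:hex_sizes(2994, 1024) returns ["400","400","3b2"], the same sequence that shows up in the sample above.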

When we change the chunk size from 1024 to 2048 (i.e. file:read(FH, 2048)), the values change as well, to 808, 365, 0.

What we don't understand is this: while the file contents are streamed in chunks, the size of each chunk is written into the XML first, followed by the chunk itself.

Here is a small XML file we wanted to generate (93 bytes):

<?xml version="1.0">
<info>
<name> Abc</name>
<surname>EFg</surname>
<city>XYZ</city>
</info>

After generating it, we get this output:

5d
<?xml version="1.0">
<info>
<name> Abc</name>
<surname>EFg</surname>
<city>XYZ</city>
</info>
0 

5d = 93 (the chunk size), which in this case is the size of the whole file.
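For comparison, a hexadecimal size line before each piece of data and a closing 0 is also the layout that HTTP/1.1 chunked transfer encoding uses on the wire. Whether the framework is applying that encoding here is an assumption, not something the code above confirms. A minimal sketch of that framing, with a hypothetical module chunk_frame:

```erlang
-module(chunk_frame).
-export([frame/1, last_chunk/0]).

%% Frames one non-empty binary the way HTTP/1.1 chunked transfer
%% encoding does: size in hexadecimal, CRLF, the data, CRLF.
%% Hypothetical helper for illustration only.
frame(Data) when is_binary(Data), byte_size(Data) > 0 ->
    Size = string:lowercase(integer_to_list(byte_size(Data), 16)),
    iolist_to_binary([Size, "\r\n", Data, "\r\n"]).

%% The zero-length chunk that terminates a chunked body.
last_chunk() ->
    <<"0\r\n\r\n">>.
```

For example, chunk_frame:frame(<<"hello">>) returns <<"5\r\nhello\r\n">>, and a 93-byte binary would be prefixed with 5d. If the saved file contains these size lines, it may mean the client wrote the raw framed body to disk instead of decoding it.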

The question is:

  • Why is the size written before each chunk while streaming the file with Generator?

NOTE - We also tried removing the {"Content-length", TotalSize} header from the list, but that did not work :(

Answer 1:

I saw an exchange on the Erlang bugs mailing list that looks related to your problem: "Misleading docs or implementation of file:read/2 and friends".

It seems that file:read/2 does not behave entirely cleanly with the utf8 option. They recommend using io:read/3 instead, but I don't see how to deal with chunks and potential newlines that way.
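One way to check whether encoding handling plays any part is to read the file in raw binary mode, where file:read/2 returns plain bytes and byte_size/1 gives the exact size of each chunk. This is not what the mailing list thread recommends, just a minimal sketch for inspecting the sizes. The module name read_chunks is hypothetical.

```erlang
-module(read_chunks).
-export([chunk_sizes/2]).

%% Opens Path in raw binary mode and returns the byte size of every
%% chunk read, in order. Hypothetical helper for inspection only.
chunk_sizes(Path, ChunkSize) ->
    {ok, FH} = file:open(Path, [read, binary, raw]),
    try
        loop(FH, ChunkSize, [])
    after
        file:close(FH)
    end.

%% Reads until eof, collecting the size of each chunk.
loop(FH, ChunkSize, Acc) ->
    case file:read(FH, ChunkSize) of
        {ok, Data} -> loop(FH, ChunkSize, [byte_size(Data) | Acc]);
        eof -> lists:reverse(Acc)
    end.
```

If the sizes reported here match the hexadecimal values seen in the download, the numbers come from how the chunks are sent rather than from how the file is read.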