PHP: RFC-2231 How to encode UTF-8 String as Conten

Posted 2019-01-28 03:37

Scenario: (in PHP) I have a form submission with a UTF-8 encoded string ($name) to support international characters. Upon submitting the form (via GET), I am creating a CSV download file. I want the name of the file to be that string + .csv ("$name.csv"). For a western character set I can do this just fine by doing:

header("Content-Disposition: attachment; filename=\"$name\"");

But for other character sets, the download file's name is garbage letters + .csv (such as ×œ×œ× ×›×•×ª×¨×ª.csv). I am trying to follow RFC 2231 to do something like:

header("Content-Disposition: attachment; filename*=UTF-8''$name");

But I seem to have a couple of problems:

  1. Browser seems to ignore the "filename" part of the header. Is my format right?
  2. I need to percent-encode $name, writing each octet in hexadecimal, like "This%20is%20%2A%2A%2Afun%2A%2A%2A". Does anyone have a function to do this properly? I coded the following, but I don't think it is right:

    $fileName = encodeWordRfc2231($name) . ".csv";
    header("Content-Disposition: attachment; filename*=UTF-8''$fileName");
    
    function &encodeWordRfc2231($word) {
        $binArray = unpack("C*", $word);
        foreach ($binArray as $chr) {
            $hex_ary[] = '%' . sprintf("%02X", base_convert($chr, 2, 16));
        }
        return implode('', $hex_ary);
    }
    

Does anyone out there have experience with this and can set me on the right path?

1 answer
beautiful°
Reply #2 · 2019-01-28 04:02

It is enough to percent-encode the file name according to RFC 3986 by using rawurlencode().

So all you need to do is change the header() line to:

header("Content-Disposition: attachment; filename*=UTF-8''".rawurlencode($name));
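As a rough illustration of why the hand-written encoder in the question goes wrong: unpack("C*", ...) already returns each byte as a decimal integer, so passing it through base_convert($chr, 2, 16) misreads it as a binary string. The sketch below shows a corrected variant (the name encodeEveryOctet is hypothetical, used only for comparison) next to rawurlencode(), which is the simpler choice.

```php
<?php
// Hypothetical corrected variant of the question's encoder, for comparison only.
function encodeEveryOctet(string $word): string
{
    $out = '';
    foreach (unpack('C*', $word) as $byte) {
        // unpack() yields integers 0-255, so format each one directly as two hex digits.
        $out .= sprintf('%%%02X', $byte);
    }
    return $out;
}

$sample = 'This is ***fun***';

// rawurlencode() leaves letters, digits, "-", "_", "." and "~" alone and escapes the rest.
echo rawurlencode($sample), "\n";                    // This%20is%20%2A%2A%2Afun%2A%2A%2A

// Escaping every octet is longer but decodes to the same string.
echo rawurldecode(encodeEveryOctet($sample)), "\n";  // This is ***fun***
```

Both forms decode to the same bytes; rawurlencode() simply produces shorter, more readable output.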

To answer the questions directly:

  1. The format is right but the text inside $name needs to be encoded with rawurlencode().
  2. rawurlencode() does the trick.
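Since the question mentions that the browser seemed to ignore the header, one commonly suggested variation is to send both a plain filename parameter as an ASCII fallback and the filename* parameter. This is a minimal sketch under that assumption, not part of the original answer; the helper name contentDispositionAttachment and the sample value of $name are hypothetical.

```php
<?php
// Hypothetical helper: builds a Content-Disposition value with an ASCII
// fallback (filename) plus the percent-encoded UTF-8 form (filename*).
function contentDispositionAttachment(string $fileName): string
{
    // Fallback for clients that do not understand filename*: replace every
    // non-printable or non-ASCII byte, double quote, and backslash with "_".
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/', '_', $fileName);

    // Extended parameter: charset, empty language tag, percent-encoded octets.
    $encoded = rawurlencode($fileName);

    return "attachment; filename=\"$fallback\"; filename*=UTF-8''$encoded";
}

// Assumed sample input; in the question this comes from the submitted form.
$name = 'résumé 2019';
header('Content-Disposition: ' . contentDispositionAttachment("$name.csv"));
```

Clients that understand filename* are expected to prefer it over filename, so the fallback only matters for older software.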