CURL doesn't encode UTF-8

Posted 2019-06-25 13:25

Question:

I'm using Windows 10 and curl 7.52.1. When I try to POST data to a web service, curl doesn't encode the characters as UTF-8 (I need to display pt-BR characters such as àáçÇãõ).

Yes, I have already checked this, with no success.

If I set the code page with chcp 65001, the error persists. Changing to chcp 1252 solved the problem only partially.

For example, if I run echo Administração >> test.txt at the prompt without changing the code page, I get Administra‡Æo.

After changing to chcp 65001 I get Administração.

After changing to chcp 1252 I finally get Administração.
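
The results above come down to which bytes the console writes for ç and ã, and which encoding the viewer assumes when reading them back. Here is a minimal sketch in JavaScript, assuming Node.js (its global Buffer is not part of the original post), that prints the bytes for the two encodings Buffer handles directly. For comparison, code page 850, which is probably (an assumption) the default console code page on this machine, encodes ç and ã as 0x87 and 0xC6, and those two bytes shown as Windows-1252 are ‡ and Æ, which matches the first result.

```javascript
// Sketch only: assumes Node.js (global Buffer). Names are illustrative.
const word = "Administra\u00e7\u00e3o"; // "Administração", escaped so the source file encoding does not matter

// Hex bytes of a string in a given Buffer encoding.
function hexOf(text, encoding) {
  return Buffer.from(text, encoding).toString("hex");
}

const utf8Hex = hexOf("\u00e7\u00e3", "utf8");     // two bytes per letter
const latin1Hex = hexOf("\u00e7\u00e3", "latin1"); // one byte per letter

console.log("UTF-8  :", utf8Hex);
console.log("Latin-1:", latin1Hex);

// UTF-8 bytes read back through a one-byte-per-character encoding
// turn each accented letter into two characters.
const utf8SeenAsLatin1 = Buffer.from(word, "utf8").toString("latin1");
console.log(utf8SeenAsLatin1);
```

The same letter therefore has a different byte representation in each code page, so a file written under one setting can look wrong when opened under another.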

But with curl, nothing changes.

I've tried setting a Content-Type header, with no luck:

curl -X POST -H "Content-Type: text/plain; charset=UTF-8" --data-ascii "name=Administração" http://localhost:8084/ws/departments

I get the following output:

{"holder":{"entities":[{"name":"Administra��o","dateReg":"Dec 29, 2016 2:05:33 PM"}],"sm":{}},"message":{"text":""},"status":-1}
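
The two � characters are U+FFFD, the Unicode replacement character, which a UTF-8 decoder emits for bytes it cannot interpret. One plausible explanation (an assumption, nothing in the thread confirms it) is that curl sent ç and ã as single bytes from the console code page while the header declared UTF-8. A minimal sketch, assuming Node.js, reproduces the pattern:

```javascript
// Sketch only: assumes Node.js. Simulates a body whose bytes are
// single-byte (Latin-1 / Windows-1252 style) but are decoded as UTF-8.
const sentText = "name=Administra\u00e7\u00e3o"; // "name=Administração"

// Bytes as a single-byte code page would produce them: ç -> 0xE7, ã -> 0xE3.
const bodyBytes = Buffer.from(sentText, "latin1");

// The receiver trusts "charset=UTF-8" and decodes accordingly.
// 0xE7 and 0xE3 are incomplete UTF-8 sequences here, so each becomes U+FFFD.
const decodedAsUtf8 = bodyBytes.toString("utf8");

console.log(decodedAsUtf8);
console.log(decodedAsUtf8.includes("\uFFFD"));
```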

I have also checked that the WS accepts this character encoding. When I run the following (in jQuery):

$.ajax({
     url:"http://localhost:8084/ws/departments",
     type:"POST",
     data: {name: "Administração"},
     success: function(data, textStatus, xhr){
       console.log(data);
     }
});

I get the expected output:

{"holder":{"entities":[{"name":"Administração","dateReg":"Dec 29, 2016 2:03:17 PM"}],"sm":{}},"message":{"text":""},"status":-1}
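
A likely reason the jQuery request works is that, when data is a plain object, jQuery serializes it as application/x-www-form-urlencoded and percent-encodes non-ASCII characters as UTF-8, so the body on the wire is pure ASCII. The sketch below uses the standard URLSearchParams class as a stand-in for jQuery's serializer (a substitution for illustration only; for this input it yields the same string).

```javascript
// Sketch only: URLSearchParams stands in for jQuery's form serializer.
const form = new URLSearchParams({ name: "Administra\u00e7\u00e3o" }); // "Administração"
const formBody = form.toString();

console.log(formBody);

// Every character in the body is ASCII, so no console code page can alter it.
const isAscii = /^[\x00-\x7F]*$/.test(formBody);
console.log(isAscii);
```

Because that string contains only ASCII, it could in principle be passed to curl's --data option unchanged, although that was not tried in this thread.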

I don't know what else I can try to solve this. Could you please help me?

Thanks in advance.

UPDATE

As suggested by @Dekel, I also tried sending an external file with --data-binary (the content of test.txt is name=Administração):

curl -i -X POST -H "Content-Type: text/plain; charset=UTF-8" --data-binary "@test.txt" http://localhost:8084/ws/departments

I still get this unusual output:

{"holder":{"entities":[{"name":"Administra��o","dateReg":"Dec 29, 2016 2:41:27 PM"}],"sm":{}},"message":{"text":""},"status":-1}
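
Since --data-binary sends the file's bytes unchanged, the outcome depends on how test.txt was saved. If it was written by echo under chcp 1252, it would hold single-byte characters rather than UTF-8. A minimal sketch, assuming Node.js and a hypothetical helper named isValidUtf8, shows one way to check a byte sequence before posting it:

```javascript
// Sketch only: isValidUtf8 is a hypothetical helper, not part of curl or the WS.
function isValidUtf8(bytes) {
  try {
    // fatal: true makes decode() throw instead of inserting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return true;
  } catch (err) {
    return false;
  }
}

const fileText = "name=Administra\u00e7\u00e3o"; // "name=Administração"
const savedAsUtf8 = Buffer.from(fileText, "utf8");         // what a UTF-8 editor would write
const savedAsSingleByte = Buffer.from(fileText, "latin1"); // what a single-byte code page would write

console.log(isValidUtf8(savedAsUtf8));
console.log(isValidUtf8(savedAsSingleByte));
```

With a real file, the bytes could be obtained with fs.readFileSync("test.txt") and passed to the same helper.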

UPDATE 2

@Phylogenesis suggested using charset=ISO-8859-1. I noticed that, even with Administração returned as the result, a closer check on the server side shows that the WS is receiving the exact letter, in this case ç.

Answer 1:

After a discussion with @Dekel and a suggestion from @Phylogenesis, I was able to solve the problem partially, but effectively. There are two ways:

  • charset=ISO-8859-1
  • encoding a file as UTF-8 and sending it with --data-binary

With charset=ISO-8859-1 the server receives the correct letter, even though the response data from the server is still displayed incorrectly.

I use: curl -i -X POST -H "Content-Type: text/plain; charset=ISO-8859-1" --data-ascii "name=Administração" http://localhost:8084/ws/departments

The second way is to save a file containing all the content you want to POST as UTF-8. I used Notepad++ > Format > Convert to UTF-8 (Without BOM).

Then run: curl -i -X POST -H "Content-Type: text/plain; charset=UTF-8" --data-binary "@test.txt" http://localhost:8084/ws/departments
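
Both workarounds have the same effect: the charset declared in the header matches the bytes actually sent. The sketch below assumes Node.js and uses a hypothetical decodeBody function as a stand-in for whatever the WS does internally (the real server code is not shown in this thread). For ç and ã, ISO-8859-1 and Windows-1252 use the same byte values, which is why console input under chcp 1252 fits the first pairing.

```javascript
// Sketch only: decodeBody is hypothetical; the real WS decoding is not shown in the thread.
function decodeBody(bytes, charset) {
  // Map the HTTP charset label to a Node.js Buffer encoding name.
  const encodings = { "utf-8": "utf8", "iso-8859-1": "latin1" };
  const encoding = encodings[charset.toLowerCase()];
  if (!encoding) throw new Error("unsupported charset: " + charset);
  return bytes.toString(encoding);
}

const original = "name=Administra\u00e7\u00e3o"; // "name=Administração"

// Way 1: single-byte data typed at the console, declared as ISO-8859-1.
const way1 = decodeBody(Buffer.from(original, "latin1"), "ISO-8859-1");

// Way 2: a file saved as UTF-8, declared as UTF-8.
const way2 = decodeBody(Buffer.from(original, "utf8"), "UTF-8");

console.log(way1 === original, way2 === original);
```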