What text-encoding should I use for my CURL to a g

Posted 2019-08-21 07:20

Question:

Google Spreadsheets lets you create forms that post data to a spreadsheet. This does not go through the Google API.

I'm using the following code to post data to the form:

<?php
$ch = curl_init();
curl_setopt ($ch, CURLOPT_URL,$googleformURL);
curl_setopt ($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt ($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6");
curl_setopt ($ch, CURLOPT_TIMEOUT, 60);
curl_setopt ($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($ch, CURLOPT_POSTFIELDS, $postdata);
curl_setopt ($ch, CURLOPT_POST, 1);
$data = curl_exec ($ch);
curl_close($ch);
//echo $data;
//Redirect to your thank you page
header( "Location: $thankyou" ) ;
?>

This works fine; however, when I look at the data in Google Spreadsheets, all special characters, such as ÅÄÖ, are missing. I'm guessing this is an encoding error. How should I change the code so that this works? Below is a sample of the original form.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"><html><head><link rel="shortcut icon" href="//ssl.gstatic.com/docs/spreadsheets/forms/favicon_jfk2.png" type="image/x-icon">
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<title>Untitled form</title>

<link href='/static/spreadsheets/client/css/779923916-published_form_compiled.css' type='text/css' rel='stylesheet'>
</head>
<body class="ss-base-body" dir="ltr" itemscope itemtype="http://schema.org/CreativeWork/FormObject"><meta itemprop="name" content="Untitled form">



<meta itemprop="embedUrl" content="https://docs.google.com/spreadsheet/embeddedform?formkey=dGJ3ZTdNQ0xwQUJKeGp0dVpDbElVTHc6MQ">
<meta itemprop="faviconUrl" content="//ssl.gstatic.com/docs/spreadsheets/forms/favicon_jfk2.png">

<div class="ss-form-container">
<div class="ss-form-heading"><h1 class="ss-form-title">Untitled form</h1>
<p></p>


<hr class="ss-email-break" style="display:none;">
</div>
<div class="ss-form"><form action="https://docs.google.com/spreadsheet/formResponse?formkey=dGJ3ZTdNQ0xwQUJKeGp0dVpDbElVTHc6MQ&amp;ifq" method="POST" id="ss-form">


<br>
<div class="errorbox-good">
<div class="ss-item  ss-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_0">Namn
</label>
<label class="ss-q-help" for="entry_0"></label>
<input type="text" name="entry.0.single" value="" class="ss-q-short" id="entry_0"></div></div></div>
<br> <div class="errorbox-good">
<div class="ss-item  ss-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_1">Gatuadress
</label>
<label class="ss-q-help" for="entry_1"></label>
<input type="text" name="entry.1.single" value="" class="ss-q-short" id="entry_1"></div></div></div>
<br> <div class="errorbox-good">
<div class="ss-item  ss-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_2">Postnummer
</label>
<label class="ss-q-help" for="entry_2"></label>
<input type="text" name="entry.2.single" value="" class="ss-q-short" id="entry_2"></div></div></div>
<br> <div class="errorbox-good">
<div class="ss-item  ss-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_3">Ort
</label>
<label class="ss-q-help" for="entry_3"></label>
<input type="text" name="entry.3.single" value="" class="ss-q-short" id="entry_3"></div></div></div>
<br> <div class="errorbox-good">
<div class="ss-item  ss-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_4">E-post
</label>
<label class="ss-q-help" for="entry_4"></label>
<input type="text" name="entry.4.single" value="" class="ss-q-short" id="entry_4"></div></div></div>
<br> <div class="errorbox-good">
<div class="ss-item  ss-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_5">Sort
</label>
<label class="ss-q-help" for="entry_5"></label>
<input type="text" name="entry.5.single" value="" class="ss-q-short" id="entry_5"></div></div></div>
<br> <div class="errorbox-good">
<div class="ss-item  ss-paragraph-text"><div class="ss-form-entry"><label class="ss-q-title" for="entry_7">Story
</label>
<label class="ss-q-help" for="entry_7"></label>
<textarea name="entry.7.single" rows="8" cols="75" class="ss-q-long" id="entry_7"></textarea></div></div></div>
<br>
<input type="hidden" name="pageNumber" value="0">
<input type="hidden" name="backupCache" value="">


<div class="ss-item ss-navigate"><div class="ss-form-entry">
<input type="submit" name="submit" value="Submit"></div></div></form>
<script type="text/javascript">

      (function() {
var divs = document.getElementById('ss-form').
getElementsByTagName('div');
var numDivs = divs.length;
for (var j = 0; j < numDivs; j++) {
if (divs[j].className == 'errorbox-bad') {
divs[j].lastChild.firstChild.lastChild.focus();
return;
}
}
for (var i = 0; i < numDivs; i++) {
var div = divs[i];
if (div.className == 'ss-form-entry' &&
div.firstChild &&
div.firstChild.className == 'ss-q-title') {
div.lastChild.focus();
return;
}
}
})();
      </script></div>
<div class="ss-footer"><div class="ss-attribution"></div>
<div class="ss-legal"><span class="ss-powered-by">Powered by <a href="http://docs.google.com">Google Docs</a></span>
<span class="ss-terms"><small><a href="https://docs.google.com/spreadsheet/reportabuse?formkey=dGJ3ZTdNQ0xwQUJKeGp0dVpDbElVTHc6MQ&amp;source=https://docs.google.com/spreadsheet/viewform?formkey%3DdGJ3ZTdNQ0xwQUJKeGp0dVpDbElVTHc6MQ">Report Abuse</a>
-
<a href="http://www.google.com/accounts/TOS">Terms of Service</a>
-
<a href="http://www.google.com/google-d-s/terms.html">Additional Terms</a></small></span></div></div></div></body></html>

Answer 1:

If you want to emulate the POST request that the HTML form makes, you need to ensure that:

  • The Content-Type header of the request is application/x-www-form-urlencoded. cURL already does this automatically when CURLOPT_POST is set to 1 and CURLOPT_POSTFIELDS is given a string. If you pass an array instead, cURL sends multipart/form-data.
  • Your post data is in the key1=value1&key2=value2 format, where the keys and values are URL-encoded.

In PHP, calling urlencode on each key and each value is enough, provided that they are physically UTF-8 strings. Do not urlencode the & and = characters where they act as delimiters; encode only the keys and values themselves.

The physical encoding of a PHP string depends entirely on where it was obtained from. If you write the string literally in the source file, the encoding is that of the PHP source file, which is determined by the settings of the text editor used to save the file. If the string is retrieved from a database such as MySQL, it depends on the connection encoding.
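If you are not sure what encoding a string arrives in, you can check it before URL-encoding. The following is only a minimal sketch: ensureUtf8 is a hypothetical helper name, it needs the mbstring extension, and it assumes that any input which is not valid UTF-8 is ISO-8859-1, which may not be true for your data. Note also that some ISO-8859-1 byte sequences happen to be valid UTF-8, so this check cannot catch every case.

```php
<?php
// Hypothetical helper: returns $value as UTF-8.
// Assumption: input that is not valid UTF-8 is ISO-8859-1; change
// $assumedSource to whatever your real source encoding is.
function ensureUtf8($value, $assumedSource = 'ISO-8859-1') {
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}

$latin1 = "\xC4"; // "Ä" as a single ISO-8859-1 byte
echo urlencode(ensureUtf8($latin1)), "\n"; // %C3%84
```

Writing the test string as explicit bytes ("\xC4") keeps the example independent of the encoding the source file itself was saved in.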

You can build the postdata string from an array like so:

function arrayToPostData($data) {
    $ret = array();

    foreach( $data as $key => $value ) {
        $ret[] = urlencode($key) . "=" . urlencode($value);
    }

    return implode("&", $ret );
}

Pass it an array of key-value pairs whose strings are UTF-8 encoded:

arrayToPostData(array(
        "ö" => "ä",
        "å" => "aaa"
));

If the strings in the array are physically UTF-8, you get the correct result:

%C3%B6=%C3%A4&%C3%A5=aaa

This is the string you pass to cURL as the post data (CURLOPT_POSTFIELDS).
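As a rough sketch of how this fits the form from the question, PHP's built-in http_build_query produces the same kind of URL-encoded string from an array. The field names below are taken from the sample form; the values are made up, and postForm is a hypothetical wrapper that is defined but not called here.

```php
<?php
// Field names come from the sample form; the values are invented examples,
// written as explicit UTF-8 bytes so the file's own encoding does not matter.
$fields = array(
    "entry.0.single" => "\xC3\x85sa \xC3\x96berg", // "Åsa Öberg"
    "entry.3.single" => "Malm\xC3\xB6",             // "Malmö"
    "pageNumber"     => "0",
);

// http_build_query() URL-encodes each key and value and joins them with "&".
$postdata = http_build_query($fields, '', '&');
echo $postdata, "\n";
// entry.0.single=%C3%85sa+%C3%96berg&entry.3.single=Malm%C3%B6&pageNumber=0

// Hypothetical wrapper (not called here): passing the body as a string makes
// cURL send it as application/x-www-form-urlencoded.
function postForm($url, $postdata) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $postdata);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}
```

The separator is passed explicitly as '&' because the default comes from the arg_separator.output ini setting, which a server configuration could have changed.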