Prerequisites: hunspell and php5.
Test code from bash:
user@host ~/ $ echo 'sagadījās' | hunspell -d lv_LV,en_US
Hunspell 1.2.14
+ sagadīties
This works properly.
Test code (test.php):
$encoding = "lv_LV.utf-8";
setlocale(LC_CTYPE, $encoding); // test
putenv('LANG='.$encoding); // and another test
$raw_response = shell_exec("LANG=$encoding; echo 'sagadījās' | hunspell -d lv_LV,en_US");
echo $raw_response;
returns
Hunspell 1.2.14
& sagad 5 0: tagad, sagad?ties, sagaudo, sagand?, sagar?o
*
*
[Screenshot of the output; it could not be posted as text because it contains invalid characters.]
It seems that shell_exec cannot handle UTF-8 correctly. Or is some additional encoding/decoding needed?
EDIT: I had to use en_US.utf-8 to get valid data.
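One possible explanation for why en_US.utf-8 works where lv_LV.utf-8 does not (an assumption, nothing above confirms it) is that the Latvian UTF-8 locale is simply not installed on the machine running PHP. A minimal sketch for checking that from the shell with `locale -a` follows; the helper names `normalize_locale` and `locale_in_list` are made up for this example.

```shell
#!/bin/sh
# Normalize a locale name so that "lv_LV.UTF-8" and "lv_LV.utf8" compare equal:
# lowercase everything and drop hyphens.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Succeeds if the locale named in $1 appears in the list read from stdin
# (one locale name per line, as printed by `locale -a`).
locale_in_list() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r name; do
        [ "$(normalize_locale "$name")" = "$wanted" ] && return 0
    done
    return 1
}

# Check the two locales from the question against what is installed here.
for loc in lv_LV.utf-8 en_US.utf-8; do
    if locale -a 2>/dev/null | locale_in_list "$loc"; then
        echo "$loc: installed"
    else
        echo "$loc: not installed"
    fi
done
```

If lv_LV.utf-8 is reported as not installed, that would be consistent with the behaviour described in the EDIT, although it does not prove that this is the cause.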
Try this code:
<?php
// The word we are checking
$subject = 'sagadījās';
// We want file pointers for all 3 std streams
$descriptors = array(
    0 => array("pipe", "r"), // STDIN
    1 => array("pipe", "w"), // STDOUT
    2 => array("pipe", "w")  // STDERR
);
// An environment variable
$env = array(
    'LANG' => 'lv_LV.utf-8'
);
// Try and start the process
if (!is_resource($process = proc_open('hunspell -d lv_LV,en_US', $descriptors, $pipes, NULL, $env))) {
    die("Could not start Hunspell!");
}
// Put pipes into sensibly named variables
$stdIn = &$pipes[0];
$stdOut = &$pipes[1];
$stdErr = &$pipes[2];
unset($pipes);
// Write the data to the process and close the pipe
fwrite($stdIn, $subject);
fclose($stdIn);
// Display raw output
echo "STDOUT:\n";
while (!feof($stdOut)) echo fgets($stdOut);
fclose($stdOut);
// Display raw errors
echo "\n\nSTDERR:\n";
while (!feof($stdErr)) echo fgets($stdErr);
fclose($stdErr);
// Close the process pointer
proc_close($process);
?>
Don't forget to verify that the encoding of the file (and therefore the encoding of the data you are passing) actually is UTF-8 ;-)
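One way to do that check from the shell, sketched under the assumption that `iconv` is available: converting a file from UTF-8 to UTF-8 fails on any byte sequence that is not valid UTF-8. The function name `is_utf8` and the temporary file are only for illustration; in practice you would pass test.php.

```shell
#!/bin/sh
# Succeeds if the file named in $1 decodes cleanly as UTF-8.
# iconv exits with a non-zero status when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Demo on a temporary file holding "sagadījās" written as raw UTF-8 bytes
# (ī is C4 AB, ā is C4 81, given here as octal escapes).
tmp=$(mktemp)
printf 'sagad\304\253j\304\201s\n' > "$tmp"
if is_utf8 "$tmp"; then
    echo "valid UTF-8"
else
    echo "not valid UTF-8"
fi
rm -f "$tmp"
```

Note that a pure ASCII file also passes this check, since ASCII is a subset of UTF-8; the check only rules out files saved in a different 8-bit encoding.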