How to handle accents in Common Lisp (SBCL)?

Posted 2019-02-25 00:57

This is probably very basic, but I didn't know where else to ask. I'm trying to process text from a file in a SLIME REPL. The file is written in Portuguese, so it uses lots of accented characters, such as é, á, ô, etc.

When I'm handling texts in English I use the following function:

(defun txt2list (name)
  (with-open-file (in name)
    (let ((res))
      (do ((line (read-line in nil nil)
                 (read-line in nil nil)))
          ((null line)
           (reverse res))
        (push line res))
      res)))

This function cannot read accented characters; it fails with the error "the octet sequence #(195) cannot be decoded."

So my question is: is there a way to handle those characters automatically? It's okay to replace each accented character with the unaccented letter ('á' becomes 'a') or simply to delete such characters ('cômodo' becomes 'cmodo'), whether that is done in the file itself before reading or during the reading process.

1 answer
The star
#2 · 2019-02-25 01:49

You would need to find out what text encoding is used for the file. Then tell WITH-OPEN-FILE to use the correct one.

See the SBCL manual: External Formats

Example:

 (with-open-file (stream pathname :external-format '(:utf-8 :replacement #\?))
   (read-line stream))
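
Applied to the function from the question, this could look roughly like the sketch below. It assumes the file is UTF-8 encoded, which is likely but not confirmed: octet 195 (#xC3) is the first byte of é, á and ô in UTF-8. The `external-format` keyword argument and the `strip-accents` helper are names made up for illustration. `strip-accents` relies on SBCL's `sb-unicode` package, so it is SBCL-specific and needs a reasonably recent SBCL.

```lisp
;; Sketch only: assumes a UTF-8 file and SBCL as the implementation.

(defun txt2list (name &key (external-format :utf-8))
  "Return the lines of the file NAME as a list of strings, in file order."
  (with-open-file (in name :external-format external-format)
    (loop for line = (read-line in nil nil)
          while line
          collect line)))

;; Hypothetical helper, not part of the original answer.
(defun strip-accents (string)
  "Decompose STRING to NFD and drop the combining marks,
so that an accented letter is reduced to its base letter."
  (remove-if-not #'zerop
                 (sb-unicode:normalize-string string :nfd)
                 :key #'sb-unicode:combining-class))
```

With these definitions, something like `(mapcar #'strip-accents (txt2list "file.txt"))` (the file name is a placeholder) would read the file and then remove the accents. If keeping the accented characters is fine, `txt2list` alone is enough, since SBCL strings can hold them once the file is decoded correctly.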