Reading the binary output of an external program i

Posted 2020-06-19 06:16

Question:

I'm trying to run an external program in SBCL and capture its output. The output is binary data (a png image), while SBCL insists on interpreting it as strings.

I tried a number of ways, like

(trivial-shell:shell-command "/path/to/png-generator" :input "some input")

(with-input-from-string (input "some input")
  (with-output-to-string (output)
    (run-program "/path/to/png-generator" () :input input :output output)))


(with-input-from-string (input "some input")
  (flexi-streams:with-output-to-sequence (output)
    (run-program "/path/to/png-generator" () :input input :output output)))

But I get errors like

Illegal :UTF-8 character starting at byte position 0.

It seems to me that SBCL is trying to interpret the binary data as text and decode it. How do I change this behaviour? I'm only interested in obtaining a vector of octets.

Edit: Since it is not clear from the text above, I'd like to add that, at least in the case of flexi-streams, the element-type of the stream is flexi-streams:octet (which is an (unsigned-byte 8)). I would expect run-program to read the raw bytes without trouble at least in this case. Instead I get a message like Don't know how to copy to stream of element-type (UNSIGNED-BYTE 8)

Answer 1:

Edit: I got angry at not being able to do this very simple task and solved the problem.

In practice, the ability to pass a stream with element type (UNSIGNED-BYTE 8) to run-program and have it work correctly is severely limited, for reasons I don't understand. Like you, I tried gray streams, flexi-streams, fd streams, and a few other mechanisms.

However, perusing run-program's source (for the fifth or sixth time), I noticed that there's an option :STREAM you can pass to output. Given that, I wondered if read-byte would work... and it did. For more performant work, one could determine how to get the length of a non-file stream and run READ-SEQUENCE on it.

(let* ((proc-var (sb-ext:run-program "head" '("-c" "10" "/dev/urandom") ; get random bytes
                                     :search t
                                     ;; Do not block until the program exits; a large
                                     ;; output could otherwise fill the pipe before
                                     ;; anything reads it.
                                     :wait nil
                                     ;; Let SBCL figure out the storage type.
                                     ;; This is what solved the problem.
                                     :output :stream
                                     ;; Without this, PROCESS-ERROR returns NIL.
                                     :error :stream))
       ;; Obtain the streams from the process object.
       (output (sb-ext:process-output proc-var))
       (err (sb-ext:process-error proc-var)))
  (values
   ;; Return both stdout and stderr, just for polish.
   ;; Read byte by byte and turn the result into a vector.
   (concatenate 'vector
                ;; A byte with value 0 is *not* NIL, so the loop
                ;; only stops at end of file. Yay for Lisp!
                (loop for byte = (read-byte output nil)
                      while byte
                      collect byte))
   ;; Repeat for stderr.
   (concatenate 'vector
                (loop for byte = (read-byte err nil)
                      while byte
                      collect byte))))


Answer 2:

If you're willing to use some external libraries, this can be done with babel-streams. This is a function I use to safely get content from a program. I use :latin-1 because it maps each of the 256 possible byte values directly to the character with the same code. You could remove the octets-to-string call and keep the octet vector.

If you wanted stderr as well, you could use nested 'with-output-to-sequence' to get both.

;; OCTETS-TO-STRING comes from babel, WITH-OUTPUT-TO-SEQUENCE from babel-streams.
(defun safe-shell (command &rest args)
  (octets-to-string
   (with-output-to-sequence (stream :external-format :latin-1)
     (let ((proc (sb-ext:run-program command args :search t :wait t :output stream)))
       (case (sb-ext:process-status proc)
         (:exited (unless (zerop (sb-ext:process-exit-code proc))
                    (error "Error in command")))
         (t (error "Unable to terminate process")))))
   :encoding :latin-1))
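
The claim that :latin-1 is a safe carrier for arbitrary bytes can be checked with a small round trip. This sketch uses SBCL's built-in sb-ext:octets-to-string and sb-ext:string-to-octets rather than babel, and latin-1-round-trip-p is a hypothetical name chosen for illustration.

```lisp
(defun latin-1-round-trip-p (octets)
  "Return true if decoding OCTETS as Latin-1 and encoding again gives OCTETS back."
  (let* ((text (sb-ext:octets-to-string octets :external-format :latin-1))
         (back (sb-ext:string-to-octets text :external-format :latin-1)))
    (equalp octets back)))

;; The eight-byte PNG signature followed by bytes 0 and 255,
;; none of which forms valid UTF-8 from position 0.
(latin-1-round-trip-p
 (coerce #(137 80 78 71 13 10 26 10 0 255)
         '(vector (unsigned-byte 8))))
```

The call should return true, since Latin-1 assigns a character to every byte value. The same data fails to decode as UTF-8, which matches the error reported in the question.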


Answer 3:

Paul Nathan already gave a pretty complete answer on how to read I/O from a program as binary, so I'll just add why your code didn't work: you explicitly asked SBCL to interpret the I/O as a string of UTF-8 characters by using with-input-from-string and with-output-to-string.

Also, I'd like to point out that you don't need to go as far as run-program's source code to get to the solution. It's clearly documented in SBCL's manual.