Read from initial stdin in Go?

Posted 2019-01-16 16:25

Question:

I would like to read from the original stdin of a Go program. For example, if I ran echo test stdin | go run test.go, I would want to have access to "test stdin". I've tried reading from os.Stdin, but if there's nothing in it, the read waits for input. I also tried checking the size first, but os.Stdin.Stat().Size() is 0 even when input is passed in.

What can I do?

Answer 1:

I think your question per se has no sensible answer because there's simply no such thing as "initial stdin". Unix-like OSes and Windows implement the concept of "standard streams", which works like this (simplified): when a process is created, it automatically has three file descriptors (handles on Windows) open: stdin, stdout and stderr. No doubt you're familiar with this concept, but I'd like to stress the meaning of the word "stream" here. In your example, when you call

$ echo 'test stdin' | ./stdin

the shell creates a pipe, spawns two processes (one for echo and one for your binary) and makes use of the pipe it created: the pipe's write FD is attached to echo's stdout and the pipe's read FD is attached to your binary's stdin. Then whatever the echo process chooses to write to its stdout is piped (sic!) to the stdin of your process. (In reality, most of today's shells implement echo as a built-in primitive, but this does not change the semantics in any way; you could just as well have tried /bin/echo instead, which is a real program. Also note that I used ./stdin to refer to your program for clarity, as go run stdin.go would do exactly this in the end.)

Note several crucial things here:

  • The writing process (echo in your case) is not obliged to write anything to its stdout (for instance, echo -n would write nothing to its stdout and exit successfully).
  • It may also delay writing its data arbitrarily (either because it wants to, or because it has been preempted by the OS, or because it sleeps in some syscall waiting on a busy system resource, etc.).
  • The OS buffers transfers over pipes. This means that what the writing process sends to a pipe might come out in arbitrary chunks on the reading side. [1]
  • There are only two ways to know the writing side has no more data to send over the pipe:
    • Somehow encode this in the data itself (this means using an agreed upon data transfer protocol between the writer and the reader).
    • The writer might close its side of the pipe, which results in the "end of file" condition on the reader side (but only after the buffer is drained and one more read is attempted, which then reports end of file).

Let's wrap this up: the behaviour you're observing is correct and normal. If you expect to get any data from stdin, you must not expect it to be readily available. If you also don't want to block on stdin, then create a goroutine which would do blocking reads from stdin in an endless loop (but checking for the EOF condition) and pass collected data up over a channel (possibly after certain processing, if needed).

[1] This is why certain tools which usually sit between two pipes in a pipeline, such as grep, might have special options to make them flush their stdout after writing each line; read about the --line-buffered option in the grep manual page for one example. People who are not aware of this "full buffering by default" behaviour are puzzled why tail -f /path/to/some/file.log | grep whatever | sed ... seems to stall and display nothing when the monitored file is obviously being updated.


As a side note: if you were to run your binary "as is", like in

$ ./stdin

that would not mean the spawned process has no stdin (or "initial stdin" or whatever). Instead, its stdin would be connected to the same stream your shell receives your keyboard input from (so you could type something directly into your process's stdin).

The only sure way to have a process's stdin connected to nowhere is to use

$ ./stdin </dev/null

on Unix-like OSes and

C:\> stdin <NUL

on Windows. This "null device" makes the process see EOF on the first read from its stdin.



Answer 2:

Reading from stdin using os.Stdin should work as expected:

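package main

import (
    "io"
    "log"
    "os"
)

func main() {
    // io.ReadAll (Go 1.16+; older versions use ioutil.ReadAll) blocks
    // until stdin reaches EOF, then returns everything that was read.
    data, err := io.ReadAll(os.Stdin)
    if err != nil {
        log.Fatal(err)
    }

    log.Println(string(data))
}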

Executing echo test stdin | go run stdin.go should print 'test stdin' just fine.

It would help if you'd attach the code you used to identify the problem you encountered.

For line-based reading you can use bufio.Scanner:

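package main

import (
    "bufio"
    "log"
    "os"
)

func main() {
    s := bufio.NewScanner(os.Stdin)
    // Scan returns false at EOF or on a read error.
    for s.Scan() {
        log.Println("line", s.Text())
    }
    // Err is nil if the loop ended because of EOF.
    if err := s.Err(); err != nil {
        log.Fatal(err)
    }
}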


Answer 3:

You can't check stdin for content, but you can check whether stdin is associated with a terminal or a pipe. IsTerminal just takes the standard Unix fd numbers (0, 1, 2). The syscall package has variables assigned, so you can use syscall.Stdin if you prefer naming them.

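package main

// The old code.google.com/p/go.crypto/ssh/terminal import path no longer
// resolves; golang.org/x/term provides the same IsTerminal(fd int) check.
import (
    "fmt"
    "io"
    "os"

    "golang.org/x/term"
)

func main() {
    // Fd 0 is stdin.
    if !term.IsTerminal(0) {
        // Not a terminal (for example a pipe or a file): read until EOF.
        b, err := io.ReadAll(os.Stdin)
        if err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1)
        }
        fmt.Print(string(b))
    } else {
        fmt.Println("no piped data")
    }
}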


Tags: go