Python: Strange hanging behavior when piping large amounts of data from a subprocess

Published 2019-07-25 16:53

Question:

I am currently calling ffmpeg to extract a binary data stream from a video file, and then putting that binary data into a list. The stream contains a lot of data, about 4,000 kB. Here is the code:

import subprocess

# build the ffmpeg command and pipe its stdout
cmd = "ffmpeg -i video.mpg -map 0:1 -c copy -f data -"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)

# read from stdout, byte by byte (b"" is the EOF sentinel for a binary pipe)
li = []
for char in iter(lambda: proc.stdout.read(1), b""):
    li.append(char)

This works fine. However, if I remove the part that reads from stdout, ffmpeg starts working but then hangs:

import subprocess
import time

cmd = "ffmpeg -i video.mpg -map 0:1 -c copy -f data -"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)
time.sleep(10)

I had to add time.sleep(10) at the end, or else the parent process would exit before the subprocess finished, causing this error:

av_interleaved_write_frame(): Invalid argument
Error writing trailer of pipe:: Invalid argument
size=       0kB time=00:00:00.00 bitrate=N/A speed=N/A
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
Conversion failed!

Calling either subprocess.call(cmd, stdout=subprocess.PIPE) or subprocess.call(cmd) also causes hanging (the latter displays the stdout in the console while the former doesn't).

Is there something about reading from stdout that prevents this hanging (perhaps the buffer being drained), or am I unknowingly introducing a bug elsewhere? I'm worried that such a small change breaks the program; it doesn't inspire much confidence.
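For what it's worth, the hang can be reproduced without ffmpeg. The sketch below (using a Python child process as a hypothetical stand-in for ffmpeg) shows a child blocking because nobody drains the pipe, then unblocking once the parent reads:

```python
import subprocess
import sys
import time

# Stand-in for ffmpeg: a child that writes ~1 MiB to stdout, which is
# more than a typical OS pipe buffer (often 64 KiB on Linux).
child = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; sys.stdout.buffer.write(b'x' * (1 << 20))"],
    stdout=subprocess.PIPE,
)

time.sleep(1)
# Nothing has read from the pipe yet, so the child is blocked inside write()
still_running = child.poll() is None
print("child still running:", still_running)

data = child.stdout.read()  # draining the pipe unblocks the child
child.wait()                # reap it cleanly
print("bytes received:", len(data))
```

Once the parent calls read(), the pipe drains, the child's write() completes, and the child exits normally.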

The other issue with this code is that I need to read from the list from another thread, which probably means I need a Queue. But when I execute the code below, it takes 11 seconds, as opposed to 3 seconds for the list equivalent:

import subprocess
from queue import Queue

cmd = "ffmpeg -i video.mpg -loglevel panic -hide_banner -map 0:1 -c copy -f data -"
proc = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE)

q = Queue()

for char in iter(lambda: proc.stdout.read(1), b""):
    q.put(char)

Should I be using another data structure?

Answer 1:

  1. Reading data from the pipe one byte at a time is very inefficient. Read bigger chunks instead.

  2. Starting the subprocess and then exiting the parent without waiting for the child to finish causes a broken pipe, and the subprocess fails, as you noticed.

  3. subprocess.call(cmd, stdout=subprocess.PIPE) will block/stall the writer once the OS pipe buffer fills, i.e. when nobody reads from the pipe, as in your case.

  4. A Queue is fine as long as you don't put one byte at a time; put larger chunks instead.
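Putting these points together, here is a minimal sketch (the helper name stream_to_queue and the 64 KiB chunk size are my own choices, not from the question): a background thread drains stdout in large chunks into a Queue, then waits on the child.

```python
import queue
import shlex
import subprocess
import threading

CHUNK = 64 * 1024  # read 64 KiB at a time instead of one byte

def stream_to_queue(cmd):
    """Run cmd and drain its stdout into a Queue in large chunks.

    A background thread keeps the pipe drained so the OS buffer never
    fills and the child never blocks; a None sentinel marks end-of-stream.
    """
    proc = subprocess.Popen(shlex.split(cmd), stdout=subprocess.PIPE)
    q = queue.Queue()

    def reader():
        # read() returns b"" at EOF, which ends the loop
        for chunk in iter(lambda: proc.stdout.read(CHUNK), b""):
            q.put(chunk)
        proc.wait()  # reap the child instead of abandoning it mid-write
        q.put(None)  # sentinel: no more data

    threading.Thread(target=reader, daemon=True).start()
    return q

# Usage with the command from the question (assumes video.mpg exists):
# q = stream_to_queue("ffmpeg -i video.mpg -map 0:1 -c copy -f data -")
# while (chunk := q.get()) is not None:
#     process(chunk)
```

The consuming thread just calls q.get() in a loop until it sees the None sentinel; because each item is a 64 KiB chunk rather than a single byte, the per-item Queue overhead disappears.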