I'm trying to grab samples from a stream of audio and put them in a shared Queue. I have another process that pulls from this queue.
When I run it, I get this error:
* recording
Traceback (most recent call last):
  File "record.py", line 43, in <module>
    data = stream.read(CHUNK)
  File "/Library/Python/2.7/site-packages/pyaudio.py", line 605, in read
    return pa.read_stream(self._stream, num_frames)
IOError: [Errno Input overflowed] -9981
EDIT: Apparently this problem has been around for a while with no solution posted (I tried their suggestions):
- Geting IOError: [Errno Input overflowed] -9981 when setting PyAudio Stream input and output to True
- https://github.com/jeysonmc/python-google-speech-scripts/issues/1
Here's the (simplified) code:
import pyaudio
import wave
import array
import time
from multiprocessing import Queue, Process

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 44100
RECORD_SECONDS = 2

p = pyaudio.PyAudio()
left = Queue()
right = Queue()

def other(q1, q2):
    while True:
        try:
            a = q1.get(False)
        except Exception:
            pass
        try:
            b = q2.get(False)
        except Exception:
            pass

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")
Process(target=other, args=(left, right)).start()

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    byte_string = ''.join(data)
    nums = array.array('h', byte_string)
    for elt in nums[1::2]:
        left.put(elt)
    for elt in nums[0::2]:
        right.put(elt)

print("* done recording")
stream.stop_stream()
stream.close()
print("terminated")
What am I doing wrong? I'm on Mac OS X with Python 2.7. I installed portaudio through homebrew and tried both the pip and dmg installations of `pyaudio`, with no luck with either.
The input overflow error may be because your frames_per_buffer and read chunk size are too small. Try other values for them, e.g. 512, 2048, 4096, 8192, etc.
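To get a feel for why a larger buffer can help, here is a small sketch that computes how much audio time one chunk covers at the question's sample rate. The helper name `chunk_duration_ms` is made up for illustration, and the result is plain arithmetic rather than anything PyAudio reports.

```python
RATE = 44100  # samples per second, as in the question


def chunk_duration_ms(frames_per_buffer, rate=RATE):
    """Return how many milliseconds of audio one buffer of this size holds."""
    return 1000.0 * frames_per_buffer / rate


if __name__ == "__main__":
    for frames in (1024, 2048, 4096, 8192):
        print("%5d frames -> %.1f ms per read" % (frames, chunk_duration_ms(frames)))
```

Roughly speaking, if the work done between two `stream.read` calls regularly takes longer than this, the input side has more data waiting than it can hold, so a bigger chunk buys more slack per iteration.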
Update
OK, I think I got it. The pyaudio library will raise an overflow error if you wait too long before your next read.
Your recording loop does a whole lot of very slow processing between reads, especially the two for loops, which run in pure Python. Let the processing process get a whole chunk of mono-stream data that it can deal with, instead of one int at a time.
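A minimal sketch of that idea, using only the standard `array` module: split each interleaved stereo chunk once with slicing and put one object per channel on the queue. The helper names `split_channels` and `enqueue_chunk` are hypothetical, and the odd/even channel mapping simply mirrors the question's code.

```python
import array


def split_channels(data):
    """Split interleaved 16-bit stereo bytes into two mono arrays."""
    nums = array.array('h', data)
    # Same mapping as the question: odd indices -> left, even indices -> right
    return nums[1::2], nums[0::2]


def enqueue_chunk(data, left_q, right_q):
    """Put one whole mono chunk per channel on the queues (one put each)."""
    left, right = split_channels(data)
    left_q.put(left)
    right_q.put(right)
```

In the recording loop this would replace the two for loops with something like `enqueue_chunk(stream.read(CHUNK), left, right)`, so each iteration does two `put` calls instead of 2048.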
I don't even want to imagine what the for loops were doing to latency, but even the relatively minor performance improvement between using array and np.array is worth noting:
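For comparison, a NumPy version of the same channel split might look like the sketch below. It assumes NumPy is installed and that the stream delivers interleaved 16-bit samples; the function name `split_channels_np` is made up for illustration, and no timings are claimed here.

```python
import numpy as np


def split_channels_np(data, channels=2):
    """Split interleaved 16-bit audio bytes into per-channel NumPy views."""
    samples = np.frombuffer(data, dtype=np.int16)
    # One row per frame, one column per channel
    frames = samples.reshape(-1, channels)
    # Same mapping as the question: column 1 -> left, column 0 -> right
    return frames[:, 1], frames[:, 0]
```

The returned arrays are read-only views onto the original bytes, so nothing is copied until a chunk is pickled for the queue.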