How do you remove "popping" and "clicking" sounds in audio constructed by concatenating tonal sound clips together?
I have this PyAudio code for generating a series of tones:

    # Python 2 code (xrange, byte strings built with chr)
    import time
    import math
    import pyaudio

    class Beeper(object):

        def __init__(self, **kwargs):
            self.bitrate = kwargs.pop('bitrate', 16000)
            self.channels = kwargs.pop('channels', 1)
            self._p = pyaudio.PyAudio()
            self.stream = self._p.open(
                format = self._p.get_format_from_width(1),  # 1 byte per sample (8-bit)
                channels = self.channels,
                rate = self.bitrate,
                output = True,
            )
            self._queue = []

        def __enter__(self):
            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            self.stream.stop_stream()
            self.stream.close()

        def tone(self, frequency, length=1000, play=False, **kwargs):
            number_of_frames = int(self.bitrate * length/1000.)
            ##TODO:fix pops?
            # Every tone restarts at x = 0, regardless of where the previous tone ended.
            for x in xrange(number_of_frames):
                self._queue.append(chr(int(math.sin(x/((self.bitrate/float(frequency))/math.pi))*127+128)))

        def play(self):
            # Play all queued tones as one concatenated byte string.
            sound = ''.join(self._queue)
            self.stream.write(sound)
            time.sleep(0.1)

    with Beeper(bitrate=88000, channels=2) as beeper:
        i = 0
        for f in xrange(1000, 800-1, int(round(-25/2.))):
            i += 1
            length = math.log(i+1) * 250/2./2.
            beeper.tone(frequency=f, length=length)
        beeper.play()

but when the tone changes, there's a distinct "pop" in the audio, and I'm not sure how to remove it.
At first, I thought the pop was occurring because I was immediately playing each clip, and the time between each playback when I generate the clip was enough of a delay to cause the audio to flatline. However, when I concatenated all the clips into a single string and played that, the pop was still there.
Then, I thought the sine-waves weren't matching at the boundaries for each clip, so I tried to average the first N frames of the current audio clip with the last N frames of the previous clip, but that also had no effect.
What am I doing wrong? How do I fix this?
If you are concatenating clips with varying attributes, you may hear a clicking sound when the peaks of the two clips do not align at the point of concatenation.
One way around this is to fade out the end of the first signal and then fade in the beginning of the second, and to continue this pattern through the rest of the concatenation. Check the documentation on fading for details.
I would try out the concatenation in a visual tool like Audacity: apply Fade-out and Fade-in to the clips you want to join, and play around with the timing and settings until you get the desired result.
Next, I am not sure PyAudio has an easy way of implementing fading. However, if you can, you may want to try pydub, which provides easy ways to manipulate audio. It has both fade-in and fade-out methods as well as a cross-fade method, which basically performs the fade-out and fade-in in one step.
You can install pydub with: pip install pydub
Here is some sample code for pydub:
Finally, if you really want noise and pops cleared to a professional grade, you may want to look at PSOLA (Pitch Synchronous Overlap and Add). Here one would convert the audio signals to the frequency domain and then perform PSOLA on chunks to merge the audio with the minimum possible noise.
That was long, but I hope it helps.
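The fade-out/fade-in idea from this answer can also be applied by hand to raw samples. The following is only a sketch under stated assumptions: the samples are unsigned 8-bit mono with silence at 128 (the format the question generates), the ramp is linear, and `apply_fades` is a hypothetical helper name, not part of PyAudio or pydub.

```python
def apply_fades(samples, fade_len):
    """Linearly fade unsigned 8-bit samples in and out around the midpoint 128."""
    out = bytearray(samples)
    n = len(out)
    # Never let the fade-in and fade-out regions overlap.
    fade_len = min(fade_len, n // 2)
    for i in range(fade_len):
        gain = i / float(fade_len)  # 0.0 at the clip edge, rising toward 1.0
        out[i] = int(round(128 + (out[i] - 128) * gain))
        j = n - 1 - i               # mirror position at the end of the clip
        out[j] = int(round(128 + (out[j] - 128) * gain))
    return bytes(out)

# A constant full-scale clip makes the ramp easy to see.
clip = bytes([255] * 10)
faded = apply_fades(clip, 4)
```

Because every faded clip starts and ends at the silence level, concatenating such clips no longer produces a jump at the joins, at the cost of a brief dip in volume between tones.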
The answer you've written for yourself will do the trick but isn't really the correct way to do this type of thing.
One of the problems is that you are checking for the "tip" or peak of the sine wave by comparing against 1. Not every frequency produces a sample that hits exactly that value, and some may need a large number of cycles to do so.
Mathematically speaking, sin(theta) reaches its peak of 1 when theta = pi/2 + 2*pi*K, for any integer K.
To compute a sine of a given frequency, use the formula y = sin(2*pi * x * f0/fs), where x is the sample number, f0 is the sine frequency and fs is the sample rate.
For a nice number like 1 kHz at a 48 kHz sample rate, when x = 12 then y = sin(2*pi * 12 * 1000/48000) = sin(pi/2) = 1.
At a frequency like 997 Hz, however, the true peak falls a fraction of a sample after sample 12 (at x = fs/(4*f0), roughly 12.04), so no sample lands exactly on it.
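As a quick numeric check of the two cases above, the sketch below evaluates the formula at sample 12 and solves 2*pi * x * f0/fs = pi/2 for the fractional position of the first peak. The function names are made up for illustration.

```python
import math

def sine_sample(x, f0, fs):
    """Value of a sine of frequency f0 (Hz) at sample x for sample rate fs."""
    return math.sin(2.0 * math.pi * x * f0 / fs)

def first_peak_position(f0, fs):
    """Fractional sample index of the first peak: solve 2*pi*x*f0/fs = pi/2."""
    return fs / (4.0 * f0)

# Compare the "nice" 1 kHz case with 997 Hz at a 48 kHz sample rate.
for f0 in (1000, 997):
    print(f0, first_peak_position(f0, 48000), sine_sample(12, f0, 48000))
```

For 997 Hz the value at sample 12 is slightly below 1, which is why an exact comparison against 1 is unreliable.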
A better method of stitching the waveforms together is to keep track of the phase from one tone and use that as the starting phase for the next.
First, for a given frequency you need to work out the phase increment per sample. Notice that it is the same as what you are already computing, with the sample number factored out: phase_inc = 2*pi * f0/fs
Next, compute the sine and update a variable representing the current phase.
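These two steps could be sketched as a pair of small helpers. The names `phase_increment` and `next_sample` are hypothetical, and wrapping the phase into [0, 2*pi) is an added assumption to keep the number small over long runs; it does not change the output of the sine.

```python
import math

def phase_increment(f0, fs):
    """Radians the phase advances per sample for a tone of f0 Hz at fs samples/s."""
    return 2.0 * math.pi * f0 / fs

def next_sample(phase, phase_inc):
    """Return (sample, new_phase), with the new phase wrapped into [0, 2*pi)."""
    y = math.sin(phase)
    phase = (phase + phase_inc) % (2.0 * math.pi)
    return y, phase
```

When the frequency changes, only the increment changes; the phase variable keeps its value, so the next tone picks up exactly where the previous one stopped.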
Putting it all together:
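A minimal end-to-end sketch of the phase-tracking approach, written for Python 3. It assumes unsigned 8-bit mono output like the question's code; `render_tones` and its parameters are illustrative names rather than the answerer's original code.

```python
import math

def render_tones(tones, sample_rate=48000):
    """Render (frequency_hz, length_ms) pairs as unsigned 8-bit mono samples.

    The phase is carried across tone boundaries, so the waveform stays
    continuous when the frequency changes.
    """
    two_pi = 2.0 * math.pi
    phase = 0.0
    samples = bytearray()
    for frequency, length_ms in tones:
        # Only the increment depends on the frequency; the phase is not reset.
        phase_inc = two_pi * frequency / sample_rate
        for _ in range(int(sample_rate * length_ms / 1000.0)):
            samples.append(int(math.sin(phase) * 127 + 128))
            phase = (phase + phase_inc) % two_pi
    return bytes(samples)

audio = render_tones([(1000, 100), (975, 100), (950, 100)])
```

The resulting bytes could then be written to an 8-bit, single-channel output stream opened at the same sample rate. Since each tone continues from the previous tone's phase, no sample-to-sample jump at a boundary is larger than the jumps inside a tone.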
My initial suspicion that the individual waveforms weren't aligning was correct, which I confirmed by inspecting in Audacity. My solution was to modify the code to start and stop each waveform on the peak of the sine wave.