I am using libswresample to resample from any PCM format to 44.1 kHz, 16-bit int, stereo.
While analyzing the volume of the resulting audio stream, I found that when the source is 44.1 kHz, 16-bit int, mono, the output roughly follows this formula:
leftSample = sourceSample / sqrt(2);
rightSample = sourceSample / sqrt(2);
But I was expecting:
leftSample = sourceSample;
rightSample = sourceSample;
(If the source is stereo, I simply have leftSample = leftSourceSample and rightSample = rightSourceSample.)
My expectation comes from several sources:
- That is what my own straightforward solution would probably have done.
- I searched around a bit, and other people seem to do the same, e.g. here.
- A very common ReplayGain implementation (the only one I know of, actually; it is used basically everywhere and, I think, originally comes from mp3gain; one copy can be seen here) also does this:

      switch ( num_channels) {
      case  1: right_samples = left_samples;
      case  2: break;
      default: return GAIN_ANALYSIS_ERROR;
      }

  This is especially relevant because ReplayGain was calibrated with this implementation using a reference sound (pink noise, which can be downloaded here) that is in mono.
- The ReplayGain specification also calculates it this way (see here).
My confusion arose when I tried to implement ReplayGain myself and stumbled upon this.
So, some questions:
- Why does libswresample do this?
- Is this expected behavior in libswresample, or is it a bug? (I'm trying to understand it from the source (e.g. here), but I haven't fully understood it yet.) I filed a bug report here.
- What is the "right" solution?
- What do other players do?
- What does a typical sound card do if you feed it mono samples?
(I have now also posted this question on avp.stackexchange; maybe that is a better place to ask about this, I'm not sure.)