I need to extract the samples of a single channel from a wav file that will contain up to 12 (11.1 format) channels. I know that within a normal stereo file samples are interleaved, first left, and then right, like so,
[1st L] [1st R] [2nd L] [2nd R]...
So, to read the left channel I'd do this,
for (var i = 0; i < myByteArray.Length; i += (bitDepth / 8) * 2)
{
    // Get bytes and convert to actual samples.
}
And to get the right channel I'd simply start the loop at for (var i = (bitDepth / 8)...
But, what order is used for files with more than 2 channels?
Microsoft have created a standard that covers up to 18 channels. According to them, the wav file needs to have a special meta sub-chunk (under the "Extensible Format" section) that specifies a "channel mask" (dwChannelMask). This field is 4 bytes long (a uint) and contains one bit for each channel that is present, therefore indicating which of the 18 channels are used within the file.
The Master Channel Layout
Below is the MCL, that is, the order in which existing channels should be interleaved, along with the bit value for each channel. If a channel is not present, the next channel that is there will "drop down" into the place of the missing channel and its order number will be used instead, but never the bit value. (Bit values are unique to each channel regardless of the channel's existence),
Order | Bit | Channel
1. 0x1 Front Left
2. 0x2 Front Right
3. 0x4 Front Center
4. 0x8 Low Frequency (LFE)
5. 0x10 Back Left (Surround Back Left)
6. 0x20 Back Right (Surround Back Right)
7. 0x40 Front Left of Center
8. 0x80 Front Right of Center
9. 0x100 Back Center
10. 0x200 Side Left (Surround Left)
11. 0x400 Side Right (Surround Right)
12. 0x800 Top Center
13. 0x1000 Top Front Left
14. 0x2000 Top Front Center
15. 0x4000 Top Front Right
16. 0x8000 Top Back Left
17. 0x10000 Top Back Center
18. 0x20000 Top Back Right
For example, if the channel mask is 0x63F (1599), this would indicate that the file contains 8 channels (FL, FR, FC, LFE, BL, BR, SL & SR).
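To make the 0x63F example concrete, here is a minimal sketch that walks the 18 MCL bits and lists the channels that are set, in the order they would be interleaved. The class name ChannelMaskDemo, the method DecodeMask and the abbreviations in MclNames are made up for illustration; they are not part of any Microsoft API.

```csharp
using System.Collections.Generic;

public static class ChannelMaskDemo
{
    // Channel abbreviations in MCL order; index n corresponds to bit (1 << n).
    private static readonly string[] MclNames =
    {
        "FL", "FR", "FC", "LFE", "BL", "BR", "FLoC", "FRoC", "BC",
        "SL", "SR", "TC", "TFL", "TFC", "TFR", "TBL", "TBC", "TBR"
    };

    // Returns the channels flagged in the mask, in interleave order.
    public static List<string> DecodeMask(uint mask)
    {
        var present = new List<string>();
        for (var bit = 0; bit < MclNames.Length; bit++)
        {
            if ((mask & (1u << bit)) != 0)
            {
                present.Add(MclNames[bit]);
            }
        }
        return present;
    }
}
```

Calling ChannelMaskDemo.DecodeMask(0x63F) yields the 8 entries FL, FR, FC, LFE, BL, BR, SL, SR.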
Reading and checking the Channel Mask
To get the mask, you'll need to read the 40th, 41st, 42nd and 43rd byte (assuming a base index of 0, and you're reading a standard wav header). For example,
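// Requires: using System; using System.IO;
var bytes = new byte[44];
using (var stream = new FileStream("filepath...", FileMode.Open, FileAccess.Read))
{
    // Read may return fewer bytes than requested, so check the count.
    if (stream.Read(bytes, 0, bytes.Length) < bytes.Length)
    {
        throw new InvalidDataException("File is too short to contain a channel mask.");
    }
}
// dwChannelMask occupies bytes 40-43 and is stored little-endian, which is what BitConverter reads on little-endian machines.
var speakerMask = BitConverter.ToUInt32(bytes, 40);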
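Bytes 40 to 43 only hold a channel mask when the header really is in the extensible format. As a rough sketch, and assuming the "fmt " chunk directly follows the RIFF/WAVE header (so the format tag sits at bytes 20 and 21), you could check for the extensible format tag 0xFFFE before trusting the mask. WavHeaderSketch and TryReadChannelMask are hypothetical names, and the stream is assumed to be seekable.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeaderSketch
{
    // Format tag that marks a WAVE_FORMAT_EXTENSIBLE header.
    private const ushort WaveFormatExtensible = 0xFFFE;

    // Returns the channel mask, or null when the header is too short
    // or is not in the extensible format.
    // Assumption: canonical layout with the "fmt " chunk starting at byte 12.
    public static uint? TryReadChannelMask(Stream stream)
    {
        if (stream.Length < 44)
        {
            return null;
        }
        // BinaryReader always reads little-endian, matching the wav format.
        using (var reader = new BinaryReader(stream, Encoding.ASCII, leaveOpen: true))
        {
            stream.Seek(20, SeekOrigin.Begin);
            var formatTag = reader.ReadUInt16(); // bytes 20-21
            if (formatTag != WaveFormatExtensible)
            {
                return null;
            }
            stream.Seek(40, SeekOrigin.Begin);
            return reader.ReadUInt32(); // bytes 40-43
        }
    }
}
```

A null result would be the case where you have to build a mask yourself from the channel count.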
Then, you need to check if the desired channel actually exists. To do this, I'd suggest creating an enum (defined with [Flags]) that contains all the channels (and their respective values).
[Flags]
public enum Channels : uint
{
    FrontLeft = 0x1,
    FrontRight = 0x2,
    FrontCenter = 0x4,
    Lfe = 0x8,
    BackLeft = 0x10,
    BackRight = 0x20,
    FrontLeftOfCenter = 0x40,
    FrontRightOfCenter = 0x80,
    BackCenter = 0x100,
    SideLeft = 0x200,
    SideRight = 0x400,
    TopCenter = 0x800,
    TopFrontLeft = 0x1000,
    TopFrontCenter = 0x2000,
    TopFrontRight = 0x4000,
    TopBackLeft = 0x8000,
    TopBackCenter = 0x10000,
    TopBackRight = 0x20000
}
And then finally check if the channel is present.
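A minimal sketch of that check: mask the speaker mask with the channel's bit and compare. The enum here is an abbreviated copy of Channels, included only so the snippet compiles on its own, and ChannelCheck and HasChannel are hypothetical names.

```csharp
using System;

// Abbreviated copy of the Channels enum, only so this sketch is self-contained.
[Flags]
public enum Channels : uint
{
    FrontLeft = 0x1,
    BackCenter = 0x100,
    SideLeft = 0x200
}

public static class ChannelCheck
{
    // True when the channel's bit is set in the speaker mask.
    public static bool HasChannel(uint speakerMask, Channels channel)
    {
        return (speakerMask & (uint)channel) == (uint)channel;
    }
}
```

With the mask 0x63F, HasChannel returns true for Channels.SideLeft and false for Channels.BackCenter.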
What if the Channel Mask doesn't exist?
Create one yourself! Based on the file's channel count you will either have to guess which channels are used, or just blindly follow the MCL. In the below code snippet we're doing a bit of both,
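// Requires: using System; using System.Linq;
public static uint GetSpeakerMask(int channelCount)
{
    // Assume setup of: FL, FR, FC, LFE, BL, BR, SL & SR. Otherwise MCL will use: FL, FR, FC, LFE, BL, BR, FLoC & FRoC.
    if (channelCount == 8)
    {
        return 0x63F;
    }
    // Otherwise follow MCL: set the bits of the first channelCount channels.
    var channels = Enum.GetValues(typeof(Channels)).Cast<uint>().ToArray();
    if (channelCount < 0 || channelCount > channels.Length)
    {
        throw new ArgumentOutOfRangeException(nameof(channelCount));
    }
    uint mask = 0;
    for (var i = 0; i < channelCount; i++)
    {
        mask |= channels[i];
    }
    return mask;
}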
Extracting the samples
To actually read samples of a particular channel, you do exactly the same as if the file were stereo, that is, you increment your loop's counter by frame size (in bytes).
frameSize = (bitDepth / 8) * channelCount
You also need to offset your loop's starting index. This is where things become more complicated, as you have to start reading data from the channel's order number based on existing channels, minus one (byte offsets start at 0), times byte depth.
What do I mean "based on existing channels"? Well, you need to reassign the existing channels' order number from 1, incrementing the order for each channel that is present. For example, the channel mask 0x63F indicates that the FL, FR, FC, LFE, BL, BR, SL & SR channels are used, therefore the new channel order numbers for the respective channels would look like this (note, the bit values are not and should not ever be changed),
Order | Bit | Channel
1. 0x1 Front Left
2. 0x2 Front Right
3. 0x4 Front Center
4. 0x8 Low Frequency (LFE)
5. 0x10 Back Left (Surround Back Left)
6. 0x20 Back Right (Surround Back Right)
7. 0x200 Side Left (Surround Left)
8. 0x400 Side Right (Surround Right)
You'll notice that FLoC, FRoC & BC are all missing, therefore the SL & SR channels "drop down" into the next lowest available order numbers (7 and 8), rather than using their default order numbers (10 and 11).
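The renumbering can be sketched as counting how many lower bits are set in the mask. This is only an illustration: ChannelOffsetSketch and InterleavedIndex are hypothetical names, and the result is zero-based (the order number minus one), so it can be multiplied directly by the byte depth.

```csharp
using System;

public static class ChannelOffsetSketch
{
    // Zero-based position of a channel among the channels present in the mask,
    // or -1 when the channel is absent. channelBit must have exactly one bit set.
    public static int InterleavedIndex(uint speakerMask, uint channelBit)
    {
        if (channelBit == 0 || (channelBit & (channelBit - 1)) != 0)
        {
            throw new ArgumentException("Exactly one bit must be set.", nameof(channelBit));
        }
        if ((speakerMask & channelBit) == 0)
        {
            return -1;
        }
        // Count the present channels that come before this one in the MCL.
        var index = 0;
        for (uint bit = 1; bit < channelBit; bit <<= 1)
        {
            if ((speakerMask & bit) != 0)
            {
                index++;
            }
        }
        return index;
    }
}
```

For the mask 0x63F, Side Left (0x200) gets index 6. In a 24-bit file that gives a starting offset of 6 * 3 = 18 bytes and a frame size of 3 * 8 = 24 bytes.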
Summing up
So, to read the bytes of a single channel you'd need to do something similar to this,
// This code will only return the bytes of a particular channel. It's up to you to convert the bytes to actual samples.
// audioBytes must hold only the sample data (no header); sampleEndIndex is exclusive.
public static byte[] GetChannelBytes(byte[] audioBytes, uint speakerMask, Channels channelToRead, int bitDepth, uint sampleStartIndex, uint sampleEndIndex)
{
    var channels = FindExistingChannels(speakerMask);
    var ch = GetChannelNumber(channelToRead, channels);
    if (ch < 0)
    {
        throw new ArgumentException("The requested channel is not present in the speaker mask.", nameof(channelToRead));
    }
    var byteDepth = bitDepth / 8;
    var chOffset = ch * byteDepth; // Byte offset of the channel within each frame.
    var frameBytes = byteDepth * channels.Length;
    var startByteIncIndex = (long)sampleStartIndex * frameBytes;
    var endByteIncIndex = (long)sampleEndIndex * frameBytes;
    var outputBytes = new byte[(endByteIncIndex - startByteIncIndex) / channels.Length];
    var i = 0;
    // Visit the requested channel's sample in each frame and copy its bytes.
    for (var j = startByteIncIndex + chOffset; j < endByteIncIndex; j += frameBytes)
    {
        for (var k = j; k < j + byteDepth; k++)
        {
            outputBytes[i] = audioBytes[k];
            i++;
        }
    }
    return outputBytes;
}
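// Requires: using System; using System.Collections.Generic;
// Returns the channels whose bits are set in the mask, in MCL (interleave) order.
private static Channels[] FindExistingChannels(uint speakerMask)
{
    var foundChannels = new List<Channels>();
    foreach (var ch in Enum.GetValues(typeof(Channels)))
    {
        if ((speakerMask & (uint)ch) == (uint)ch)
        {
            foundChannels.Add((Channels)ch);
        }
    }
    return foundChannels.ToArray();
}
// Returns the zero-based position of the channel among the existing channels, or -1 if it is absent.
private static int GetChannelNumber(Channels input, Channels[] existingChannels)
{
    for (var i = 0; i < existingChannels.Length; i++)
    {
        if (existingChannels[i] == input)
        {
            return i;
        }
    }
    return -1;
}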