I am trying to open a sound file in R, but the load.wave() function complains that the file is "incomplete". The sound plays fine in several other audio programs (mplayer, Audacity, Praat, etc.), and the file command does not report it as any different from other WAV files that load without problems:
RIFF (little-endian) data, WAVE audio, Microsoft PCM, 16 bit, mono 22050 Hz
I know load.wave() internally calls a C function to process the data, but I don't know what that function is or what it does, so I can't see why it's complaining. The call from load.wave() is defined in R as .Call("load_wave_file", where, PACKAGE = "audio"), in which where is the path to the file.
Opening the sound in Audacity and saving it again as a WAV file generates an identical sounding file which can be opened in R without any problems.
However, the two files are considerably different. According to vbindiff, there are differences both in the header:
# Original file
0000 0000: 52 49 46 46 39 AE 02 00 57 41 56 45 66 6D 74 20 RIFF9... WAVEfmt
0000 0010: 12 00 00 00 01 00 01 00 22 56 00 00 44 AC 00 00 ........ "V..D...
0000 0020: 02 00 10 00 00 00 64 61 74 61 D4 AD 02 00 F9 FF ......da ta......
# Fixed file
0000 0000: 52 49 46 46 F8 AD 02 00 57 41 56 45 66 6D 74 20 RIFF.... WAVEfmt
0000 0010: 10 00 00 00 01 00 01 00 22 56 00 00 44 AC 00 00 ........ "V..D...
0000 0020: 02 00 10 00 64 61 74 61 D4 AD 02 00 FA FF F6 FF ....data ........
and throughout the file:
More interestingly, a chunk at the end of the original file has been removed:
# Original file
0002 ADF0: 5E 00 5D 00 5F 00 5F 00 5F 00 5F 00 5E 00 5D 00 ^.]._._. _._.^.].
0002 AE00: 5B 00 63 75 65 20 1C 00 00 00 01 00 00 00 01 00 [.cue .. ........
0002 AE10: 00 00 88 58 01 00 64 61 74 61 00 00 00 00 00 00 ...X..da ta......
0002 AE20: 00 00 88 58 01 00 4C 49 53 54 13 00 00 00 61 64 ...X..LI ST....ad
0002 AE30: 74 6C 6C 61 62 6C 07 00 00 00 01 00 00 00 52 54 tllabl.. ......RT
0002 AE40: 00
# Fixed file
0002 ADF0: 5E 00 5F 00 5F 00 5F 00 5F 00 5D 00 5F 00 59 00 ^._._._. _.]._.Y.
0002 AE00:
0002 AE10:
0002 AE20:
0002 AE30:
0002 AE40:
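To make the header differences concrete, here is a minimal sketch that decodes the first bytes of both files exactly as they appear in the dumps above. It is written in Python rather than R purely for illustration; the function name describe_header and the dictionary keys are made up for this example, and it assumes the simple layout seen here (a "fmt " chunk immediately followed by the "data" chunk).

```python
import struct

# Leading bytes of each file, copied from the hex dumps above.
ORIGINAL = bytes.fromhex(
    "52 49 46 46 39 AE 02 00 57 41 56 45 66 6D 74 20 "
    "12 00 00 00 01 00 01 00 22 56 00 00 44 AC 00 00 "
    "02 00 10 00 00 00 64 61 74 61 D4 AD 02 00"
)
FIXED = bytes.fromhex(
    "52 49 46 46 F8 AD 02 00 57 41 56 45 66 6D 74 20 "
    "10 00 00 00 01 00 01 00 22 56 00 00 44 AC 00 00 "
    "02 00 10 00 64 61 74 61 D4 AD 02 00"
)


def describe_header(header):
    """Decode RIFF/WAVE header fields (hypothetical helper, assumes fmt then data)."""
    riff_id, riff_size, wave_id = struct.unpack_from("<4sI4s", header, 0)
    fmt_id, fmt_size = struct.unpack_from("<4sI", header, 12)
    if (riff_id, wave_id, fmt_id) != (b"RIFF", b"WAVE", b"fmt "):
        raise ValueError("unexpected header layout")
    # The first 16 bytes of the fmt body are the same for every PCM file.
    tag, channels, rate, byte_rate, align, bits = struct.unpack_from(
        "<HHIIHH", header, 20
    )
    # The next chunk header starts right after the fmt body (16 or 18 bytes here).
    data_header = 20 + fmt_size
    data_id, data_size = struct.unpack_from("<4sI", header, data_header)
    if data_id != b"data":
        raise ValueError("expected the data chunk right after fmt")
    first_sample = data_header + 8
    return {
        "riff_size": riff_size,
        "fmt_size": fmt_size,
        "format_tag": tag,
        "channels": channels,
        "sample_rate": rate,
        "bits": bits,
        "first_sample": first_sample,
        "data_size": data_size,
        # Bytes the RIFF size field accounts for beyond the end of the samples.
        "trailing_bytes": riff_size + 8 - (first_sample + data_size),
    }


for name, header in (("original", ORIGINAL), ("fixed", FIXED)):
    print(name, describe_header(header))
```

Going by these bytes, both files declare the same amount of sample data. The original has an 18-byte "fmt " chunk (two extra zero bytes) where the fixed file has 16, and its RIFF size covers 63 bytes after the samples where the fixed file has none.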
1. What is wrong with this file that prevents me from opening it?
2. What is the data at the end of the original file? (See below)
I know there are multiple audio processing programs out there which are rather liberal with the WAV spec, so this type of problem is not uncommon. I just want to figure out what is going on, to maybe implement a fix (which doesn't require me to fire up Audacity) and to prevent it from happening again in the future.
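As a possible starting point for such a fix, here is a rough Python sketch that rewrites a WAV image keeping only a 16-byte "fmt " chunk and the "data" chunk, which are the two structural properties of the Audacity-saved file visible in the dumps. This rests on an untested assumption: that load.wave() accepts that layout for reasons of structure rather than content. The function name rewrite_minimal_wav and the file names in the usage comment are hypothetical, and the sketch only handles plain PCM.

```python
import struct


def rewrite_minimal_wav(raw):
    """Return a WAV image holding only a 16-byte 'fmt ' chunk and the 'data' chunk."""
    if raw[:4] != b"RIFF" or raw[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    fmt_body = data_body = None
    pos = 12
    while pos + 8 <= len(raw):
        chunk_id, size = struct.unpack_from("<4sI", raw, pos)
        body = raw[pos + 8 : pos + 8 + size]
        if chunk_id == b"fmt ":
            fmt_body = body[:16]  # drop anything after the 16 PCM fields
        elif chunk_id == b"data":
            data_body = body
        # Other chunks (cue, LIST, ...) are skipped and therefore dropped.
        pos += 8 + size + (size & 1)  # RIFF chunks are padded to even length
    if fmt_body is None or len(fmt_body) < 16 or data_body is None:
        raise ValueError("missing or short 'fmt ' chunk, or no 'data' chunk")
    if struct.unpack_from("<H", fmt_body, 0)[0] != 1:
        raise ValueError("only plain PCM (format tag 1) is handled here")
    pad = b"\x00" if len(data_body) & 1 else b""
    payload = (
        b"WAVE"
        + b"fmt " + struct.pack("<I", 16) + fmt_body
        + b"data" + struct.pack("<I", len(data_body)) + data_body + pad
    )
    return b"RIFF" + struct.pack("<I", len(payload)) + payload


# Hypothetical usage (file names are placeholders):
# with open("original.wav", "rb") as src:
#     fixed = rewrite_minimal_wav(src.read())
# with open("rewritten.wav", "wb") as dst:
#     dst.write(fixed)
```

Unlike the Audacity round trip, this leaves the sample values untouched, so if the rewritten file loads in R, the problem lies in the chunk layout rather than in the audio data.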
Update:
This chunk seems to be a "Cue-Points Chunk", as explained here:
The cue-points chunk identifies a series of positions in the waveform data stream.
I guess this means it should be harmless, but is that what's causing the problem?
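To check that reading of the trailing bytes, here is a small Python sketch that walks them as RIFF chunks. The bytes are copied from the original-file dump above, starting at offset 0002 AE02 where "cue " begins. The helper names walk_chunks and decode_cue are made up for this example, and the cue-point field layout (six 32-bit fields of 4 bytes each, one of them a chunk ID) is my reading of the cue chunk description, not something taken from the audio package.

```python
import struct

# Bytes 0x2AE02 .. 0x2AE40 of the original file, copied from the dump above.
TAIL = bytes.fromhex(
    "63 75 65 20 1C 00 00 00 01 00 00 00 01 00 "
    "00 00 88 58 01 00 64 61 74 61 00 00 00 00 00 00 "
    "00 00 88 58 01 00 4C 49 53 54 13 00 00 00 61 64 "
    "74 6C 6C 61 62 6C 07 00 00 00 01 00 00 00 52 54 "
    "00"
)


def walk_chunks(buf):
    """Yield (chunk_id, body) for each RIFF chunk in buf."""
    pos = 0
    while pos + 8 <= len(buf):
        chunk_id, size = struct.unpack_from("<4sI", buf, pos)
        yield chunk_id, buf[pos + 8 : pos + 8 + size]
        pos += 8 + size + (size & 1)  # odd-sized chunks are followed by a pad byte


def decode_cue(body):
    """Decode a 'cue ' chunk body: a count followed by 24-byte cue points."""
    (count,) = struct.unpack_from("<I", body, 0)
    points = []
    for i in range(count):
        ident, position, fcc, chunk_start, block_start, sample_offset = (
            struct.unpack_from("<II4sIII", body, 4 + 24 * i)
        )
        points.append({"id": ident, "chunk": fcc, "sample_offset": sample_offset})
    return points


chunks = dict(walk_chunks(TAIL))
cue_points = decode_cue(chunks[b"cue "])

# The LIST body starts with a 4-byte list type, followed by sub-chunks.
list_type = chunks[b"LIST"][:4]
sub_chunks = dict(walk_chunks(chunks[b"LIST"][4:]))
label_cue_id = struct.unpack_from("<I", sub_chunks[b"labl"], 0)[0]
label_text = sub_chunks[b"labl"][4:].rstrip(b"\x00").decode("ascii")

print(cue_points, list_type, label_cue_id, label_text)
```

Read this way, the tail is one cue point at sample 88200 (4 seconds at 22050 Hz) plus a LIST chunk of type "adtl" that labels it "RT". One thing the bytes also show: the LIST chunk declares 19 bytes, an odd length, and the file ends right after it with no pad byte, even though RIFF chunks are normally padded to an even length. Whether that, or the cue data itself, is what load.wave() objects to is not something this sketch establishes.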