Finding the 'volume' of a .wav at a given

Posted 2019-05-07 05:47

I am working on a small example application for my fourth year project (dealing with Functional Reactive Programming). The idea is to create a simple program that can play a .wav file and then show a 'bouncing' animation of the current volume of the playing song (like in audio recording software). I'm building this in Scala, so I have mainly been looking at Java libraries and existing solutions.

Currently, I have managed to play a .wav file easily, but I can't seem to achieve the second goal. Basically, is there a way to decode a .wav file so that I have some way of accessing the 'volume' at any given time? By volume I think I mean its amplitude, but I may be wrong about this - Higher Physics was a while ago...

Clearly, I don't know much about this at all so it would be great if someone could point me in the right direction!

2 Answers
淡お忘
#2 · 2019-05-07 06:03

In digital audio processing you typically refer to the momentary peak amplitude of the signal (this is also called PPM -- peak programme metering). Depending on how accurate you want to be or if you wish to model some standardised metering or not, you could either

  • just use a sliding window of sample frames (find the maximum absolute value per window)
  • implement some sort of peak-hold mechanism that retains the last peak value for a given duration and then lets the value 'fall' by a given amount of decibels per second.
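Both options above can be combined in a few lines. The following is only a rough sketch in plain Java (callable from Scala): the class name `PeakMeter`, its methods, and the hold/fall parameters are made up for illustration, and samples are assumed to be already decoded to floats in the range -1..1.

```java
/** Hypothetical peak meter: windowed peak detection plus peak-hold with a dB-per-second fall. */
final class PeakMeter {
    private final double holdSeconds;      // how long a peak is retained before it starts to fall
    private final double fallDbPerSecond;  // fall rate once the hold time has elapsed
    private double heldPeak = 0.0;         // currently displayed peak (linear amplitude)
    private double heldFor = 0.0;          // seconds since the displayed peak was last set

    PeakMeter(double holdSeconds, double fallDbPerSecond) {
        this.holdSeconds = holdSeconds;
        this.fallDbPerSecond = fallDbPerSecond;
    }

    /** Maximum absolute sample value in samples[from, from + length). */
    static double windowPeak(float[] samples, int from, int length) {
        double peak = 0.0;
        int end = Math.min(samples.length, from + length);
        for (int i = from; i < end; i++) {
            peak = Math.max(peak, Math.abs(samples[i]));
        }
        return peak;
    }

    /** Feed the peak of one window (lasting windowSeconds); returns the value to display. */
    double update(double windowPeak, double windowSeconds) {
        if (windowPeak >= heldPeak) {
            // A new maximum replaces the held value and restarts the hold timer.
            heldPeak = windowPeak;
            heldFor = 0.0;
        } else {
            heldFor += windowSeconds;
            if (heldFor > holdSeconds) {
                // Convert the dB drop for this window into a linear gain factor.
                double dropDb = fallDbPerSecond * windowSeconds;
                heldPeak = Math.max(windowPeak, heldPeak * Math.pow(10.0, -dropDb / 20.0));
            }
        }
        return heldPeak;
    }
}
```

You would call `windowPeak` once per window (say every 20-50 ms of audio) and pass the result to `update`. The hold time and fall rate here are free parameters, not values taken from any metering standard.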

The other measuring mode is RMS, which is calculated by integrating over a certain time window (add the squared sample values, divide by the window length, and take the square root, thus root-mean-square, RMS). This gives a better idea of the 'energy' of the signal, moving more smoothly than peak measurements, but not capturing the maximum values observed. This mode is sometimes called VU meter as well. You can approximate this with a sort of lagging (lowpass) filter, e.g. y[i] = y[i-1]*a + |x[i]|*(1-a), for some value 0 < a < 1.
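As a minimal sketch of the two calculations just described (again plain Java, with the class and method names invented for this example and samples assumed to be floats in -1..1): `rms` is the literal root-mean-square over a window, and `smoothedEnvelope` is the one-pole lagging filter. Note that the filter tracks the average rectified level, so it only approximates the RMS value.

```java
/** Hypothetical helpers for RMS and a lagging (one-pole lowpass) level estimate. */
final class RmsLevel {
    /** Root-mean-square of samples[from, from + length). */
    static double rms(float[] samples, int from, int length) {
        int end = Math.min(samples.length, from + length);
        int n = end - from;
        if (n <= 0) return 0.0;
        double sumSquares = 0.0;
        for (int i = from; i < end; i++) {
            sumSquares += (double) samples[i] * samples[i];
        }
        return Math.sqrt(sumSquares / n);
    }

    /** y[i] = a * y[i-1] + (1 - a) * |x[i]|, with y[-1] taken as 0 and 0 < a < 1. */
    static double[] smoothedEnvelope(float[] samples, double a) {
        double[] y = new double[samples.length];
        double prev = 0.0;
        for (int i = 0; i < samples.length; i++) {
            prev = a * prev + (1.0 - a) * Math.abs(samples[i]);
            y[i] = prev;
        }
        return y;
    }
}
```

A value of `a` closer to 1 gives a slower, smoother meter; a smaller value reacts faster but jitters more.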

You typically display the values logarithmically, i.e. in decibels, as this corresponds better with our perception of signal strength and also for most signals produces a more regular coverage of your screen space.
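For the decibel conversion, something along these lines should do. It is a sketch only: the -60 dB floor and the names `Decibels`, `toDb` and `meterPosition` are assumptions picked for the example, and amplitude 1.0 is treated as full scale (0 dB).

```java
/** Hypothetical conversion from linear amplitude to a decibel-scaled meter position. */
final class Decibels {
    static final double FLOOR_DB = -60.0; // assumed bottom of the meter; anything quieter is clipped

    /** 20 * log10(amplitude), clipped at FLOOR_DB; amplitude 1.0 maps to 0 dB. */
    static double toDb(double amplitude) {
        if (amplitude <= 0.0) return FLOOR_DB;
        return Math.max(FLOOR_DB, 20.0 * Math.log10(amplitude));
    }

    /** Maps an amplitude to 0..1 on a meter that is linear in decibels. */
    static double meterPosition(double amplitude) {
        double position = (toDb(amplitude) - FLOOR_DB) / -FLOOR_DB;
        return Math.min(1.0, position);
    }
}
```

With this mapping, halving the amplitude moves the bar down by about 6 dB, i.e. a tenth of the meter, instead of half of it.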

Three projects I'm involved with may help you:

  • ScalaAudioFile which you can use to read the sample frames from an AIFF or WAVE file
  • ScalaAudioWidgets which is a still young and incomplete project to provide some audio application widgets on top of scala-swing, including a PPM view -- just use a sliding window and set the window's current peak value (and optionally RMS) at a regular interval, and the view will take care of peak-hold and fall times
  • (ScalaCollider, a client for the SuperCollider sound synthesis system, which you might use to play back the sound file and measure the peak and RMS amplitudes in real time. The latter is probably overkill for your project and would involve a serious learning curve if you have never heard of SuperCollider. The advantage would be that you don't need to worry about synchronising your sound playback with the meter display)
SAY GOODBYE
#3 · 2019-05-07 06:27

In a wav file, the data at a given point in the stream IS the volume (shifted by half of the dynamic range). In other words, if you know the type of wav file (for example 8 bit, mono), each byte represents a single sample. If you know the sample rate (say 44100 Hz), then multiply the time in seconds by 44100 and that is the index of the byte you want to look at, counting from the start of the audio data (after the file header).

The value of the byte is the volume (distance from the middle: 0 and 255 are the peaks, 128 is zero). This assumes that the encoding is not mu-law. I found some good info on how to tell the difference, or better yet, convert between these formats, here:

http://www.gnu.org/software/octave/doc/interpreter/Audio-Processing.html

You may want to average these samples though over a window of some fixed number of samples.
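Put together, the lookup and averaging might look roughly like this. It is a sketch for the 8-bit, mono, unsigned PCM case only: `pcm` is assumed to hold just the audio data with the WAV header already stripped (for example the bytes read from a `javax.sound.sampled.AudioInputStream`), and the class and method names are made up.

```java
/** Hypothetical helpers for 8-bit unsigned mono PCM data (WAV header already removed). */
final class EightBitPcm {
    /** Index of the sample at the given time; one byte per sample for 8-bit mono. */
    static int sampleIndex(double seconds, int sampleRate) {
        return (int) Math.floor(seconds * sampleRate);
    }

    /** Signed distance from the middle, scaled to -1..1 (128 is the zero level). */
    static double amplitude(byte raw) {
        // Java bytes are signed, so mask to recover the unsigned 0..255 value first.
        return ((raw & 0xFF) - 128) / 128.0;
    }

    /** Mean absolute amplitude over pcm[start, start + length). */
    static double averageLevel(byte[] pcm, int start, int length) {
        int end = Math.min(pcm.length, start + length);
        int n = end - start;
        if (n <= 0) return 0.0;
        double sum = 0.0;
        for (int i = start; i < end; i++) {
            sum += Math.abs(amplitude(pcm[i]));
        }
        return sum / n;
    }
}
```

For the level half a second into a 44100 Hz file you would call something like `averageLevel(pcm, sampleIndex(0.5, 44100), 1024)`. For 16-bit or stereo files the byte offset and the decoding differ, so this would need adapting.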
