I'm building a cross-platform web app where audio is generated on-the-fly on the server and live streamed to a browser client, probably via the HTML5 audio element. On the browser, I'll have Javascript-driven animations that must precisely sync with the played audio. "Precise" means that the audio and animation must be within a second of each other, and hopefully within 250ms (think lip-syncing). For various reasons, I can't do the audio and animation on the server and live-stream the resulting video.
Ideally, there would be little or no latency between the audio generation on the server and the audio playback on the browser, but my understanding is that latency will be difficult to control and probably in the 3-7 second range (browser-, environment-, network- and phase-of-the-moon-dependent). I can handle that, though, if I can precisely measure the actual latency on-the-fly so that my browser Javascript knows when to present the proper animated frame.
So, I need to precisely measure the latency between my handing audio to the streaming server (Icecast?), and the audio coming out of the speakers on the computer hosting the browser. Some blue-sky possibilities:
Add metadata to the audio stream, and parse it from the playing audio (I understand this isn't possible using the standard audio element)
Add brief periods of pure silence to the audio, and then detect them on the browser (can audio elements yield the actual audio samples?)
Query the server and the browser as to the various buffer depths (the browser side is sketched below)
Decode the streamed audio in Javascript and then grab the metadata
Any thoughts as to how I could do this?
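For what it's worth, for the buffer-depth idea I can at least read the browser's side of things directly off the audio element; a rough sketch (the element id is hypothetical):

```javascript
// How far ahead of the playback position the browser has buffered.
// This covers only the browser's share of the latency, not the server's.
const audio = document.getElementById('stream'); // hypothetical element id

function bufferedAhead() {
  if (audio.buffered.length === 0) return 0;
  // End of the last buffered range minus the current playback position.
  return audio.buffered.end(audio.buffered.length - 1) - audio.currentTime;
}

setInterval(() => {
  console.log('client buffer depth (s):', bufferedAhead().toFixed(2));
}, 1000);
```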
Utilize the `timeupdate` event of the `<audio>` element, which is fired three to four times per second, to perform precise animations during streaming of media by checking `.currentTime` of the `<audio>` element; animations or transitions can be started or stopped several times per second this way.
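A minimal sketch of that wiring; the animation function is a placeholder:

```javascript
const audio = document.querySelector('audio');

audio.addEventListener('timeupdate', () => {
  // currentTime is the playback position in seconds,
  // reported roughly three to four times per second.
  renderFrameFor(audio.currentTime); // hypothetical animation function
});
```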
If available at the browser, you can use `fetch()` to request the audio resource, at `.then()` return `response.body.getReader()`, which returns a `ReadableStream` of the resource; create a new `MediaSource` object and set the `src` of the `<audio>` element, or of `new Audio()`, to an object URL of the `MediaSource`; append the first stream chunk, at `.read()` chained to `.then()`, to a `sourceBuffer` of the `MediaSource` with `.mode` set to `"sequence"`; append the remainder of the chunks to the `sourceBuffer` at subsequent `sourceBuffer` `updateend` events.
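A condensed sketch of that sequence, assuming a stream endpoint at /stream and a MIME type the browser's MediaSource implementation supports (both are placeholders; MSE codec support varies by browser):

```javascript
const audio = new Audio();
const mediaSource = new MediaSource();
audio.src = URL.createObjectURL(mediaSource);

mediaSource.addEventListener('sourceopen', () => {
  // The MIME/codec string must match what the server actually sends.
  const sourceBuffer = mediaSource.addSourceBuffer('audio/mpeg');
  sourceBuffer.mode = 'sequence';

  fetch('/stream') // placeholder URL
    .then(response => {
      const reader = response.body.getReader();
      const pump = ({ done, value }) => {
        if (done) {
          mediaSource.endOfStream();
          return;
        }
        // Read the next chunk only after the current append completes.
        sourceBuffer.addEventListener('updateend',
          () => reader.read().then(pump), { once: true });
        sourceBuffer.appendBuffer(value);
      };
      reader.read().then(pump);
    });
});

// Begin playback once adequate buffers have accumulated.
audio.addEventListener('canplay', () => audio.play());
```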
If `fetch()` with `response.body.getReader()` is not available at the browser, you can still use the `timeupdate` or `progress` event of the `<audio>` element to check `.currentTime`, and start or stop animations or transitions at the required second of the streaming media playback.
Use the `canplay` event of the `<audio>` element to play the media once the stream has accumulated adequate buffers at the `MediaSource` to proceed with playback.

To perform precise animations, you can use an object whose property names are numbers corresponding to the `.currentTime` of the `<audio>` element at which an animation should occur, and whose values describe the CSS property of the element which should be animated.

In the javascript at the plnkr below, animations occur at every twenty-second period, beginning at 0, and at every sixty seconds, until the media playback has concluded.

plnkr http://plnkr.co/edit/fIm1Qp?p=preview
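A condensed version of that pattern, with placeholder selector, times, and CSS values (the full example is at the plnkr):

```javascript
// Keys are .currentTime values in seconds; values are the CSS to apply.
const animations = {
  0:  { transform: 'translateX(0px)' },
  20: { transform: 'translateX(200px)' },
  60: { opacity: '0.5' }
};

const audio = document.querySelector('audio');
const el = document.querySelector('#animated'); // hypothetical element
const applied = new Set();

audio.addEventListener('timeupdate', () => {
  const second = Math.floor(audio.currentTime);
  if (second in animations && !applied.has(second)) {
    applied.add(second);
    Object.assign(el.style, animations[second]);
  }
});
```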
There is no way for you to measure the latency directly, but any AudioElement generates events such as 'playing' when it has just started or resumed playing (fired quite often), 'stalled' when streaming has stopped, and 'waiting' while data is loading. So what you can do is manipulate your animation based on these events: pause it while 'stalled' or 'waiting' is firing, then continue it once 'playing' fires again.
But I advise you to check the other events that might affect your flow ('error', for example, would be important for you).
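A minimal sketch of that idea, with hypothetical pauseAnimation()/resumeAnimation() hooks:

```javascript
const audio = document.querySelector('audio');

// Freeze the animation whenever the stream falls behind...
['stalled', 'waiting'].forEach(type =>
  audio.addEventListener(type, () => pauseAnimation()) // hypothetical hook
);

// ...and resume it once audio is actually being rendered again.
audio.addEventListener('playing', () => resumeAnimation()); // hypothetical hook

// Surface stream errors instead of letting the animation run blind.
audio.addEventListener('error', () => {
  console.error('audio error:', audio.error);
  pauseAnimation();
});
```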
https://developer.mozilla.org/en-US/docs/Web/API/HTMLAudioElement
What I would try first is to create a timestamp with performance.now(), process the data, and record it in a blob with the new MediaRecorder API.
The recorder will ask the user for access to their audio hardware; this can be a problem for your app, but it looks mandatory for getting the real latency.
As soon as this is done, there are many ways to measure the actual latency between the generation and the actual rendering: basically, by detecting a known sound event in the recording.
For further reference and examples:
Recorder demo
https://github.com/mdn/web-dictaphone/
https://developer.mozilla.org/en-US/docs/Web/API/MediaRecorder_API/Using_the_MediaRecorder_API
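A skeleton of that approach; how the known sound is located inside the recorded blob is left open, and analyzeRecording() is a hypothetical function:

```javascript
// 1. Timestamp the moment the generated audio is handed off for playback.
const sentAt = performance.now();

// 2. Record what actually comes out of the audio hardware
//    (the user must grant microphone access).
navigator.mediaDevices.getUserMedia({ audio: true }).then(stream => {
  const recorder = new MediaRecorder(stream);
  const chunks = [];
  recorder.ondataavailable = event => chunks.push(event.data);
  recorder.onstop = () => {
    const blob = new Blob(chunks, { type: recorder.mimeType });
    // 3. Find the known sound event in the recording and compare its
    //    offset against sentAt to estimate the end-to-end latency.
    analyzeRecording(blob, sentAt); // hypothetical analysis step
  };
  recorder.start();
  setTimeout(() => recorder.stop(), 5000); // capture a 5 s window
});
```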