Does Twilio have any support for pausing and resuming playback of content? I have fairly long audio files that will be played to the caller, and I'm trying to find a way to implement pause and resume functionality. In the middle of playback, I want the caller to be able to press a digit to pause, and then later press a digit again to resume from the same point in the audio file where it was paused.
Does Twilio support something like that?
I did find a workable, albeit not perfect, way to pause and resume playback with Twilio.
The basic idea is to calculate the difference in time between when the Play command was generated and when the Gather URL is called. That difference (assuming a perfect world for a moment) is how far the content had played before the caller interrupted it. Then, when the caller is ready to resume, generate a Play command that makes your app server deliver not the full content but a partial, offset version that begins right at the point where playback should resume (which implies that you need a mechanism for delivering only part of an audio file). This essentially emulates pause/resume functionality.
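The timestamp arithmetic described above can be sketched in isolation. This is only an illustration: the class and method names (`PlaybackOffset`, `VirtualStart`, `ElapsedMilliseconds`) are made up for this example, and the clamping of negative or oversized values is an added safeguard rather than part of the approach itself.

```csharp
using System;

public static class PlaybackOffset
{
    // Shifts "now" back by the amount of audio already played, so that a later
    // ElapsedMilliseconds call yields the position within the whole file.
    public static DateTime VirtualStart(DateTime nowUtc, int millisecondsOffset)
    {
        return nowUtc - TimeSpan.FromMilliseconds(millisecondsOffset);
    }

    // Estimates how many milliseconds of the file had been played when the
    // caller pressed a key, ignoring network and processing latency.
    public static int ElapsedMilliseconds(DateTime virtualStartUtc, DateTime nowUtc)
    {
        double elapsed = (nowUtc - virtualStartUtc).TotalMilliseconds;
        if (elapsed < 0) return 0;                       // guard against clock skew
        if (elapsed > int.MaxValue) return int.MaxValue; // guard against overflow
        return (int)elapsed;
    }
}
```

For example, if playback resumes 3000 ms into the file and the caller presses a key 2500 ms later, the estimated position is 5500 ms.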
I've implemented this and it more or less works. The imperfect world comes into play here: network latency, processing delays (the time between Twilio receiving the Play command, retrieving the audio resource, and actually starting playback), and the delay between the caller pressing a key and your server receiving the Gather request all affect accuracy. But if your requirements aren't too strict, the accuracy is probably decent enough for most cases.
Here is the proof of concept that I did in C# (it's been a few months, so I hope it still works as posted). It also includes some experimentation with fast-forwarding and rewinding, which simply adjusts where the resume actually starts (and skips the Pause command).
The code below is PausablePlayController.cs, which generates the TwiML with Play, Pause, and other commands.
The Play action (not the TwiML command) generates TwiML for playing content. The playback is interruptible because it is wrapped in a Gather that points to the Pause action. The Gather URL contains a timestamp of when playback began (if playback started from an offset, the timestamp is shifted back in time by that offset).
The Pause action (not the TwiML command) generates TwiML for pausing or seeking. In the code below, 4 rewinds, 5 restarts from the beginning, 6 fast-forwards, and any other key pauses.
public class PausablePlayController : ApiController
{
    private const int seekDeltaMilliseconds = 5000;

    // GET api/pausableplay/5
    [HttpGet]
    public System.Xml.Linq.XElement Play(string audio, int millisecondsOffset)
    {
        // Shift the start time back by the offset already played, so that
        // (now - playStart) later gives the position within the whole file.
        long playStart = DateTime.UtcNow.Subtract(TimeSpan.FromMilliseconds(millisecondsOffset)).Ticks;

        TwilioResponse twiml = new TwilioResponse();
        // Wrapping Play in Gather makes the playback interruptible by a key press.
        twiml.BeginGather(new { action = this.Url.Link("PausablePlayPause", new { audio = audio, playStart = playStart }), method = "GET", numDigits = "1" });
        twiml.Play(this.Url.Link("OffsetPresentations", new { audio = audio, millisecondsOffset = millisecondsOffset }));
        twiml.EndGather();
        return twiml.Element;
    }

    [HttpGet]
    public System.Xml.Linq.XElement Pause(string audio, long playStart, int digits)
    {
        // Estimate how far into the file playback had progressed.
        DateTime playStartDate = new DateTime(playStart, DateTimeKind.Utc);
        int millisecondsOffset = (int)DateTime.UtcNow.Subtract(playStartDate).TotalMilliseconds;
        TwilioResponse twiml = new TwilioResponse();
        switch (digits)
        {
            case 4: // rewind, but never before the start of the file
                millisecondsOffset -= (millisecondsOffset < seekDeltaMilliseconds) ? millisecondsOffset : seekDeltaMilliseconds;
                return Play(audio, millisecondsOffset);
            case 5: // restart from the beginning
                return Play(audio, 0);
            case 6: // fast-forward
                millisecondsOffset += seekDeltaMilliseconds;
                return Play(audio, millisecondsOffset);
            default: // pause: wait silently until the caller presses a key to resume
                {
                    twiml.BeginGather(new { action = this.Url.Link("PausablePlayPlay", new { audio = audio, millisecondsOffset = millisecondsOffset }), method = "GET", numDigits = "1" });
                    twiml.Pause(120);
                    twiml.EndGather();
                    twiml.Say("Goodbye!");
                }
                break;
        }
        return twiml.Element;
    }
}
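To make the shape of the generated TwiML easier to see, here is a minimal sketch that builds roughly the same documents with `System.Xml.Linq` instead of the Twilio helper library. The class name `TwimlSketch`, its method names, and the URL parameters are hypothetical; only the `Response`, `Gather`, `Play`, `Pause`, and `Say` elements and the `action`, `method`, `numDigits`, and `length` attributes are standard TwiML.

```csharp
using System;
using System.Xml.Linq;

public static class TwimlSketch
{
    // Interruptible playback: Play is nested in Gather, so a single key press
    // stops the audio and makes Twilio request pauseUrl.
    public static XElement InterruptiblePlay(string pauseUrl, string audioUrl)
    {
        return new XElement("Response",
            new XElement("Gather",
                new XAttribute("action", pauseUrl),
                new XAttribute("method", "GET"),
                new XAttribute("numDigits", "1"),
                new XElement("Play", audioUrl)));
    }

    // Paused state: silence nested in Gather, so a key press requests resumeUrl.
    // If no key is pressed, the call falls through to Say.
    public static XElement Paused(string resumeUrl, int maxPauseSeconds)
    {
        return new XElement("Response",
            new XElement("Gather",
                new XAttribute("action", resumeUrl),
                new XAttribute("method", "GET"),
                new XAttribute("numDigits", "1"),
                new XElement("Pause", new XAttribute("length", maxPauseSeconds))),
            new XElement("Say", "Goodbye!"));
    }
}
```

Calling `TwimlSketch.InterruptiblePlay(pauseUrl, audioUrl).ToString()` gives the XML to return to Twilio; in the controller above, `pauseUrl` would carry the `playStart` ticks and `audioUrl` the `millisecondsOffset`.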
The rest of the trick is in the next controller, which streams partial audio content (parts of the code came from another post that I unfortunately no longer have a reference to). All it does is calculate where the audio content for a given millisecond offset begins and stream the rest of the content from that point.
public class OffsetedContentController : ApplicationController
{
    const int BufferSize = 32 * 1024;

    // GET api/prompts/5
    public Task<HttpResponseMessage> Get(string audio, [FromUri]int millisecondsOffset)
    {
        string contentFilePath = audio; // Build physical path for your audio content
        if (!File.Exists(contentFilePath))
        {
            return Task.FromResult(Request.CreateResponse(HttpStatusCode.NotFound));
        }

        // Open the file and stream the response from it. If that fails, return 503 Service Unavailable
        try
        {
            // Create StreamContent from FileStream. FileStream will get closed when StreamContent is closed
            FileStream fStream = new FileStream(contentFilePath, FileMode.Open, FileAccess.Read, FileShare.Read, BufferSize, useAsync: true);
            // Skip the part of the file that has already been played
            fStream.Position = getPositionOffset(millisecondsOffset);
            HttpResponseMessage response = Request.CreateResponse();
            response.Content = new StreamContent(fStream);
            response.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("audio/ulaw");
            return Task.FromResult(response);
        }
        catch (Exception e)
        {
            return Task.FromResult(Request.CreateErrorResponse(HttpStatusCode.ServiceUnavailable, e));
        }
    }

    private long getPositionOffset(int millisecondsOffset)
    {
        // Bytes of audio per millisecond. The factor 4 is specific to the encoding
        // of the files used here; adjust it for your own audio format.
        long bytePosition = (long)millisecondsOffset * 4;
        return bytePosition;
    }
}
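The bytes-per-millisecond factor in `getPositionOffset` depends on how the audio is encoded. As a hedged sketch, assuming raw, headerless, constant-bit-rate audio, the byte position can be derived from the sample rate, sample size, and channel count. The `AudioOffset` class and its parameters are hypothetical and not part of the controller above; for 8 kHz, 8-bit, mono µ-law the result is 8 bytes per millisecond, so the factor of 4 used above presumably reflects a different format in the original files.

```csharp
using System;

public static class AudioOffset
{
    // Converts a time offset into a byte position for raw (headerless) audio.
    // The result always falls on a sample-frame boundary and is clamped to
    // the range [0, fileLength].
    public static long BytePosition(int millisecondsOffset, int sampleRateHz,
                                    int bytesPerSample, int channels, long fileLength)
    {
        if (millisecondsOffset <= 0) return 0;

        long frameSize = (long)bytesPerSample * channels;
        // Number of whole sample frames that fit into the offset (long arithmetic).
        long frames = (long)millisecondsOffset * sampleRateHz / 1000;
        long position = frames * frameSize;

        // Seeking past the end would produce an empty response; stop at the end instead.
        return Math.Min(position, fileLength);
    }
}
```

With these assumptions, `AudioOffset.BytePosition(1000, 8000, 1, 1, fileLength)` is 8000 for any file of at least 8000 bytes. A file with a header (WAV, for instance) would additionally need the header size added and the header re-sent, which this sketch does not handle.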
There's no way of doing this using TwiML. You can only implement this logic on your server and then restart the audio with another Play tag, but I believe that will introduce quite a bit of delay, as Twilio will need to download the audio file from your server and then transcode it before it can be played (in addition to the time your server needs to generate the new audio file).