The goal of this question is to figure out how to play streaming audio using pyglet. The first step is making sure you're able to play mp3 files with pyglet; that's the purpose of this first snippet:
import sys
import inspect
import requests
import pyglet
from pyglet.media import *

pyglet.lib.load_library('avbin')
pyglet.have_avbin = True


def url_to_filename(url):
    return url.split('/')[-1]


def download_file(url, filename=None):
    filename = filename or url_to_filename(url)
    with open(filename, "wb") as f:
        print("Downloading %s" % filename)
        response = requests.get(url, stream=True)
        total_length = response.headers.get('content-length')
        if total_length is None:
            f.write(response.content)
        else:
            dl = 0
            total_length = int(total_length)
            for data in response.iter_content(chunk_size=4096):
                dl += len(data)
                f.write(data)
                done = int(50 * dl / total_length)
                sys.stdout.write("\r[%s%s]" % ('=' * done, ' ' * (50 - done)))
                sys.stdout.flush()


url = "https://freemusicarchive.org/file/music/ccCommunity/DASK/Abiogenesis/DASK_-_08_-_Protocell.mp3"
filename = "mcve.mp3"
download_file(url, filename)
music = pyglet.media.load(filename)
music.play()
pyglet.app.run()
If you've installed the libraries (pip install pyglet requests) and also installed AVBin, you should at this point be able to listen to the mp3 once it has been downloaded.
Once we've reached this point, I'd like to figure out how to play and buffer the file in a similar way to most of the existing web video/audio players, using pyglet+requests. This means playing the file without waiting until it has been downloaded completely.
After reading the pyglet media docs, you can see these classes are available:
media
    sources
        base
            AudioData
            AudioFormat
            Source
            SourceGroup
            SourceInfo
            StaticSource
            StreamingSource
            VideoFormat
    player
        Player
        PlayerGroup
I've seen there are other similar SO questions, but they haven't been solved properly and their content doesn't provide many relevant details:
- Play streaming audio using pyglet
- How can I play audio stream without saving it into the file with pyglet?
That's why I've created a new question. How do you play streaming audio using pyglet? Could you provide a little example using the above mcve as a base?
Assuming you don't want to import a new package to do this for you - this can be done with a bit of effort.
First, let's head over to the pyglet source code and have a look at media.load in media/__init__.py.
def load(filename, file=None, streaming=True, decoder=None):
    """Load a Source from a file.
    All decoders that are registered for the filename extension are tried.
    If none succeed, the exception from the first decoder is raised.
    You can also specifically pass a decoder to use.
    :Parameters:
        `filename` : str
            Used to guess the media format, and to load the file if `file` is
            unspecified.
        `file` : file-like object or None
            Source of media data in any supported format.
        `streaming` : bool
            If `False`, a :class:`StaticSource` will be returned; otherwise
            (default) a :class:`~pyglet.media.StreamingSource` is created.
        `decoder` : MediaDecoder or None
            A specific decoder you wish to use, rather than relying on
            automatic detection. If specified, no other decoders are tried.
    :rtype: StreamingSource or Source
    """
    if decoder:
        return decoder.decode(file, filename, streaming)
    else:
        first_exception = None
        for decoder in get_decoders(filename):
            try:
                loaded_source = decoder.decode(file, filename, streaming)
                return loaded_source
            except MediaDecodeException as e:
                if not first_exception or first_exception.exception_priority < e.exception_priority:
                    first_exception = e
        # TODO: Review this:
        # The FFmpeg codec attempts to decode anything, so this codepath won't be reached.
        if not first_exception:
            raise MediaDecodeException('No decoders are available for this media format.')
        raise first_exception


add_default_media_codecs()
The critical line here is loaded_source = decoder.decode(...). Essentially, to load audio pyglet takes a file and hands it over to a media decoder (e.g. FFmpeg), which then returns a source of 'frames' or packets that pyglet can play with its built-in Player class. If the audio format is compressed (e.g. mp3 or aac), pyglet relies on an external library (AVBin in older releases, FFmpeg from pyglet 1.4 onward) to convert it to raw, decompressed audio. You probably already know some of this.
So if we want to see how we can stuff a stream of bytes into pyglet's audio engine rather than a file, we'll need to take a look at one of the decoders. For this example, let's use FFmpeg as it's the easiest to access.
In media/codecs/ffmpeg.py:
class FFmpegDecoder(object):
    def get_file_extensions(self):
        return ['.mp3', '.ogg']

    def decode(self, file, filename, streaming):
        if streaming:
            return FFmpegSource(filename, file)
        else:
            return StaticSource(FFmpegSource(filename, file))
Although it is declared as inheriting from object, it implements the MediaDecoder interface found in media/codecs/__init__.py. Back at the load function in media/__init__.py, you'll see pyglet chooses a MediaDecoder based on file extension, then calls its decode method with the file as a parameter to get the audio in the form of a packet stream. That packet stream is a Source object; each decoder has its own flavor, in the form of StaticSource or StreamingSource. The former stores the decoded audio in memory, and the latter decodes it as it is played. FFmpeg's decoder always builds a StreamingSource (FFmpegSource) and, as the decode method above shows, wraps it in a StaticSource when streaming is False.
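To make the dispatch step concrete, here is a toy, stdlib-only model of how a loader can pick a decoder by file extension and fall back to the next one on failure. Everything in it (ToyDecoder, DecodeError, toy_load) is a hypothetical stand-in, not pyglet's API, and it simplifies pyglet's real get_decoders logic.

```python
import os


class DecodeError(Exception):
    """Hypothetical stand-in for pyglet's MediaDecodeException."""


class ToyDecoder:
    """Hypothetical decoder exposing the same two methods as FFmpegDecoder."""

    def __init__(self, name, extensions, fail=False):
        self.name = name
        self.extensions = extensions
        self.fail = fail

    def get_file_extensions(self):
        return self.extensions

    def decode(self, file, filename, streaming):
        if self.fail:
            raise DecodeError('%s could not decode %s' % (self.name, filename))
        # A real decoder would return a Source object; a tuple is enough here.
        kind = 'streaming' if streaming else 'static'
        return (self.name, kind, filename)


def toy_load(filename, decoders, file=None, streaming=True):
    """Try every decoder registered for the extension, in order."""
    ext = os.path.splitext(filename)[1].lower()
    first_error = None
    for decoder in decoders:
        if ext not in decoder.get_file_extensions():
            continue
        try:
            return decoder.decode(file, filename, streaming)
        except DecodeError as error:
            # Remember only the first failure, like the load() shown earlier.
            if first_error is None:
                first_error = error
    if first_error is None:
        raise DecodeError('No decoders are available for this media format.')
    raise first_error


if __name__ == '__main__':
    registry = [ToyDecoder('broken', ['.mp3'], fail=True),
                ToyDecoder('working', ['.mp3', '.ogg'])]
    print(toy_load('mcve.mp3', registry))
    print(toy_load('mcve.ogg', registry, streaming=False))
```

The point of the model is that the decoder, not load(), decides what kind of source comes back, which is why a streaming-from-bytes solution has to live at the decoder level.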
FFmpeg's source class is FFmpegSource, also located in media/codecs/ffmpeg.py. We find this Goliath of a class:
class FFmpegSource(StreamingSource):
    # Max increase/decrease of original sample size
    SAMPLE_CORRECTION_PERCENT_MAX = 10

    def __init__(self, filename, file=None):
        if file is not None:
            raise NotImplementedError('Loading from file stream is not supported')
        self._file = ffmpeg_open_filename(asbytes_filename(filename))
        if not self._file:
            raise FFmpegException('Could not open "{0}"'.format(filename))
        self._video_stream = None
        self._video_stream_index = None
        self._audio_stream = None
        self._audio_stream_index = None
        self._audio_format = None
        self.img_convert_ctx = POINTER(SwsContext)()
        self.audio_convert_ctx = POINTER(SwrContext)()
        file_info = ffmpeg_file_info(self._file)
        self.info = SourceInfo()
        self.info.title = file_info.title
        self.info.author = file_info.author
        self.info.copyright = file_info.copyright
        self.info.comment = file_info.comment
        self.info.album = file_info.album
        self.info.year = file_info.year
        self.info.track = file_info.track
        self.info.genre = file_info.genre
        # Pick the first video and audio streams found, ignore others.
        for i in range(file_info.n_streams):
            info = ffmpeg_stream_info(self._file, i)
            if isinstance(info, StreamVideoInfo) and self._video_stream is None:
                stream = ffmpeg_open_stream(self._file, i)
                self.video_format = VideoFormat(
                    width=info.width,
                    height=info.height)
                if info.sample_aspect_num != 0:
                    self.video_format.sample_aspect = (
                        float(info.sample_aspect_num) /
                        info.sample_aspect_den)
                self.video_format.frame_rate = (
                    float(info.frame_rate_num) /
                    info.frame_rate_den)
                self._video_stream = stream
                self._video_stream_index = i
            elif (isinstance(info, StreamAudioInfo) and
                  info.sample_bits in (8, 16) and
                  self._audio_stream is None):
                stream = ffmpeg_open_stream(self._file, i)
                self.audio_format = AudioFormat(
                    channels=min(2, info.channels),
                    sample_size=info.sample_bits,
                    sample_rate=info.sample_rate)
                self._audio_stream = stream
                self._audio_stream_index = i
                channel_input = avutil.av_get_default_channel_layout(info.channels)
                channels_out = min(2, info.channels)
                channel_output = avutil.av_get_default_channel_layout(channels_out)
                sample_rate = stream.codec_context.contents.sample_rate
                sample_format = stream.codec_context.contents.sample_fmt
                if sample_format in (AV_SAMPLE_FMT_U8, AV_SAMPLE_FMT_U8P):
                    self.tgt_format = AV_SAMPLE_FMT_U8
                elif sample_format in (AV_SAMPLE_FMT_S16, AV_SAMPLE_FMT_S16P):
                    self.tgt_format = AV_SAMPLE_FMT_S16
                elif sample_format in (AV_SAMPLE_FMT_S32, AV_SAMPLE_FMT_S32P):
                    self.tgt_format = AV_SAMPLE_FMT_S32
                elif sample_format in (AV_SAMPLE_FMT_FLT, AV_SAMPLE_FMT_FLTP):
                    self.tgt_format = AV_SAMPLE_FMT_S16
                else:
                    raise FFmpegException('Audio format not supported.')
                self.audio_convert_ctx = swresample.swr_alloc_set_opts(None,
                                                                       channel_output,
                                                                       self.tgt_format, sample_rate,
                                                                       channel_input, sample_format,
                                                                       sample_rate,
                                                                       0, None)
                if (not self.audio_convert_ctx or
                        swresample.swr_init(self.audio_convert_ctx) < 0):
                    swresample.swr_free(self.audio_convert_ctx)
                    raise FFmpegException('Cannot create sample rate converter.')
        self._packet = ffmpeg_init_packet()
        self._events = []  # They don't seem to be used!
        self.audioq = deque()
        # Make queue big enough to accomodate 1.2 sec?
        self._max_len_audioq = 50  # Need to figure out a correct amount
        if self.audio_format:
            # Buffer 1 sec worth of audio
            self._audio_buffer = \
                (c_uint8 * ffmpeg_get_audio_buffer_size(self.audio_format))()
        self.videoq = deque()
        self._max_len_videoq = 50  # Need to figure out a correct amount
        self.start_time = self._get_start_time()
        self._duration = timestamp_from_ffmpeg(file_info.duration)
        self._duration -= self.start_time
        # Flag to determine if the _fillq method was already scheduled
        self._fillq_scheduled = False
        self._fillq()
        # Don't understand why, but some files show that seeking without
        # reading the first few packets results in a seeking where we lose
        # many packets at the beginning.
        # We only seek back to 0 for media which have a start_time > 0
        if self.start_time > 0:
            self.seek(0.0)
---
[A few hundred lines more...]
---
    def get_next_video_timestamp(self):
        if not self.video_format:
            return
        if self.videoq:
            while True:
                # We skip video packets which are not video frames
                # This happens in mkv files for the first few frames.
                video_packet = self.videoq[0]
                if video_packet.image == 0:
                    self._decode_video_packet(video_packet)
                if video_packet.image is not None:
                    break
                self._get_video_packet()
            ts = video_packet.timestamp
        else:
            ts = None
        if _debug:
            print('Next video timestamp is', ts)
        return ts

    def get_next_video_frame(self, skip_empty_frame=True):
        if not self.video_format:
            return
        while True:
            # We skip video packets which are not video frames
            # This happens in mkv files for the first few frames.
            video_packet = self._get_video_packet()
            if video_packet.image == 0:
                self._decode_video_packet(video_packet)
            if video_packet.image is not None or not skip_empty_frame:
                break
        if _debug:
            print('Returning', video_packet)
        return video_packet.image

    def _get_start_time(self):
        def streams():
            format_context = self._file.context
            for idx in (self._video_stream_index, self._audio_stream_index):
                if idx is None:
                    continue
                stream = format_context.contents.streams[idx].contents
                yield stream

        def start_times(streams):
            yield 0
            for stream in streams:
                start = stream.start_time
                if start == AV_NOPTS_VALUE:
                    yield 0
                start_time = avutil.av_rescale_q(start,
                                                 stream.time_base,
                                                 AV_TIME_BASE_Q)
                start_time = timestamp_from_ffmpeg(start_time)
                yield start_time

        return max(start_times(streams()))

    @property
    def audio_format(self):
        return self._audio_format

    @audio_format.setter
    def audio_format(self, value):
        self._audio_format = value
        if value is None:
            self.audioq.clear()
The line you'll be interested in here is self._file = ffmpeg_open_filename(asbytes_filename(filename)). This brings us here, once again in media/codecs/ffmpeg.py:
def ffmpeg_open_filename(filename):
    """Open the media file.
    :rtype: FFmpegFile
    :return: The structure containing all the information for the media.
    """
    file = FFmpegFile()  # TODO: delete this structure and use directly AVFormatContext
    result = avformat.avformat_open_input(byref(file.context),
                                          filename,
                                          None,
                                          None)
    if result != 0:
        raise FFmpegException('Error opening file ' + filename.decode("utf8"))
    result = avformat.avformat_find_stream_info(file.context, None)
    if result < 0:
        raise FFmpegException('Could not find stream info')
    return file
This is where things get messy: it calls a ctypes function (avformat_open_input) that, when given a path, opens the file and fills out all the information our FFmpegSource class needs. With some work, you should be able to get avformat_open_input to read from a bytes object rather than from a path; in FFmpeg's C API this is normally done by attaching a custom AVIOContext (created with avio_alloc_context) to the format context before opening it. I'd love to do this and include a working example, but I don't have the time right now. You'd then need a new ffmpeg_open_filename function that uses this modified call, a new FFmpegSource class that uses the new ffmpeg_open_filename, and finally a new FFmpegDecoder class that uses the new FFmpegSource.
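As a starting point for that work, the piece FFmpeg needs in order to read from memory is a read callback with the C signature int read_packet(void *opaque, uint8_t *buf, int buf_size), which is what avio_alloc_context accepts. The sketch below only builds such a callback with ctypes around any Python file-like object and exercises it directly; it does not call into FFmpeg or pyglet, and make_read_callback is a hypothetical helper name.

```python
import ctypes
import io

# C signature of the read callback accepted by avio_alloc_context:
#     int read_packet(void *opaque, uint8_t *buf, int buf_size)
READ_PACKET = ctypes.CFUNCTYPE(
    ctypes.c_int,                    # return: bytes read or a negative error
    ctypes.c_void_p,                 # opaque (unused in this sketch)
    ctypes.POINTER(ctypes.c_uint8),  # destination buffer owned by the caller
    ctypes.c_int)                    # capacity of the destination buffer


def _fferrtag(a, b, c, d):
    """Mirror FFmpeg's FFERRTAG macro, which builds negative error codes."""
    return -(ord(a) | ord(b) << 8 | ord(c) << 16 | ord(d) << 24)


AVERROR_EOF = _fferrtag('E', 'O', 'F', ' ')


def make_read_callback(stream):
    """Wrap a file-like object (hypothetical helper) as a C read callback.

    The caller must keep the returned object alive for as long as C code
    may invoke it, otherwise ctypes will free the underlying thunk.
    """
    def read_packet(opaque, buf, buf_size):
        data = stream.read(buf_size)
        if not data:
            return AVERROR_EOF
        # Copy the Python bytes into the C buffer supplied by the caller.
        ctypes.memmove(buf, data, len(data))
        return len(data)

    return READ_PACKET(read_packet)


if __name__ == '__main__':
    callback = make_read_callback(io.BytesIO(b'pretend these are mp3 bytes'))
    target = (ctypes.c_uint8 * 8)()
    count = callback(None, target, len(target))
    print(count, bytes(target[:count]))
```

Wiring this into pyglet would still require declaring avio_alloc_context and the related structures in pyglet's ctypes bindings, which is the part left open here.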
You could then implement this by adding it to your pyglet package directly. After that, you'd want to add support for a bytes object argument in the load() function (located in media/__init__.py) and override the decoder with your new one. With that in place, you would be able to stream audio without saving it.
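On the download side, the decoder would need something file-like to read from while requests is still fetching data. A minimal sketch of such an object, assuming one producer thread and one consumer, could look like this; StreamBuffer and feed are hypothetical names, and nothing here is part of pyglet or requests.

```python
import threading


class StreamBuffer:
    """Thread-safe byte buffer: one thread appends, another reads like a file."""

    def __init__(self):
        self._data = bytearray()
        self._pos = 0
        self._closed = False
        self._cond = threading.Condition()

    def write(self, chunk):
        with self._cond:
            self._data.extend(chunk)
            self._cond.notify_all()

    def close(self):
        """Signal that no more data will arrive."""
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def read(self, size):
        """Return up to `size` bytes, blocking until data or end of stream."""
        with self._cond:
            while self._pos >= len(self._data) and not self._closed:
                self._cond.wait()
            chunk = bytes(self._data[self._pos:self._pos + size])
            self._pos += len(chunk)
            return chunk  # b'' means the stream is finished


def feed(buffer, chunks):
    """Producer: push chunks into the buffer, then mark the end of stream."""
    for chunk in chunks:
        buffer.write(chunk)
    buffer.close()


if __name__ == '__main__':
    buf = StreamBuffer()
    # With requests, `chunks` could be response.iter_content(chunk_size=4096).
    producer = threading.Thread(target=feed, args=(buf, [b'abc', b'def', b'g']))
    producer.start()
    received = bytearray()
    while True:
        piece = buf.read(4)
        if not piece:
            break
        received.extend(piece)
    producer.join()
    print(bytes(received))
```

This version keeps every byte in memory so that seeking backwards stays possible; a long-running radio stream would need to discard data that has already been consumed.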
Or you could simply use a package that already supports it; python-vlc does. You could use the example here to play whatever audio you'd like from a link. If you aren't doing this just for the challenge, I would strongly recommend using another package. Otherwise: good luck.
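For reference, a minimal python-vlc sketch might look like the following. It assumes VLC itself is installed alongside pip install python-vlc; play_url and main are hypothetical helper names, and the URL is the one from the question.

```python
def play_url(url):
    """Start playing `url` with python-vlc and return the player object."""
    # Imported lazily so this module can be loaded without python-vlc.
    import vlc

    player = vlc.MediaPlayer(url)
    player.play()  # returns immediately; playback runs in the background
    return player


def main():
    url = ("https://freemusicarchive.org/file/music/ccCommunity/DASK/"
           "Abiogenesis/DASK_-_08_-_Protocell.mp3")
    player = play_url(url)
    # Keep the process alive, otherwise playback stops when the script exits.
    input("Playing, press Enter to stop\n")
    player.stop()
```

Calling main() starts playback as soon as VLC has buffered enough data, without writing the file to disk first.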