How do I retrieve individual video URLs from a pla

Posted 2020-05-29 10:54

Question:

I'm trying to:

  • Use youtube-dl within a Python script to download videos periodically
  • Organize/name the videos dynamically using YouTube metadata, e.g. %(title)s
  • Extract audio/MP3 and move those files into a subdirectory named 'MP3'

I'm pretty new to Python, and I'm sure there are some messy, unnecessary bits of code; so, I'm open to clean-up advice too.

I have run into an issue: when I enter a playlist URL (instead of an individual video URL), I only get the playlist name instead of the individual title and uploader data I use to sort files. (I also don't know how, or whether, I can reuse the outtmpl option/variable throughout the code.)

I actually split the code into three pieces/modules.

The basic example of the problem is this - I enter:

'outtmpl': 'F:\\Videos\\Online Videos\\Comics\\%(uploader)s\\%(playlist)s\\%(playlist_index)s_%(title)s.%(ext)s'

Which would save the videos to:

'F:\\Videos\\Online Videos\\Comics\\Comicstorian\\Annhililation\\01_Annihilation Beginnings Drax Earthfall - Complete Story.mp4' - and so on (for the rest of the videos)

But I don't know how to pass the directory variables on to the module where I move the files.

Here is the code - the three modules/parts


PyFile_Download_Exmple.py

from __future__ import unicode_literals
import youtube_dl
import Move_MP3
import ytdl_variables

#Uses variables from ytdl_variables script and downloads the video

with youtube_dl.YoutubeDL(ytdl_variables.ydl_opts) as ydl:
    ydl.download([ytdl_variables.video_url])

#Calls script to create folder and move MP3 files
Move_MP3

ytdl_variables.py

from __future__ import unicode_literals
import youtube_dl

global video_title, uploader, playlist, playlist_index, video_url, ydl_opts, ydl

video_url = 'https://www.youtube.com/playlist?list=PL6FhCd_HO_ACJzTiLKfETgzLc1an_t05i'


ydl_opts = {
    'format': 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best',
    'outtmpl': 'F:\\Videos\\Online Videos\\Comics\\%(uploader)s\\%(playlist)s\\%(playlist_index)s_%(title)s.%(ext)s',
    'postprocessors': [{
        'key': 'FFmpegExtractAudio',
        'preferredcodec': 'mp3',
        'preferredquality': '192',
    }],
    'download_archive': 'F:\\Videos\\Online Videos\\Archive.txt',
}

with youtube_dl.YoutubeDL(ydl_opts) as ydl:
    # The next part creates a variable that returns info when provided the video_url variable >> http://stackoverflow.com/questions/23727943/how-to-get-information-from-youtube-dl-in-python
    #
    # Code here should take the YouTube playlist and produce each video URL
    # as the vLinks variable for the next step, but I haven't figured out how
    # to get the title etc. for each video in a playlist.
    #
    #   vLink = individual video URL from the playlist
    #
    # The following puts the actual info into variables for Python to use.
    # These are made global above. I made a 'for' loop to repeat grabbing info
    # for each video, but it doesn't work right now because vLinks isn't defined.
    for vLink in vLinks:
        info_dict = ydl.extract_info(vLink, download=False)
        video_title = info_dict.get('title', None)
        playlist_index = info_dict.get('playlist_index', None)
        playlist = info_dict.get('playlist', None)
        uploader = info_dict.get('uploader', None)
        print(video_title)

#Checks if the video is in a playlist; if it's not, 'NA' will be the string returned: http://stackoverflow.com/questions/23086383/how-to-test-nonetype-in-python

if playlist is None:
    playlist = 'NA'

if playlist_index is None:
    playlist_index = 'NA'

Move_MP3.py

from __future__ import unicode_literals
import ytdl_variables
import shutil
import os, os.path

#Sets variables for renaming the files
newfolder = 'F:\\Videos\\Online Videos\\Comics\\' + ytdl_variables.uploader + '\\' + ytdl_variables.playlist + '\\MP3\\'
oa_savedir = 'F:\\Videos\\Online Videos\\Comics\\' + ytdl_variables.uploader + '\\' + ytdl_variables.playlist + '\\' + ytdl_variables.playlist_index + '_' + ytdl_variables.video_title + '.mp3'
fa_savedir = 'F:\\Videos\\Online Videos\\Comics\\' + ytdl_variables.uploader + '\\' + ytdl_variables.playlist + '\\MP3\\' + ytdl_variables.playlist_index + '_' + ytdl_variables.video_title + '.mp3'

#Function that creates file directory before moving file there - changed from http://stackoverflow.com/questions/23793987/python-write-file-to-directory-doesnt-exist
def mkdir_p(path):
    if not os.path.exists(path):
        os.makedirs(path)

#Function that checks whether the file already exists where I want to move it >> http://stackabuse.com/python-check-if-a-file-or-directory-exists/
def chkfl_p(path):
    if not os.path.isfile(path):
        shutil.move(oa_savedir, fa_savedir)

#Calls function to look for \MP3 directory and creates directory if it doesn't exist
mkdir_p(newfolder)
#Calls function to look for file and moves file if it isn't already there
chkfl_p(fa_savedir)

Answer 1:

I'm still working on cleaning this up some more, but I discovered the answer in a section of another answer...

To get the individual links from a playlist URL:

import youtube_dl

# yt_url is a variable holding either a video URL or a playlist URL
ydl = youtube_dl.YoutubeDL({'outtmpl': '%(id)s%(ext)s', 'quiet': True})
videos = []

with ydl:
    # We just want to extract the info, not download anything
    result = ydl.extract_info(yt_url, download=False)

    if 'entries' in result:
        # Can be a playlist or a list of videos
        videos = result['entries']

        # Loop over the entries to grab each video's info
        for i, item in enumerate(videos):
            video = result['entries'][i]
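
A slightly tidier variant of the loop above, as a minimal sketch only. iter_entries is a hypothetical helper name (it is not part of youtube_dl); it takes the dictionary returned by extract_info(..., download=False) and yields one info dictionary per video, whether the URL pointed at a playlist or a single video. The sample dictionaries are made up and only imitate the shape described in this answer, and skipping None entries is an assumption about how unavailable videos might show up.

```python
def iter_entries(result):
    """Yield one info dict per video from an extract_info() result.

    Hypothetical helper: a playlist result carries its videos under
    'entries'; a single-video result is treated as a one-item playlist.
    """
    if result is None:
        return
    entries = result.get('entries')
    if entries is None:
        # No 'entries' key: assume this is a single video.
        yield result
        return
    for entry in entries:
        # Assumption: an unavailable video may appear as None, so skip it.
        if entry is not None:
            yield entry


# Made-up stand-ins for what ydl.extract_info(yt_url, download=False) returns.
sample_playlist = {
    'title': 'Example playlist',
    'entries': [
        {'webpage_url': 'https://example.com/watch?v=a', 'title': 'First'},
        None,
        {'webpage_url': 'https://example.com/watch?v=b', 'title': 'Second'},
    ],
}
sample_video = {'webpage_url': 'https://example.com/watch?v=c', 'title': 'Solo'}

playlist_urls = [e['webpage_url'] for e in iter_entries(sample_playlist)]
single_urls = [e['webpage_url'] for e in iter_entries(sample_video)]
print(playlist_urls)
print(single_urls)
```

With a real URL, the same helper would be called as iter_entries(ydl.extract_info(yt_url, download=False)), so the calling code does not need a separate branch for playlists and single videos.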

ydl.extract_info() returns the metadata as a Python dictionary (it looks like JSON data). yt_url is a variable holding the URL of either a single video or a playlist.

If the returned data has an "entries" key, it's a playlist. I then loop over each entry (enumerating the entries with index i), and from there I can do what I want with the URLs or other info:

result['entries'][i]['webpage_url']     #url of video
result['entries'][i]['title']           #title of video
result['entries'][i]['uploader']        #username of uploader
result['entries'][i]['playlist']        #name of the playlist
result['entries'][i]['playlist_index']  #order number of video
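
One way these fields could feed the MP3-moving step from the question, sketched under several assumptions. The names mp3_paths, move_mp3 and index_width are hypothetical, as are the demo values. The folder layout is taken from the outtmpl in the question (uploader, then playlist, then index_title), a missing playlist falls back to 'NA' as in the question, and the two-digit zero padding of the index is a guess based on the "01_" example. youtube-dl may also alter some characters in titles when it writes file names, so the computed name might not always match the file on disk. The sketch assumes Python 3 (for exist_ok) and runs against a temporary directory rather than real downloads.

```python
import os
import shutil
import tempfile


def mp3_paths(base_dir, entry, index_width=2):
    """Return (current_path, target_path) for one entry's MP3 file.

    Assumes files were written as
    <base_dir>/<uploader>/<playlist>/<index>_<title>.mp3
    """
    uploader = entry.get('uploader') or 'NA'
    playlist = entry.get('playlist') or 'NA'
    title = entry.get('title') or 'NA'
    index = entry.get('playlist_index')
    # Zero-padding width is an assumption; adjust to match real file names.
    index_text = 'NA' if index is None else str(index).zfill(index_width)
    file_name = '{}_{}.mp3'.format(index_text, title)
    folder = os.path.join(base_dir, uploader, playlist)
    return os.path.join(folder, file_name), os.path.join(folder, 'MP3', file_name)


def move_mp3(base_dir, entry):
    """Move one MP3 into the 'MP3' subfolder; return True if it was moved."""
    source, target = mp3_paths(base_dir, entry)
    # Nothing to do if the source is missing or the target already exists.
    if not os.path.isfile(source) or os.path.isfile(target):
        return False
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.move(source, target)
    return True


# Demonstration in a temporary directory with a made-up entry.
demo_entry = {'uploader': 'SomeUploader', 'playlist': 'SomePlaylist',
              'playlist_index': 1, 'title': 'Some Title'}
demo_base = tempfile.mkdtemp()
demo_source, demo_target = mp3_paths(demo_base, demo_entry)
os.makedirs(os.path.dirname(demo_source))
with open(demo_source, 'w') as handle:
    handle.write('placeholder')
moved = move_mp3(demo_base, demo_entry)
print(moved, os.path.isfile(demo_target))
```

Passing each entry to a function like this avoids the module-level globals from the question: the uploader, playlist and title travel with the entry instead of being shared between scripts.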