YouTube Data API v3 pageToken for arbitrary page

Posted 2019-02-15 14:38

Another question on SO revealed that pageTokens are identical for different searches, provided that the page number and maxResults settings are the same.

Version 2 of the API let you go to any arbitrary page by setting a start position, but v3 only provides next and previous tokens. There's no jumping from page 1 to page 5, even if you know there are 5 pages of results.

So how do we work around this?

3 Answers
Luminary・发光体
#2 · 2019-02-15 14:55

YouTube's pageTokens can be treated as indices.

  • PageTokens for the first 1000 items can be found here.
  • PageTokens for every 10th item in range(1, 100000) can be found here.
  • The highest available pageToken is "CJ-NBhAA", which points to the 100,000th item (position 99,999).
  • The highest possible value for maxResults is 50.

Use pageToken to specify a starting point and maxResults to specify the number of items.

Examples:

  • 1st item

https://www.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=<PLAYLISTID>&key=<APIKEY>&maxResults=1&pageToken=CAAQAA

  • 555th item

https://www.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=<PLAYLISTID>&key=<APIKEY>&maxResults=1&pageToken=CKoEEAA

  • 99999th item

https://www.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=<PLAYLISTID>&key=<APIKEY>&maxResults=1&pageToken=CJ6NBhAA

  • 10 items starting at 10th item

https://www.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=<PLAYLISTID>&key=<APIKEY>&maxResults=10&pageToken=CAkQAA

  • 30 items starting at 555th item

https://www.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=<PLAYLISTID>&key=<APIKEY>&maxResults=30&pageToken=CKoEEAA

  • 50 items starting at 10000th item

https://www.googleapis.com/youtube/v3/playlistItems?part=id%2Csnippet&playlistId=<PLAYLISTID>&key=<APIKEY>&maxResults=50&pageToken=CI9OEAA
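
The example tokens above are all consistent with one reading: the token is URL-safe base64 (padding stripped) of the bytes 0x08, the zero-based position as a base-128 varint, then 0x10 0x00. That layout is inferred from the tokens in this thread, not from any YouTube documentation, so treat the following Python as a sketch under that assumption. The helper names `encode_varint`, `page_token` and `playlist_items_url` are made up for illustration.

```python
import base64
from urllib.parse import urlencode


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (low 7 bits first)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def page_token(position: int) -> str:
    """Token for the item at zero-based `position`.

    Assumed byte layout (inferred from the examples): 08 <varint> 10 00.
    """
    raw = b"\x08" + encode_varint(position) + b"\x10\x00"
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")


def playlist_items_url(playlist_id: str, api_key: str,
                       position: int, max_results: int) -> str:
    """Build a playlistItems request URL starting at `position`."""
    query = urlencode({
        "part": "id,snippet",
        "playlistId": playlist_id,
        "key": api_key,
        "maxResults": max_results,
        "pageToken": page_token(position),
    })
    return "https://www.googleapis.com/youtube/v3/playlistItems?" + query


print(page_token(0))      # CAAQAA   (1st item)
print(page_token(554))    # CKoEEAA  (555th item)
print(page_token(99999))  # CJ-NBhAA (100,000th item)
```

Note that the Nth item sits at position N - 1, so the 555th item uses `page_token(554)`.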

The star"
#3 · 2019-02-15 15:01

For the small result sets I've tested, a YouTube pageToken is six characters long. Here's what I've been able to determine about the format:

  • char 1: Always 'C' that I've seen.
  • char 2-3: Encoded start position.
  • char 4-5: Always 'QA' that I've seen.
  • char 6: 'A' means list items in a position greater than or equal to the start position. 'Q' means list items before the start position.

Due to the nature of character 6, there are two different ways to represent the same page. Given maxResults=1, page 2 can be reached by setting the page token to either "CAEQAA" or "CAIQAQ". The first one means to start at result number 2 (represented by characters 2-3 "AE") and list 1 item. The second means to return one item before result number 3 (represented by characters 2-3 "AI").
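
To check the two spellings of the same page, the tokens can be unpacked. The Python sketch below assumes a token is unpadded URL-safe base64 of the bytes 0x08, a base-128 varint position, 0x10, and a one-byte direction flag (0 for "at or after", 1 for "before"). That layout is a guess that fits the tokens quoted in this thread, and `decode_page_token` is a hypothetical helper name.

```python
import base64
from typing import Tuple


def decode_page_token(token: str) -> Tuple[int, int]:
    """Return (zero-based position, direction flag) under the assumed layout."""
    # Restore the base64 padding that the tokens omit.
    padded = token + "=" * (-len(token) % 4)
    raw = base64.urlsafe_b64decode(padded)
    if raw[0] != 0x08:
        raise ValueError("unexpected leading byte")

    # Read the varint: 7 payload bits per byte, high bit means "continue".
    position, shift, i = 0, 0, 1
    while True:
        byte = raw[i]
        i += 1
        position |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            break

    if raw[i] != 0x10:
        raise ValueError("unexpected second field marker")
    return position, raw[i + 1]


print(decode_page_token("CAEQAA"))  # (1, 0): start at result number 2
print(decode_page_token("CAIQAQ"))  # (2, 1): items before result number 3
```

Both tokens describe the same single-item page: one counts forward from zero-based position 1, the other counts backward from position 2.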

Characters 2-3 are a strange base 16 encoding.

Character 3 uses a list from A-Z, then a-z, then 0-9 and increments by 4 in the list for each increase of 1. The series is A,E,I,M,Q,U,Y,c,g,k,o,s,w,0,4,8. Character 2 goes from A to B to C to D and so on. For my purposes, I'm not working with large result sets, so I haven't bothered to see what happens to the second character beyond a couple hundred results. Perhaps someone working with larger sets will provide an update as to how character 2 behaves after that.
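
One way to explain the step-by-4 series is that characters 2-3 are ordinary base64 digits and the position is shifted left by two bits inside them. Under that assumption (inferred from the tokens in this thread, not documented by YouTube), a six-character token can be built character by character for zero-based positions 0 to 127. `six_char_token` is a hypothetical name used only for this sketch.

```python
import string

# Standard URL-safe base64 alphabet: A-Z, a-z, 0-9, '-', '_'.
BASE64_ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
                   + string.digits + "-_")


def six_char_token(position: int, backward: bool = False) -> str:
    """Build a six-character token for a zero-based position below 128."""
    if not 0 <= position < 128:
        raise ValueError("six-character tokens only cover positions 0..127")
    second = BASE64_ALPHABET[position >> 4]           # A, B, C, ... up to H
    third = BASE64_ALPHABET[(position & 0x0F) << 2]   # steps of 4: A, E, I, ...
    return "C" + second + third + "QA" + ("Q" if backward else "A")


# The third-character series is every 4th symbol of the alphabet.
third_series = [BASE64_ALPHABET[i] for i in range(0, 64, 4)]
print("".join(third_series))  # AEIMQUYcgkosw048

print(six_char_token(1))                 # CAEQAA
print(six_char_token(2, backward=True))  # CAIQAQ
print(six_char_token(16))                # CBAQAA
```

On this reading, character 2 only runs from A to H before the token has to grow beyond six characters, which would match the seven-character tokens seen for larger positions.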

Since the string only contains a start position and an option for ">=" or "<", the same string is used in multiple cases. For instance, with 2 results per page, the start position of the second page is result 3. The pageToken for this is "CAIQAA". This is identical to the token for the third page with one result per page.

Since I'm primarily a PHP person, here's the function I'm using to get the pageToken for a given page:

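// Build the pageToken for page $page with $limit results per page.
// Only handles zero-based start positions 0-127; larger positions
// need longer tokens than the six-character format described here.
function token($limit, $page) {
    // Zero-based position of the first item on the requested page.
    $pos = ($page - 1) * $limit;
    // Third-character series: A,E,I,M,Q,U,Y,c,g,k,o,s,w,0,4,8
    $third_chars = array_merge(
            range("A","Z",4),
            range("c","z",4),
            range(0,9,4));
    // Index with the zero-based position so that positions 15, 31, ...
    // map to '8' instead of reading past the start of the array.
    return 'C'.
           chr(ord('A') + intdiv($pos, 16)).
           $third_chars[$pos % 16].
           'QAA';
}
$limit = 1;
echo "With $limit result(s) per page...".PHP_EOL;
for ($i = 1; $i < 6; ++$i) {
    echo "The token for page $i is ".token($limit, $i).PHP_EOL;
}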

Please test this function in your project and update the rest of us if you find a flaw or an enhancement since YouTube hasn't provided us with an easy way to do this.

女痞
#4 · 2019-02-15 15:14

Using Quihico's files above as a reference point, I had a little fun writing an enhancement to the previous poster's pageToken generator, in JS. If my assumption is correct about how the 4000s place encoding varies past N >= 98304, it should be able to construct a pageToken for a page starting with the Nth item, provided N is in [0, 4194304). It's only tested up to N = 99999, so YMMV.

Link here: https://github.com/aricearice/youtube-page-token/blob/master/index.js
