How to make looped function calls for pagination?

2019-08-08 16:28发布

问题:

I am attempting to use the Airtable API to retrieve records from my data there - specifically, a list of URLs I have in column cells.

I wrote a function, get_airtable_records, to do the API call via curl and it works - returning results as a Json object. Specifically, I am pushing the URLs to an array, $article_urls.

The only problem is, Airtable limits the return of results to "pages" of a maximum 100 records, and my data contains more than that. The API accepts parameters maxRecords and pageSize but the more important one, pageSize, is still capped at 100.

What Airtable does also return is another Json value, offset, which is used in such cases for pagination. The offset is a record ID intended to be used as an input parameter (also called offset). You can use it to denote the starting record in a subsequent additional API call. I understand this.

What I don't understand is how to modify my code to account for the possibility of needing to poll Airtable again.

In other words, we should always do a starting run from scratch, when there is no offset value. Then, if an offset value is present in returned results, we should go around again - until an offset value is not present.

Here is what I have.

  // Make get request, store result in array
  $articles = get_airtable_records($offset); // $offset won't exist at the start

  // Prepare Article URLs list as an array
  if (!isset($article_urls)) {
     $article_urls = array();
  }

  // For each URL found in Airtable
  foreach($articles['records'] as $record){
     $url = $record['fields']['Published URL'];
     // Add to our array list
     if (!empty($url)) {
        array_push($article_urls, $url);
     }
  }

  // URL list after first pass:
  echo '<pre>';
  print_r($article_urls);
  echo '</pre>';
  // May hit a max of 100

  // echo 'Offset: ' . $articles['offset'];
  // Value like "itrJYSLx0RfslI80f/recEu6TiPTPCSDxg5" may exist.
  // If so, go back to start, do get_airtable_records($offset) again and array_push
  // Until, once more there is no "offset" value at end

I am speculating that some sort of while loop will be useful... ?

A couple of things are true...

  • In the first call, there will be no originating offset value needing to be passed, since it starts from record 0.
  • But that and subsequent passes may generate an offset value, which should be used to make another pass.
  • The final call will not generate an offset value, since it will have returned the final page of exhausted results, and there is no need to start again.

回答1:

Thanks largely to @anthony's answer to a similar question here, I seem to have some working code...

  // Prepare Article URLs list as an array
  $article_urls = array();

  // Call Airtable records in pages of 100 max
  do {

        // Offset is either inherited from last page's results, or is nothing
        $offset = $articles['offset'] ?: "";

        // Make get request, store result in array
        $articles = get_airtable_records($offset);

        // For each URL found in Airtable
        foreach($articles['records'] as $record){
           $url = $record['fields']['Published url'];
           // Add to our array list
           if (!empty($url)) {
              array_push($article_urls, $url);
           }
        }

  } while(!empty($articles['offset'])); // If there's an offset value (ie. starting record of next page), do again

  // Output URL list for check
  echo '<pre>';
  print_r($article_urls);
  echo '</pre>';

Explanation seems to be:

Use a do while loop. At the start of this, set offset to be either the value inherited from the previous run, or nothing.

My get_airtable_records function was already limiting the presence or not of offset in the API call, with the following, which adds the offset query string to the URL for the next API call if one is present...

  if (!empty($offset)) {
    $q_offset = '&offset='.$offset;
  }

I have tested this and it gave me all 137 results from two pages in to my $article_urls array. I haven't tested it with any more than two pages of results.