How can I optimize my FQL to avoid Facebook timeouts?


Question:

Let's take, as an example, a simple FQL query that gets all links shared by a user's friends since yesterday:

SELECT link_id, title, url, owner, created_time
FROM link
WHERE
    created_time > strtotime('yesterday') AND
    owner IN (
        SELECT uid2 FROM friend WHERE uid1 = me()
    )
LIMIT 100

If a user has 50 friends, this will execute perfectly. But if a user has hundreds of friends, more often than not Facebook returns an error.

Options:

  1. Limit the friend select query to 50 -- Sure, that will work, but it will show the same friends every time. Unless you want an Ivy League-only feed, this isn't very helpful.
  2. Batch Queries -- Create a batch of queries using offsets and limit each to 50. Unfortunately there's no improvement here either.
  3. Loop It -- So far this is the best I've found. Loop through the same queries you built for the batch, but run them one at a time with separate fql.query API calls (see the sketch after this list). Even so, it's hit and miss.
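
A minimal sketch of that loop, assuming $facebook is an authenticated instance of the SDK's Facebook class; the 50-friend block size and the 500-friend cap are illustrative, not requirements:

$since = strtotime('yesterday');
$links = array();

// Page through friends 50 at a time, one fql.query call per page.
for ($offset = 0; $offset < 500; $offset += 50) {
    $fql = "SELECT link_id, title, url, owner, created_time
            FROM link
            WHERE created_time > $since AND owner IN (
                SELECT uid2 FROM friend WHERE uid1 = me()
                LIMIT 50 OFFSET $offset
            )
            LIMIT 100";
    try {
        $result = $facebook->api(array(
            'method' => 'fql.query',
            'query'  => $fql,
        ));
        $links = array_merge($links, $result);
    } catch (FacebookApiException $e) {
        // One slow page shouldn't kill the whole feed; log and move on.
        error_log('FQL page at offset ' . $offset . ' failed: ' . $e->getMessage());
    }
}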

How can I query Facebook appropriately to ensure successful results?

Notes:

  • I'm using the latest Facebook PHP SDK, 3.1.1
  • I've also tried raising the default cURL timeout options in base_facebook.php (see below)
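
For reference, SDK 3.1.1 keeps these options in a public static array on BaseFacebook, so they can be raised at runtime without editing the file. The default values shown are my reading of base_facebook.php; verify them against your copy:

// base_facebook.php ships with defaults along the lines of
// CURLOPT_CONNECTTIMEOUT => 10 and CURLOPT_TIMEOUT => 60.
// $CURL_OPTS is public static, so it can be overridden before any API call:
BaseFacebook::$CURL_OPTS[CURLOPT_CONNECTTIMEOUT] = 30;
BaseFacebook::$CURL_OPTS[CURLOPT_TIMEOUT]        = 120;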

Common Errors associated with Timeout:

1.

Fatal error:  Uncaught Exception: 1: An unknown error occurred thrown in /..../facebook/php-sdk/src/base_facebook.php on line 708

Line 708 is where the exception is thrown:

// results are returned, errors are thrown
if (is_array($result) && isset($result['error_code'])) { 
    throw new FacebookApiException($result);
}

2.

Fatal error: Uncaught CurlException: 52: SSL read: error:00000000:lib(0):func(0):reason(0), errno 104 thrown in /..../facebook/php-sdk/src/base_facebook.php on line 814

Answer 1:

You should loop through using limit/offset like you said, or cache the friends list up front as puffpio suggested.

You said that it still wasn't working reliably. That's because some users may have many, many links while others have only a few, and for some users you may be retrieving uncached data. I would recommend a single retry in your loop for failed queries: it's often the case that the first attempt times out and the second succeeds thanks to freshly cached data (see the sketch below).
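
A sketch of that single-retry idea, wrapped in a helper; the function name, retry count, and pause are mine, not part of the SDK:

// Hypothetical helper: run an FQL query, retrying once on failure,
// since the second attempt often hits freshly cached data.
function fql_with_retry($facebook, $fql, $retries = 1) {
    for ($attempt = 0; $attempt <= $retries; $attempt++) {
        try {
            return $facebook->api(array(
                'method' => 'fql.query',
                'query'  => $fql,
            ));
        } catch (FacebookApiException $e) {
            if ($attempt === $retries) {
                throw $e; // out of retries, surface the error
            }
            usleep(250000); // brief pause before the retry
        }
    }
}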

Finally, for posterity, I'm opening a task to optimize the link table so it is more efficient when filtered by time.



Answer 2:

Some DB engines do not optimize the IN keyword well, or at all; they may execute the IN clause once for every resulting row of your query. Can you join the link and friend tables instead of using IN with a subquery?

You may find this article interesting. (It discusses IN-clause performance problems on MySQL, and Facebook runs MySQL on the back end.)



Answer 3:

It would be better to cache the user's friends and only refresh the list occasionally. In other words, run this query:

SELECT uid2
FROM friend
WHERE uid1 = me()

Cache the resulting list of UIDs, then run:

SELECT link_id, title, url, owner, created_time
FROM link
WHERE
    created_time > strtotime('yesterday') AND
    owner IN (/*your user list here*/)
LIMIT 100

This way you are not running the inner query every time. In practice a user's friend list has a low churn rate, so you would not need to refresh it nearly as often as you fetch the shared links.
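
A sketch of that two-step flow, assuming $facebook is an authenticated SDK instance; the file-based cache, its path, and the one-day lifetime are illustrative choices:

// Step 1: fetch the friend list, caching it for a day.
$userId    = $facebook->getUser();
$cacheFile = '/tmp/friends_' . $userId . '.json';  // illustrative path

if (file_exists($cacheFile) && time() - filemtime($cacheFile) < 86400) {
    $friendIds = json_decode(file_get_contents($cacheFile), true);
} else {
    $rows = $facebook->api(array(
        'method' => 'fql.query',
        'query'  => 'SELECT uid2 FROM friend WHERE uid1 = me()',
    ));
    $friendIds = array();
    foreach ($rows as $row) {
        $friendIds[] = $row['uid2'];
    }
    file_put_contents($cacheFile, json_encode($friendIds));
}

// Step 2: inline the cached ids so the friend subquery never runs.
$since  = strtotime('yesterday');
$idList = implode(',', $friendIds);
$links  = $facebook->api(array(
    'method' => 'fql.query',
    'query'  => "SELECT link_id, title, url, owner, created_time
                 FROM link
                 WHERE created_time > $since AND owner IN ($idList)
                 LIMIT 100",
));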

Additionally, architecting it this way lets you break the second query into multiple queries over different sets of owners and then use fql.multiquery to fetch them all simultaneously, as sketched below.
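
A sketch of that fan-out, reusing the cached $friendIds from the previous snippet; the 50-owner chunk size and the 'links0', 'links1', ... query names are arbitrary:

$since   = strtotime('yesterday');
$queries = array();

// One named query per 50-friend chunk; Facebook runs the batch together.
foreach (array_chunk($friendIds, 50) as $i => $chunk) {
    $idList = implode(',', $chunk);
    $queries['links' . $i] = "SELECT link_id, title, url, owner, created_time
                              FROM link
                              WHERE created_time > $since AND owner IN ($idList)
                              LIMIT 100";
}

$results = $facebook->api(array(
    'method'  => 'fql.multiquery',
    'queries' => json_encode($queries),
));

// fql.multiquery returns one entry per named query, each with a
// 'name' and an 'fql_result_set' key.
$links = array();
foreach ($results as $resultSet) {
    $links = array_merge($links, $resultSet['fql_result_set']);
}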