Renew WordPress Feed Cache in the Background

Posted 2019-09-06 18:31

Question:

I'm looking for a way to refresh feed caches in the background.

The code below demonstrates the issue. It renews the cache every 30 seconds when the page is accessed and loaded. Because it has many URLs to fetch at once, the page becomes really slow whenever the cache needs to be rebuilt.

<?php
$urls = array(
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=w&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=n&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=b&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=el&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=tc&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=ir&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=s&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=snc&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=m&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=e&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:bagram&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:syria&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:baghdad&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:bernard_arnault&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:senkaku_islands&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:alps&output=rss'
    );

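    $feed = fetch_feed_modified($urls);
    if ( is_wp_error( $feed ) ) {
        // fetch_feed_modified() returns a WP_Error when SimplePie fails
        wp_die( $feed->get_error_message() );
    }
    foreach ($feed->get_items() as $item):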
    ?>

        <div class="item">
            <h2><a href="<?php echo $item->get_permalink(); ?>"><?php echo $item->get_title(); ?></a></h2>
            <p><?php echo $item->get_description(); ?></p>
            <p><small>Posted on <?php echo $item->get_date('j F Y | g:i a'); ?></small></p>
        </div>

    <?php endforeach; 

function fetch_feed_modified($url) {
    require_once (ABSPATH . WPINC . '/class-feed.php');

    $feed = new SimplePie();
    $feed->set_feed_url($url);
    $feed->set_cache_class('WP_Feed_Cache');
    $feed->set_file_class('WP_SimplePie_File');
    $feed->set_cache_duration(apply_filters('wp_feed_cache_transient_lifetime', 30, $url)); // set the cache timeout to 30 seconds
    do_action_ref_array( 'wp_feed_options', array( &$feed, $url ) );
    $feed->init();
    $feed->handle_content_type();

    if ( $feed->error() )
        return new WP_Error('simplepie-error', $feed->error());

    return $feed;
}   

So I'm wondering how I can modify this so that it silently renews the cache in the background when the timeout is reached. That is, the page is displayed normally from the saved cache even though the timeout has passed, and a new cache starts being built in the background after the access. This way the visitor never sees the page load slowly.

Is it possible?
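
To make the desired behaviour concrete, here is a minimal sketch of the decision I have in mind, in plain PHP. The function name sample_stale_while_revalidate() and the shape of the cache entry are hypothetical, not part of WordPress or SimplePie. It only decides what to show and whether a refresh should be queued; it does not fetch anything itself.

```php
<?php
/**
 * Hypothetical helper (not a WordPress API).
 *
 * $entry   array( 'data' => mixed, 'fetched_at' => int ) or null when nothing is cached
 * $now     current Unix timestamp
 * $max_age cache lifetime in seconds
 *
 * Returns array( data to display now (or null), whether a background refresh is needed ).
 */
function sample_stale_while_revalidate( $entry, $now, $max_age ) {
    if ( null === $entry ) {
        // Nothing cached yet: there is nothing to show, so a fetch is required.
        return array( null, true );
    }
    // A stale entry is still returned for display; it is only flagged for renewal.
    $is_stale = ( $now - $entry['fetched_at'] ) >= $max_age;
    return array( $entry['data'], $is_stale );
}

// Usage: an entry fetched 45 seconds ago with a 30-second lifetime.
$entry = array( 'data' => 'cached feed items', 'fetched_at' => 1000 );
list( $data, $refresh ) = sample_stale_while_revalidate( $entry, 1045, 30 );
// $data holds the cached items and $refresh is true, so the page can be
// rendered from $data while the renewal is handed off to a background job.
```

In WordPress the "handed off" part would have to be something that runs outside the visitor's request, which is the piece I don't know how to do.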

Answer 1:

Okay, this works.

<?php
/* Plugin Name: Sample Feed Cache Renew Crawler */

    $urls = array(
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=w&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=n&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=b&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=el&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=tc&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=ir&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=s&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=snc&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=m&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&topic=e&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:bagram&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:syria&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:baghdad&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:bernard_arnault&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:senkaku_islands&output=rss',
        'http://news.google.com/news?pz=1&cf=all&ned=us&hl=en&q=topic:alps&output=rss'
    );
    $cache_renew_interval = 30; // every thirty seconds

    // admin page
    add_action('admin_menu', 'sample_feed_cache_renew_crawler_menu');
    function sample_feed_cache_renew_crawler_menu() {
        add_options_page(
            'Sample Feed Cache Renew Crawler', 
            'Sample Feed Cache Renew Crawler', 
            'manage_options',
            'sample_feed_cache_renew_crawler', 
            'sample_feed_cache_renew_crawler_admin');
    }
    function sample_feed_cache_renew_crawler_admin() {
        global $urls, $cache_renew_interval;
        ?>
        <div class="wrap">
        <?php       

            $feed = fetch_feed_with_custom_lifetime($urls, 60*60*24 );  // lifetime for 24 hours

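            // fetch_feed_with_custom_lifetime() returns a WP_Error on failure,
            // and WP_Error has no error() method, so test with is_wp_error().
            if ( is_wp_error( $feed ) ) {
                echo '<p>' . esc_html( $feed->get_error_message() ) . '</p></div>';
                return;
            }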

            $feed = fetch_feed($urls);

            $i = 0;
            foreach ($feed->get_items() as $item):  
                if (++$i==20) break;
            ?>

                <div class="item">
                    <h2><a href="<?php echo $item->get_permalink(); ?>"><?php echo $item->get_title(); ?></a></h2>
                    <p><?php echo $item->get_description(); ?></p>
                    <p><small>Posted on <?php echo $item->get_date('j F Y | g:i a'); ?></small></p>
                </div>

            <?php endforeach;    
        ?>
        </div>      
        <?php
        wp_clear_scheduled_hook( 'sample_feed_cache_renew_crawler_event' );
        add_action('sample_feed_cache_renew_crawler_event','sample_feed_cache_renew_crawler_function');
        wp_schedule_single_event(time() + $cache_renew_interval, 'sample_feed_cache_renew_crawler_event');

}
// wp_clear_scheduled_hook( 'sample_feed_cache_renew_crawler_event' );
require_once (ABSPATH . WPINC . '/class-feed.php');
function fetch_feed_with_custom_lifetime($url, $lifetime) {
    $feed = new SimplePie();
    $feed->set_feed_url($url);
    $feed->set_cache_class('WP_Feed_Cache');
    $feed->set_file_class('WP_SimplePie_File');
    $feed->set_cache_duration(apply_filters('wp_feed_cache_transient_lifetime', $lifetime, $url)); // set the cache timeout to $lifetime seconds
    do_action_ref_array( 'wp_feed_options', array( &$feed, $url ) );
    $feed->init();
    $feed->handle_content_type();
    if ( $feed->error() ) return new WP_Error('simplepie-error', $feed->error());
    return $feed;
}   

add_action('sample_feed_cache_renew_crawler_event','sample_feed_cache_renew_crawler_function');
function sample_feed_cache_renew_crawler_function() {
    $file = __DIR__ . '/log.txt';
    $current = date('l jS \of F Y h:i:s A') . ": cache cleared" . PHP_EOL;
    file_put_contents($file, $current, FILE_APPEND);

    global $urls, $cache_renew_interval;
    fetch_feed_with_custom_lifetime($urls, 0);  // renew the cache right away
    wp_schedule_single_event(time() + $cache_renew_interval, 'sample_feed_cache_renew_crawler_event');

}

One thing that is not clear to me: even though I set the interval to 30 seconds, the function sample_feed_cache_renew_crawler_function() is not always called on time. The log file shows that it sometimes takes 2 minutes and sometimes 4 minutes, although I kept pressing the browser's reload button for longer than that.

According to the Codex (http://codex.wordpress.org/Function_Reference/wp_schedule_single_event):

Note that scheduling an event within 10 minutes of an event of the same name will be ignored, unless you pass unique values for $args to each scheduled event.

But the log file shows that the function was called at intervals of about 2 minutes, so that doesn't make sense to me.
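
A possible explanation, as far as I can tell: WP-Cron only runs when a page is requested, and spawning it is throttled by WP_CRON_LOCK_TIMEOUT (60 seconds by default), so a 30-second interval is a lower bound and not a guarantee. The 10-minute rule probably does not apply here because the running event appears to be unscheduled before its callback is called, so no event of the same name is pending when the callback schedules the next one. As an alternative to re-scheduling a single event from inside its own callback, a recurring event can be registered through the cron_schedules filter. The sketch below assumes that approach; the schedule name sample_thirty_seconds and the function sample_feed_cache_add_schedule() are hypothetical.

```php
<?php
/* A sketch only: 'sample_thirty_seconds' and the helper name are hypothetical. */

// Add a custom recurrence to the list of schedules WP-Cron knows about.
function sample_feed_cache_add_schedule( $schedules ) {
    $schedules['sample_thirty_seconds'] = array(
        'interval' => 30,                 // seconds between runs (a lower bound in practice)
        'display'  => 'Every 30 seconds',
    );
    return $schedules;
}

// The guard only lets this file load outside WordPress, e.g. for a syntax check.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'cron_schedules', 'sample_feed_cache_add_schedule' );

    // Schedule once; wp_next_scheduled() returns false when no such event is queued.
    if ( ! wp_next_scheduled( 'sample_feed_cache_renew_crawler_event' ) ) {
        wp_schedule_event( time(), 'sample_thirty_seconds', 'sample_feed_cache_renew_crawler_event' );
    }
}
```

With a recurring event like this, sample_feed_cache_renew_crawler_function() would no longer need to call wp_schedule_single_event() itself. The runs would still depend on page requests arriving, so gaps longer than 30 seconds are to be expected on a quiet site.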