Choosing a PHP caching technique: output caching i

2020-03-02 02:54发布

问题:

I've heard of two caching techniques for the PHP code:

  1. When a PHP script generates output it stores it into local files. When the script is called again it check whether the file with previous output exists and if true returns the content of this file. It's mostly done with playing around the "output buffer". Somthing like this is described in this article.

  2. Using a kind of opcode caching plugin, where the compiled PHP code is stored in memory. The most popular of this one is APC, also eAccelerator.

Now the question is whether it make any sense to use both of the techniques or just use one of them. I think that the first method is a bit complicated and time consuming in the implementation, when the second one seem to be a simple one where you just need to install the module.

I use PHP 5.3 (PHP-FPM) on Ubuntu/Debian.

BTW, are there any other methods to cache PHP code or output, which I didn't mention here? Are they worth considering?

回答1:

You should always have an opcode cache like APC. Its purpose is to speed up the parsing of your code, and will be bundled into PHP in a future version. For now, it's a simple install on any server and doesn't require you write or change any code.

However, caching opcodes doesn't do anything to speed up the actual execution of your code. Your bottlenecks are usually time spent talking to databases or reading to/from disk. Caching the output of your program avoids unnecessary resource usage and can speed up responses by orders of magnitude.

You can do output caching many different ways at many different places along your stack. The first place you can do it is in your own code, as you suggested, by buffering output, writing it to a file, and reading from that file on subsequent requests.

That still requires executing your PHP code on each request, though. You can cache output at the web server level to skip that as well. Crafting a set of mod_rewrite rules will allow Apache to serve the static files instead of the PHP code when they exist, but you'll have to regenerate the cached versions manually or with a scheduled task, since your PHP code won't be running on each request to do so.

You can also stick a proxy in front of your web server and use that to cache output. Varnish is a popular choice these days and can serve hundreds of times more request per second with caching than Apache running your PHP script on the same server. The cache is created and configured at the proxy level, so when it expires, the request passes through to your script which runs as it normally would to generate the new version of the page.



回答2:

You know, for me, optcache , filecache .. etc only use for reduce database calls. They can't speed up your code. However, they improve the page load by using cache to serve your visitors.

With me, APC is good enough for VPS or Dedicated Server when I need to cache widgets, $object to save my mySQL Server.

If I have more than 2 Servers, I like to used Memcache , they are good on using memory to cache. However it is up to you, not everyone like memcached, and not everyone like APC.

For caching whole web page, I ran a lot of wordpress, and I used APC, Memcache, Filecache on some Cache Plugins like W3Total Cache. And I see ( my own exp ): Filecache is good for caching whole website, memory cache is good for caching $object

Filecache will increase your CPU if your hard drive is slow, and Memory cache is terrible if you don't have enough memory on your VPS.

An SSD HDD will be super good speed to read / write file, but Memory is always faster. However, Human can't see what is difference between these speed. You only pick one method base on your project and your server ( RAM, HDD ) or are you on a shared web hosting?

If I am on a shared hosting, without root permission, without php.ini, I like to use phpFastCache, it a simple file cache method with set, get, stats, delete only.

In Addition, I like to use .htaccess to cache static files like images, js, css or by html headers. They will help visitors speed up your page, and save your server bandwidth.

And If you can use .htaccess to redirect to static .html cache if you cache whole page is a great thing.

In future, APC or some Optcache will be bundle into PHP version, but I am sure all the cache can't speed up your code, they use to:

  1. Reduce Database / Query calls.
  2. Improve the speed of page load by use cache to serve.
  3. Save your API Transactions ( like Bing ) or cURL request...

etc...



回答3:

A lot of times, when it comes to PHP web applications, the database is the bottleneck. As such, one of the best things you can do is to use memcached to cache results in memory. You can also use something like xhprof to profile your code, and really dial in on what's taking the most time.



回答4:

Yes, those are two different cache-techniques, and you've understood them correctly.

but beware on 1):

1.) Caching script generated output to files or proxies may render problems if content change rapidly.

2.) x-cache exists too and is easy to install on ubuntu.

regards, /t



回答5:

I don't know if this really would work, but I came across a performance problem with a PHP script that I had. I have a plain text file that stores data as a title and a URL tab separated with each record separated by a new line. My script grabs the file at each URL and saves it to its own folder.
Then I have another page that actually displays the local files (in this case, pictures) and I use a preg_replace() to change the output of each line from the remote url to a relative one so that it can be displayed by the server. My tab separated file is now over 1 MB and it takes a few SECONDS to do the preg_replace(), so I decided to look into output caching. I couldn't find anything definitive, so I figured I would try my own hand at it and here's what I came up with:

When I request the page to view stuff locally, I try to read it from a variable in a global scope. If this is empty, it might be that this application hasn't run yet and this global needs populated. If it was empty, read from an output file (plain html file that literally shows everything to output) and save the contents to the global variable and then display the output from the global.
Now, when the script runs to update the tab separated file, it updates the output file and the global variable. This way, the portion of the script that actually does the stuff that runs slowly only runs when the data is being updated.

Now I haven't tried this yet, but theoretically, this should improve my performance a lot, although it does actually still run the script, but the data would never be out of date and I should get a much better load time.

Hope this helps.