I am looking for the most performant way to arrange usage of the DataCache and DataCacheFactory for AppFabric caching calls, for between 400 and 700 cache gets per page load (and barely any puts). It seems that using a single static DataCacheFactory (or possibly a couple in a round-robin setup) is the way to go.
Do I call GetCache("cacheName") for every DataCache object request, or do I make one static DataCache when the DataCacheFactory is initialized and use that for all calls?
Do I have to handle exceptions, check for fail codes and attempt retries?
Do I have to consider contention when more than one thread tries to use the cache store and wants the same item (by key)?
Is there some kind of documentation which properly explores the design and usage of this?
Some information I have gathered so far from the forum:
http://social.msdn.microsoft.com/Forums/en-AU/velocity/thread/98d4f00d-3a1b-4d7c-88ba-384d3d5da915
"Creating the factory involves connecting to the cluster and can take some time. But once you have the factory object and the cache that you want to work with, you can simply reuse those objects to do puts and gets into the cache, and you should see much faster performance."
http://social.msdn.microsoft.com/Forums/en-US/velocity/thread/0c1d7ce2-4c1b-4c63-b525-5d8f98bb8a49
"Creating single DataCacheFactory (singleton) is more performing than creating multiple DataCacheFactory. you should not create DataCacheFactory for each call, it will have performance hit."
"Please try to encapsulate round-robin algorithm (having 3/4/5 factory instances) in your singleton and compare load-test results."
http://blogs.msdn.com/b/velocity/archive/2009/04/15/pushing-client-performance.aspx
"You can increase the number of clients to increase the cache throughput. But sometimes if you want to have smaller set of clients and increase throughput, a trick is to use multiple DataCacheFactory instances. The DataCacheFactory instance creates a connection to the servers (e..g if there are 3 servers, it will create 3 connections) and multiplexes all requests from the datacaches on to these connections. So if the put/get volume is very high, these TCP connections might be bottlenecked. So one way is to create multiple DataCacheFactory instances and then use the operations on them."
Here is what is in use so far. The property is called and, if the return value is not null, an operation is performed.
private static DataCache Cache
{
    get
    {
        if (_cacheFactory == null)
        {
            lock (Sync)
            {
                if (_cacheFactory == null)
                {
                    try
                    {
                        _cacheFactory = new DataCacheFactory();
                    }
                    catch (DataCacheException ex)
                    {
                        if (_logger != null)
                        {
                            _logger.LogError(ex.Message, ex);
                        }
                    }
                }
            }
        }

        DataCache cache = null;
        if (_cacheFactory != null)
        {
            cache = _cacheFactory.GetCache(_cacheName);
        }
        return cache;
    }
}
See this question on the Microsoft AppFabric forum: http://social.msdn.microsoft.com/Forums/en-AU/velocity/thread/e0a0c6fb-df4e-499f-a023-ba16afb6614f
Here is the answer from the forum post:
Hi. Sorry for the delayed response, but I want to say that these are great questions and will probably be useful to others.
There shouldn't be a need for more than one DataCacheFactory per thread unless you are requiring different configurations. For example, if you programmatically configure the DataCacheFactory with the DataCacheFactoryConfiguration class, then you might want to create one that has local cache enabled and another that does not. In this case, you would use different DataCacheFactory objects depending on the configuration you require for your scenario. But other than differences in configuration, you should not see a performance gain by creating multiple DataCacheFactories.
On the same subject, there is a MaxConnectionsToServer setting (either programmatic in DataCacheFactoryConfiguration or in the application configuration file as an attribute of the dataCacheClient element). This determines the number of channels per DataCacheFactory that are opened to the cache cluster. If you have high throughput requirements and also available CPU/Network bandwidth, increasing this setting to 3 or higher can increase throughput. We don't recommend increasing this without cause or to a value that is too high for your needs. You should change the value and then test your scenario to observe the results. We hope to have more official guidance on this in the future.
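The following is a minimal sketch of the programmatic form of that setting; it is an illustration, not part of the forum reply. The class name `CacheFactoryBuilder` and the host name `cachehost1` are hypothetical, and the sketch assumes the AppFabric client assemblies are referenced and a cache host is listening on the default port.

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

public static class CacheFactoryBuilder
{
    // Builds a factory that opens the given number of channels to the cluster.
    public static DataCacheFactory Create(int maxConnectionsToServer)
    {
        var config = new DataCacheFactoryConfiguration();

        // Hypothetical host name; 22233 is the default cache port.
        config.Servers = new List<DataCacheServerEndpoint>
        {
            new DataCacheServerEndpoint("cachehost1", 22233)
        };

        // Number of channels this factory opens to the cache cluster.
        config.MaxConnectionsToServer = maxConnectionsToServer;

        return new DataCacheFactory(config);
    }
}
```

Calling `CacheFactoryBuilder.Create(3)` once at startup and load-testing against the default value is probably the only reliable way to tell whether a higher value helps in a given environment.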
Once you have a DataCacheFactory, you do not need to call GetCache() multiple times to get multiple DataCache objects. Every call to GetCache() for the same cache on the same factory returns the same DataCache object. Also, once you have the DataCache object, you do not need to continue to call DataCacheFactory for it. Just store the DataCache object and continue to use it. However, do not let the DataCacheFactory object get disposed. The life of the DataCache object is tied to the DataCacheFactory object.
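One way to follow this advice is to hold both objects in static fields so that neither is recreated or disposed for the life of the process. This is a sketch rather than the forum author's code: the class name `CacheHolder` is hypothetical, the cache name "default" is an assumption, and `Lazy<T>` requires .NET 4.

```csharp
using System;
using Microsoft.ApplicationServer.Caching;

public static class CacheHolder
{
    // Assumed cache name; replace with the name of your own cache.
    private const string CacheName = "default";

    // Lazy<T> runs each initializer at most once, even under concurrent access.
    // The static field keeps the factory alive, so the DataCache stays valid.
    private static readonly Lazy<DataCacheFactory> Factory =
        new Lazy<DataCacheFactory>(() => new DataCacheFactory());

    // GetCache is called once; every later access reuses the same DataCache.
    private static readonly Lazy<DataCache> CacheInstance =
        new Lazy<DataCache>(() => Factory.Value.GetCache(CacheName));

    public static DataCache Cache
    {
        get { return CacheInstance.Value; }
    }
}
```

A caller would then write `object value = CacheHolder.Cache.Get("some-key");`. Note that `Lazy<T>` in its default mode caches an exception thrown by the initializer, so if the cluster is unreachable on first access, later accesses rethrow the same exception until the application restarts. Whether that is acceptable depends on how the application should behave when the cache is down.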
You should never have to worry about contention with Get requests. However, with Put/Add requests, there can be contention if multiple data cache clients are updating the same key at the same time. In this case, you will get an exception with an error code of ERRCA0017, RetryLater and a substatus of ES0005, KeyLatched. However, you can easily add exception handling and retry logic to attempt the update again when errors such as these occur. This can be done for RetryLater codes with various substatus values. For more information, see http://msdn.microsoft.com/en-us/library/ff637738.aspx.
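The retry logic described above might look roughly like the following. This is a sketch, not code from the forum reply: the class and method names, the attempt limit, and the fixed delay are all assumptions to be tuned for a real workload.

```csharp
using System;
using System.Threading;
using Microsoft.ApplicationServer.Caching;

public static class CacheRetry
{
    // Attempts a Put, retrying when the cluster answers with RetryLater.
    public static void PutWithRetry(DataCache cache, string key, object value,
                                    int maxAttempts, TimeSpan delay)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                cache.Put(key, value);
                return;
            }
            catch (DataCacheException ex)
            {
                // RetryLater covers several substatus values, including
                // DataCacheErrorSubStatus.KeyLatched; retry on all of them.
                bool retryable = ex.ErrorCode == DataCacheErrorCode.RetryLater;
                if (!retryable || attempt >= maxAttempts)
                {
                    throw; // Not transient, or out of attempts.
                }
                Thread.Sleep(delay);
            }
        }
    }
}
```

For example, `CacheRetry.PutWithRetry(cache, "some-key", value, 3, TimeSpan.FromMilliseconds(50));` would try the Put up to three times before letting the exception propagate.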
You can also use pessimistic locking by using the GetAndLock() and PutAndUnlock() APIs. If you use this method it is your responsibility to make sure that all cache clients use pessimistic locking. A Put() call will wipe out an object that was previously locked by GetAndLock().
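As an illustration of the pessimistic pattern (a sketch, not part of the forum reply), a read-modify-write helper could be written as follows. The class name `LockedUpdate` and the `transform` delegate are hypothetical.

```csharp
using System;
using Microsoft.ApplicationServer.Caching;

public static class LockedUpdate
{
    // Reads an item under a lock, applies a change, and writes it back.
    public static void Update(DataCache cache, string key,
                              Func<object, object> transform, TimeSpan lockTimeout)
    {
        DataCacheLockHandle handle;

        // Throws DataCacheException if another client already holds the lock.
        object current = cache.GetAndLock(key, lockTimeout, out handle);

        object updated;
        try
        {
            updated = transform(current);
        }
        catch
        {
            // Release the lock if the change could not be computed.
            cache.Unlock(key, handle);
            throw;
        }

        // Writes the new value and releases the lock in one call.
        cache.PutAndUnlock(key, updated, handle);
    }
}
```

The lock timeout bounds how long the item stays locked if a client fails before unlocking, so it should be short relative to the expected update time.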
I hope this helps. Like I said, we hope to get this type of guidance into some formal content soon. But it is better to share it here on the forum until then. Thanks!
Jason Roth
Do I call GetCache("cacheName") for every DataCache object request, or do I make one static at the time DataCache factory is initialized and use that for all calls?
I suppose the real answer is: try it both ways and see if there's a difference. But one static DataCache seems to me to make more sense than a corresponding call to GetCache for every call to Get.
That 'Pushing Client Performance' article suggests that there's a sweet spot where the number of DataCacheFactory instances gets you maximum performance beyond which the memory overhead starts working against you - it's a shame they didn't give any guidelines (or even a rule of thumb) on where this spot might be.
I haven't come across any documentation on maximising performance - I think AppFabric is still too new for these guidelines to have been shaken out yet. I did have a look in the Contents for the Pro AppFabric book, but it seems much more concerned generally with the workflow (Dublin) side of AppFabric rather than the caching (Velocity) piece.
One thing I would say though: is there any possibility for you to cache 'chunkier' objects so you can make fewer calls to Get? Could you cache collections rather than individual objects and then unpack the collections on the client? 700 cache gets per page load seems to me to be a huge number!
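To make the idea concrete, here is a rough sketch that stores everything a page needs as one dictionary under a single key, so a page load costs one Get rather than hundreds. The class name, the per-page key, and the `load` delegate are hypothetical, and string values are used only to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.ApplicationServer.Caching;

public static class PageDataCache
{
    // Returns all items for a page in one cache round trip.
    public static Dictionary<string, string> GetPageData(
        DataCache cache, string pageKey, Func<Dictionary<string, string>> load)
    {
        // One Get fetches the whole collection.
        var items = cache.Get(pageKey) as Dictionary<string, string>;
        if (items == null)
        {
            // Cache miss: build the collection once and store it as a unit.
            items = load();
            cache.Put(pageKey, items);
        }
        return items;
    }
}
```

The trade-off is that the whole collection is transferred and deserialized on every Get and is replaced as a unit on every update, so this suits data that is read together and changes rarely.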