I'm looking for a drop-in solution for caching large-ish amounts of data.
Related questions, but for different languages:
- Python Disk-Based Dictionary
- Disk-backed STL container classes?
Close question in different terms:
- Looking for a simple standalone persistant dictionary implementation in C#
I don't need (or want to pay anything for) persistence, transactions, thread safety or the like and want something that is not much more complex to use than a List<> or Dictionary<>.
If I have to write code, I'll just save everything off as files in the temp directory:
string Get(int i)
{
    // root is the cache directory; each entry is stored in a file named after its index
    return File.ReadAllText(Path.Combine(root, i.ToString()));
}
In my case the index will be an int (and the indexes should be consecutive or close enough) and the data will be a string, so I can get away with treating both as POD, and I would rather go ultra-light and do exactly that.
The usage is that I have a sequence of 3k files (as in file #1 to #3000) totaling 650MB and need to do a diff for each step in the sequence. I expect that to total about the same or a little more and I don't want to keep all that in memory (larger cases may come along where I just can't).
A number of people have suggested different solutions for my problem. However, none seem to be targeted at my little niche. The reason I'm looking at disk-backed caching is that I expect my current use to take up 1/3 to 1/2 of my available address space, and I'm worried that larger cases will just flat run out of space. I'm not worried about threading, persistence or replication. What I'm looking for is a minimal solution: a minimum of code, a minimal usage footprint, minimal in-memory overhead and minimum complexity.
I'm starting to think I'm being overly optimistic.
What you really want is a B-Tree.
That's the primary data structure that a database uses.
It's designed to enable you to efficiently swap portions of a data structure to and from disk as needed.
I don't know of any widely used, high quality standalone B-Tree implementations for C#.
However, an easy way to get one would be to use a SQL Compact database. The SQL Compact engine runs in-process, so you don't need a separate service running. It will give you a B-Tree, but without all the headaches. You can just use SQL to access the data.
Disclaimer - I am about to point you at a product that I am involved in.
I'm still working on the web site side of things, so there is not a lot of info, but Serial Killer would be a good fit for this. I have examples that use .NET serialization (and can supply them), so writing a persistent map cache for .NET serializable objects would be trivial.
Enough shameless self promotion - if interested, use the contact link on the website.
This is very similar to my question
Looking for a simple standalone persistant dictionary implementation in C#
I don't think a library that exactly fits what you want exists. Maybe it's time for a new project on GitHub.
Here is a B-Tree implementation for .net: http://bplusdotnet.sourceforge.net/
You can use the MS application block with a disk-based cache solution.
Try looking at NCache here also.
I am not affiliated with this company. I've just downloaded and tested their free express version.
I've partially ported the EhCache Java application to .NET. The distributed caching is not yet implemented, but on a single node all the original unit tests pass. It is fully open source:
http://sourceforge.net/projects/thecache/
I can create a binary drop if you need it (only the source code is available now).
I'd take the embedded DB route (SQLite, Firebird), but here are some other options:
- Berkeley DB: not your standard SQL embedded DB, but it seems to be the right choice for this kind of task, although it is not easily usable from .NET
- db4o: an OODB, very simple interface
I recommend the Caching Application block in the Enterprise Library from MS. That was recommended as well, but the link points to an article on the Data Access portion of the Enterprise Library.
Here is the link to the Caching Application Block:
http://msdn.microsoft.com/en-us/library/cc309502.aspx
And specifically, you will want to create a new backing store (if one that persists to disk is not there):
http://msdn.microsoft.com/en-us/library/cc309121.aspx
Given your recent edits to the question, I suggest that you implement the solution noted in your question as you are very unlikely to find such a naive solution wrapped up in a library for you to reuse.
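For what it's worth, here is a minimal sketch of what that file-per-entry approach could look like once wrapped up with an indexer, so that it reads roughly like a Dictionary<int, string>. The class name TempFileStore and its members are made up for illustration and are not from any library; it assumes int keys, string values, a single thread, and that nothing needs to survive past Dispose.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper (not a library type): stores each string in its own
// file, named after its integer key, inside a private temp directory.
sealed class TempFileStore : IDisposable
{
    private readonly string root;

    public TempFileStore()
    {
        // A fresh, randomly named directory under the system temp path.
        root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
    }

    private string PathFor(int i)
    {
        // Invariant culture keeps file names stable regardless of locale.
        return Path.Combine(root, i.ToString(CultureInfo.InvariantCulture));
    }

    public string this[int i]
    {
        // Reading a key that was never written throws FileNotFoundException.
        get { return File.ReadAllText(PathFor(i)); }
        set { File.WriteAllText(PathFor(i), value); }
    }

    public bool ContainsKey(int i)
    {
        return File.Exists(PathFor(i));
    }

    public void Dispose()
    {
        // No persistence wanted: remove everything when the store is disposed.
        if (Directory.Exists(root))
        {
            Directory.Delete(root, true);
        }
    }
}
```

Usage would be along the lines of creating the store in a using block, assigning store[i] = text for each step, and reading store[i] back when needed. The only in-memory state is the directory path, so the memory cost stays flat however many entries are written; the trade-off is one file system round trip per access.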