Simple, modern, robust, transparent persistence of

Posted 2020-07-03 07:21

Question:

I'm looking for a solution to transparently persist Perl data structures (not even objects, but object support would be a plus) without circular references. I don't care that much about the backend, but I'd prefer JSON. The number of objects would be relatively low (a few thousand hashrefs with about 5 keys each). By "transparent" persistence I mean that I don't want to have to commit changes to the storage backend every time I update the in-memory data structure.

Here's what the code would ideally look like:

my $ds;

...
# load the $ds data structure from 'myfile'

print $ds->{foo}->{bar};  # baz
$ds->{foo}->{bar} = 'quux';

# ... program dies, but the updated $ds has been persisted automatically in 'myfile'

# in another invocation
print $ds->{foo}->{bar};  # quux

So far I've looked at:

  • Dave Rolsky's Perl Object-Oriented Persistence compilation of modules - no updates since 2003
  • brian d foy's Mastering Perl - Chapter 14. Data Serialization - talks about DBM::Deep, a good candidate. I wish there were a clearer distinction between serialization and transparent persistence.
  • Persistent - no updates since 2000
  • SPOPS - abandoned since 2004
  • SLOOPS only has one version on CPAN, from 2005
  • Tangram - looks abandoned too
  • Tie::File::AsHash does transparent persistence, but only supports single-level hashes
  • MooseX::Storage, Storable and JSON look nice, but they're only serialization, not persistence frameworks
  • DBIx::Class, Class::DBI, Fey::ORM, ORM, Rose::DB are OO-RDBM mappers, and I'd rather not use a database backend
  • DB_File requires BerkeleyDB
  • KiokuDB seems too complex for the task

I've only found one promising module, DBM::Deep. The code is just like in the example, and you can load the data structure with

my $ds = DBM::Deep->new( "myfile.db" );

The format is binary, though. Not a big problem, since I can use JSON to export it in a human-readable format.
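
As a rough sketch of that export step, assuming DBM::Deep is installed from CPAN: its export() method returns a plain in-memory copy of the database, which JSON::PP (a core module) can then encode. The file names 'myfile.db' and 'myfile.json' are placeholders, not anything the module requires.

```perl
use strict;
use warnings;
use DBM::Deep;
use JSON::PP;

# 'myfile.db' and 'myfile.json' are placeholder file names.
my $db = DBM::Deep->new('myfile.db');
$db->{foo}{bar} = 'quux';    # DBM::Deep writes this to disk, no explicit commit

# export() returns a plain Perl copy of the whole database.
my $copy = $db->export;

# canonical() sorts the keys so repeated exports are easy to diff.
open(my $out, '>', 'myfile.json') or die "Cannot write myfile.json: $!";
print {$out} JSON::PP->new->canonical->pretty->encode($copy);
close($out) or die "Cannot close myfile.json: $!";
```

The JSON file here is only a human-readable snapshot; the binary .db file remains the live store.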

So, am I missing a module, and am I approaching the problem correctly at all?

Answer 1:

To achieve your "transparency" goal, you're going to have to either abstract it into a framework (as chambwez suggested) or use tied variables which will save themselves to disk whenever they're updated. DBM hashes use tie in this way, so DBM::Deep is probably your best bet; everything else I'm aware of requires you to explicitly tell it when to write data out and/or caches writes in the name of performance.
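
To illustrate the tie mechanism, here is a minimal sketch of a tied hash that rewrites a JSON file on every top-level update. It uses only core modules (Tie::Hash and JSON::PP). JsonBackedHash and 'tied-demo.json' are hypothetical names made up for this example, not an existing CPAN module or a required path.

```perl
use strict;
use warnings;

package JsonBackedHash;    # hypothetical class name for illustration
use Tie::Hash;             # provides Tie::ExtraHash
use JSON::PP;
our @ISA = ('Tie::ExtraHash');

# The tied object is [ $data_hashref, $file ], the layout Tie::ExtraHash expects.
sub TIEHASH {
    my ($class, $file) = @_;
    my $data = {};
    if (-s $file) {
        open(my $in, '<', $file) or die "Cannot read $file: $!";
        local $/;                      # slurp mode
        my $text = <$in>;
        close($in);
        $data = decode_json($text);
    }
    return bless [$data, $file], $class;
}

# Rewrite the whole file; simple, but costly for large structures.
sub _save {
    my ($self) = @_;
    my ($data, $file) = @$self;
    open(my $out, '>', $file) or die "Cannot write $file: $!";
    print {$out} encode_json($data);
    close($out) or die "Cannot close $file: $!";
}

sub STORE  { my $self = shift; $self->SUPER::STORE(@_); $self->_save }
sub DELETE { my $self = shift; my $old = $self->SUPER::DELETE(@_); $self->_save; return $old }
sub CLEAR  { my $self = shift; $self->SUPER::CLEAR(@_); $self->_save }

package main;

tie my %h, 'JsonBackedHash', 'tied-demo.json';
$h{bar} = 'quux';               # triggers STORE, so the file is rewritten
$h{foo} = { bar => 'baz' };     # assigning a whole value also goes through STORE
```

Note that a later `$h{foo}{bar} = 'x'` would modify the inner, untied hash and would not be saved. Catching nested writes requires tying every level, which is the part DBM::Deep handles for you.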



Answer 2:

Why not use JSON? It's rather easy (unless I misunderstood your question); all you would do is this:

use JSON;

# serialize to file
open(my $out, '>', 'myfile') or die "Cannot write myfile: $!";
print {$out} encode_json($ds);
close($out) or die "Cannot close myfile: $!";

# deserialize from file
open(my $in, '<', 'myfile') or die "Cannot read myfile: $!";
my $content = do { local $/; <$in> };   # slurp the whole file
close($in);
$ds = decode_json($content);

Another easy thing you can do is use Data::Dumper.



Answer 3:

I don't think transparent persistence is a very good idea. Suppose you have a hypothetical implementation that ties a Perl data structure to the outside world. To be transparent, every write into the structure has to be detected and the data outside updated. This is probably going to be quite expensive and will cause a lot of disk activity unless you have a sophisticated backend with fast random access. I cannot imagine updates to a JSON file being efficient.

Some options:

  • use a database backend (DBM::Deep, DB_File or KiokuDB)
  • use a key-value store as the backend (Memcached, Redis)
  • define a consistent workflow on the data and serialize/deserialize at a suitable moment
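
The last option could look roughly like the sketch below: load once at startup, save at explicit checkpoints, and use an END block as a safety net. It uses only core modules. load_ds, save_ds and 'workflow.json' are hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use JSON::PP;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# load_ds and save_ds are hypothetical helpers; 'workflow.json' is a placeholder.
sub load_ds {
    my ($file) = @_;
    return {} unless -s $file;          # start empty if nothing was saved yet
    open(my $in, '<', $file) or die "Cannot read $file: $!";
    local $/;                           # slurp mode
    my $text = <$in>;
    close($in);
    return decode_json($text);
}

sub save_ds {
    my ($file, $ds) = @_;
    # Write to a temporary file in the same directory, then rename it over
    # the target, so a crash mid-write does not leave a truncated file.
    my ($tmp_fh, $tmp_name) = tempfile(DIR => dirname($file));
    print {$tmp_fh} JSON::PP->new->utf8->canonical->encode($ds);
    close($tmp_fh) or die "Cannot close $tmp_name: $!";
    rename($tmp_name, $file) or die "Cannot rename $tmp_name to $file: $!";
}

my $file = 'workflow.json';
my $ds   = load_ds($file);

$ds->{foo}{bar} = 'quux';
save_ds($file, $ds);                    # explicit checkpoint

# END blocks run on normal exit and after die(), but not when the process
# is killed by an unhandled signal.
END { save_ds($file, $ds) if defined $ds }
```

The trade-off is that changes made after the last checkpoint are lost if the process is killed outright, which is the price of avoiding a disk write on every update.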