How can I use callbacks with Perl's Text::Docu

Posted 2019-08-04 19:07

#!/usr/local/bin/perl
use warnings;
use 5.012;
use Text::Document;
use Text::DocumentCollection;

my $c = Text::DocumentCollection->new( file => 'coll.db'  );

my $doc_one = Text::Document->new( lowercase => 0, compressed => 0 );
my $doc_two = Text::Document->new( lowercase => 0, compressed => 0 );
my $doc_three = Text::Document->new( lowercase => 0, compressed => 0 );

$doc_one->AddContent( 'foo bar biz buu muu muu' );
$doc_two->AddContent( 'foo foofoo Foo foo' );
$doc_three->AddContent( 'one two three foo foo' );

$c->Add( 'key_one', $doc_one );
$c->Add( 'key_two', $doc_two );
$c->Add( 'key_three', $doc_three );

Could someone show me a sensible, easy-to-understand example of a callback function?

#!/usr/local/bin/perl
use warnings;
use 5.012;
use Text::Document;
use Text::DocumentCollection;

my $c = Text::DocumentCollection->NewFromDB( file => 'coll.db' );

my @result = $c->EnumerateV( \&Callback, 'the rock' );
say "@result";

sub Callback {
    ...
    ...
}

# The function Callback will be called on each element of the collection as:
#  my @l = Callback( $c, $key, $doc, $rock );
# where $rock is the second argument to EnumerateV.
# Since $c is the first argument, the callback may be an instance method of Text::DocumentCollection.
# The final result is obtained by concatenating all the partial results (@l in the example above).
# If you do not want a result, simply return the empty list ().

1 answer
成全新的幸福
#2 · 2019-08-04 20:08

EnumerateV calls the callback once for every document in the collection, collects the values each call returns, and returns them as a single list. There is probably a simple, equivalent way to write this with the map function.
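
To make the map comparison concrete, here is a minimal sketch that does not use Text::DocumentCollection at all. It assumes a plain hash of term counts as a stand-in for the collection, and the names %docs, enumerate_with_map and has_twice are hypothetical, made up for this illustration. It only mimics the documented behaviour (call the callback per document, concatenate the returned lists) and is not the module's actual implementation.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the collection: key => { term => count }.
# The counts mirror the sample documents from the question.
my %docs = (
    key_one   => { foo => 1, bar => 1, biz => 1, buu => 1, muu => 2 },
    key_two   => { foo => 2, foofoo => 1, Foo => 1 },
    key_three => { one => 1, two => 1, three => 1, foo => 2 },
);

# Hypothetical helper: calls $callback once per key and flattens the
# lists it returns. Keys are sorted here only to make the output
# deterministic.
sub enumerate_with_map {
    my ($docs, $callback, $rock) = @_;
    return map { $callback->( $docs, $_, $docs->{$_}, $rock ) }
           sort keys %$docs;
}

# Callback: return the key if the term occurs at least twice.
sub has_twice {
    my ($docs, $key, $counts, $term) = @_;
    return $key if ( $counts->{$term} // 0 ) >= 2;
    return;    # empty list in list context: contributes nothing
}

my @r = enumerate_with_map( \%docs, \&has_twice, 'foo' );
print "foo: @r\n";
```

Because map evaluates its block in list context, a callback that returns the empty list simply adds nothing to the result, which matches the "return the empty list ()" advice in the documentation quoted in the question.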

In any case, here's an example callback function for your sample data:

sub document_has_twice {
    # return the document key if the term appears at least twice in the document
    my ($collection_object, $key, $document, $search_term) = @_;
    if ($document->{terms}{$search_term}
            && $document->{terms}{$search_term} >= 2) {
        return $key;
    }
    return;
}

my @r = $c->EnumerateV( \&document_has_twice, "foo");
print "These documents contain the word 'foo' at least twice: @r\n";

@r = $c->EnumerateV( \&document_has_twice, "muu");
print "These documents contain the word 'muu' at least twice: @r\n";

@r = $c->EnumerateV( \&document_has_twice, "stackoverflow");
print "These documents contain the word 'stackoverflow' at least twice: @r\n";

Output:

These documents contain the word 'foo' at least twice: key_three key_two
These documents contain the word 'muu' at least twice: key_one
These documents contain the word 'stackoverflow' at least twice: