I am working on a project involving anaphora resolution via Hobbs' algorithm. I have parsed my text using the Stanford parser, and now I would like to manipulate the nodes in order to implement my algorithm.
At the moment, I don't understand how to:
Access a node based on its POS tag (e.g. I need to start with a pronoun - how do I get all pronouns?).
Use visitors. I'm a bit of a noob at Java, but in C++ I needed to implement a Visitor functor and then work on its hooks. I could not find much for the Stanford Parser's Tree structure though. Is that jgrapht? If it is, could you point me to some code snippets?
Here's a simple example that parses a sentence and finds all of the pronouns.
This prints:
@dhg's answer works fine, but here are two other options that it might also be useful to know about:
The Tree class implements Iterable. You can iterate through all the nodes of a Tree, or, strictly, the subtrees headed by each node, in a pre-order traversal, with:

You can also get just the nodes that satisfy some (potentially quite complex) pattern by using tregex, which behaves rather like java.util.regex by allowing pattern matches over trees. You would have something like:
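As a minimal sketch of both options, assuming the Stanford parser jar is on the classpath and that the parse is already available as a Penn-style bracketed string: the class name PronounFinder, the helper sampleTree() and the example sentence are made up for illustration, and the pronoun tags PRP and PRP$ assume the Penn Treebank tag set.

```java
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;

import java.util.ArrayList;
import java.util.List;

public class PronounFinder {

    // Hypothetical example tree, built from a bracketed string instead of a live parser run.
    static Tree sampleTree() {
        return Tree.valueOf(
            "(ROOT (S (NP (PRP She)) (VP (VBD saw) (NP (PRP$ her) (NN dog))) (. .)))");
    }

    // Option 1: iterate over the tree. The for-each loop visits the subtree
    // headed by each node in pre-order; we keep preterminals tagged as pronouns.
    static List<Tree> pronounsByIteration(Tree root) {
        List<Tree> found = new ArrayList<>();
        for (Tree subtree : root) {
            String tag = subtree.value();
            if (subtree.isPreTerminal() && ("PRP".equals(tag) || "PRP$".equals(tag))) {
                found.add(subtree);
            }
        }
        return found;
    }

    // Option 2: tregex. The node description is a regex between slashes,
    // anchored so that it matches exactly PRP or PRP$.
    static List<Tree> pronounsByTregex(Tree root) {
        TregexPattern pattern = TregexPattern.compile("/^PRP\\$?$/");
        TregexMatcher matcher = pattern.matcher(root);
        List<Tree> found = new ArrayList<>();
        while (matcher.find()) {
            found.add(matcher.getMatch());
        }
        return found;
    }

    public static void main(String[] args) {
        Tree root = sampleTree();
        for (Tree pronoun : pronounsByIteration(root)) {
            System.out.println("iteration: " + pronoun);
        }
        for (Tree pronoun : pronounsByTregex(root)) {
            System.out.println("tregex:    " + pronoun);
        }
    }
}
```

Plain iteration is enough when the test is just a tag check. Tregex pays off once the pattern involves structure, such as dominance or sisterhood between nodes, which is the kind of constraint Hobbs' algorithm relies on when it walks up from a pronoun to its NP and S ancestors.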