Get certain nodes out of a Parse Tree

I am working on a project involving anaphora resolution via Hobbs algorithm. I have parsed my text using the Stanford parser, and now I would like to manipulate the nodes in order to implement my algorithm.

At the moment, I don't understand how to:

Access a node based on its POS tag (e.g. I need to start with a pronoun - how do I get all pronouns?).
Use visitors. I'm a bit of a noob of Java, but in C++ I needed to implement a Visitor functor and then work on its hooks. I could not find much for the Stanford Parser's Tree structure though. Is that jgrapht? If it is, could you provide me with some pointers at code snippets?

标签： java nlp stanford-nlp jgrapht

2条回答

神经病院院长

2楼-- · 2019-04-01 18:43

Here's a simple example that parses a sentence and finds all of the pronouns.

private static ArrayList<Tree> findPro(Tree t) {
    ArrayList<Tree> pronouns = new ArrayList<Tree>();
    if (t.label().value().equals("PRP"))
        pronouns.add(t);
    else
        for (Tree child : t.children())
            pronouns.addAll(findPro(child));
    return pronouns;
}

public static void main(String[] args) {

    LexicalizedParser parser = LexicalizedParser.loadModel();
    Tree x = parser.apply("The dog walks and he barks .");
    System.out.println(x);
    ArrayList<Tree> pronouns = findPro(x);
    System.out.println("All Pronouns: " + pronouns);

}

This prints:

    (ROOT (S (S (NP (DT The) (NN dog)) (VP (VBZ walks))) (CC and) (S (NP (PRP he)) (VP (VBZ barks))) (. .)))
    All Pronouns: [(PRP he)]

0人赞添加讨论(0) 举报

疯言疯语

3楼-- · 2019-04-01 19:03

@dhg's answer works fine, but here are two other options that it might also be useful to know about:

The Tree class implements Iterable. You can iterate through all the nodes of a Tree, or, strictly, the subtrees headed by each node, in a pre-order traversal, with:
```
for (Tree subtree : t) { 
    if (subtree.label().value().equals("PRP")) {
        pronouns.add(subtree);
    }
}
```
You can also get just nodes that satisfy some (potentially quite complex pattern) by using tregex, which behaves rather like java.util.regex by allowing pattern matches over trees. You would have something like:
```
TregexPattern tgrepPattern = TregexPattern.compile("PRP");
TregexMatcher m = tgrepPattern.matcher(t);
while (m.find()) {
    Tree subtree = m.getMatch();
    pronouns.add(subtree);
}
```

0人赞添加讨论(0) 举报

Get certain nodes out of a Parse Tree

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间