Using NLTK's StanfordParser, I can parse a sentence like this:
import os
from nltk.parse import stanford

os.environ['STANFORD_PARSER'] = r'C:\jars'
os.environ['STANFORD_MODELS'] = r'C:\jars'
os.environ['JAVAHOME'] = r'C:\ProgramData\Oracle\Java\javapath'

parser = stanford.StanfordParser(model_path=r"C:\jars\englishPCFG.ser.gz")
sentences = parser.raw_parse("bring me a red ball")
for sentence in sentences:
    print(sentence)
The result is:
Tree('ROOT', [Tree('S', [Tree('VP', [Tree('VB', ['Bring']),
Tree('NP', [Tree('DT', ['a']), Tree('NN', ['red'])]), Tree('NP',
[Tree('NN', ['ball'])])]), Tree('.', ['.'])])])
How can I use the Stanford parser to get typed dependencies in addition to the parse tree above? Something like:
- root(ROOT-0, bring-1)
- iobj(bring-1, me-2)
- det(ball-5, a-3)
- amod(ball-5, red-4)
- dobj(bring-1, ball-5)
NLTK's StanfordParser module doesn't (currently) wrap the tree-to-Stanford-Dependencies conversion code. You can use my library PyStanfordDependencies, which wraps the dependency converter.
If nltk_tree is the sentence variable from the question's code snippet, then this works:
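(Minimal sketch: the jar path below is a placeholder for wherever your stanford-parser.jar lives; get_instance() and convert_tree() are PyStanfordDependencies' documented entry points, and convert_tree() takes a Penn Treebank-style bracketed string.)

import StanfordDependencies

# Point the converter at a Stanford parser jar (placeholder path -- adjust to your setup).
sd = StanfordDependencies.get_instance(jar_filename=r'C:\jars\stanford-parser.jar')

# convert_tree() expects a bracketed Penn Treebank string, so stringify the nltk.Tree
# produced in the question's loop.
nltk_tree = sentence
dependencies = sd.convert_tree(str(nltk_tree))

# One Token per word; each records its index, word form, head index, and dependency label.
for token in dependencies:
    print(token)

Each returned Token carries head and deprel fields, which correspond to the typed relations listed in the question (root, iobj, det, amod, dobj).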