In Stanford Dependency Manual they mention "Stanford typed dependencies" and particularly the type "neg" - negation modifier. It is also available when using Stanford enhanced++ parser using the website. for example, the sentence:
"Barack Obama was not born in Hawaii"
The parser indeed find neg(born,not)
but when I'm using the stanfordnlp
python library, the only dependency parser I can get will parse the sentence as follow:
('Barack', '5', 'nsubj:pass')
('Obama', '1', 'flat')
('was', '5', 'aux:pass')
('not', '5', 'advmod')
('born', '0', 'root')
('in', '7', 'case')
('Hawaii', '5', 'obl')
and the code that generates it:
import stanfordnlp
stanfordnlp.download('en')
nlp = stanfordnlp.Pipeline()
doc = nlp("Barack Obama was not born in Hawaii")
a = doc.sentences[0]
a.print_dependencies()
Is there a way to get similar results to the enhanced dependency parser or any other Stanford parser that result in typed dependencies that will give me the negation modifier?
This outputs the following for the first sentence:
It is to note the python library stanfordnlp is not just a python wrapper for StanfordCoreNLP.
1. Difference StanfordNLP / CoreNLP
As said on the stanfordnlp Github repo:
Stanfordnlp contains a new set of neural networks models, trained on the CONLL 2018 shared task. The online parser is based on the CoreNLP 3.9.2 java library. Those are two different pipelines and sets of models, as explained here.
Your code only accesses their neural pipeline trained on CONLL 2018 data. This explains the differences you saw compared to the online version. Those are basically two different models.
What adds to the confusion I believe is that both repositories belong to the user named stanfordnlp (which is the team name). Don't be fooled between the java stanfordnlp/CoreNLP and the python stanfordnlp/stanfordnlp.
Concerning your 'neg' issue, it seems that in the python libabry stanfordnlp, they decided to consider the negation with an 'advmod' annotation altogether. At least that is what I ran into for a few example sentences.
2. Using CoreNLP via stanfordnlp package
However, you can still get access to the CoreNLP through the stanfordnlp package. It requires a few more steps, though. Citing the Github repo,
Once that is done, you can start a client, with code that can be found in the demo :
Your sentence will therefore be parsed if you specify the 'depparse' annotator (as well as the prerequisite annotators tokenize, ssplit, and pos). Reading the demo, it feels that we can only access basicDependencies. I have not managed to make Enhanced++ dependencies work via stanfordnlp.
But the negations will still appear if you use basicDependencies !
Here is the output I obtained using stanfordnlp and your example sentence. It is a DependencyGraph object, not pretty, but it is unfortunately always the case when we use the very deep CoreNLP tools. You will see that between nodes 4 and 5 ('not' and 'born'), there is and edge 'neg'.
2. Using CoreNLP via NLTK package
I will not go into details on this one, but there is also a solution to access the CoreNLP server via the NLTK library , if all else fails. It does output the negations, but requires a little more work to start the servers. Details on this page
EDIT
I figured I could also share with you the code to get the DependencyGraph into a nice list of 'dependency, argument1, argument2' in a shape similar to what stanfordnlp outputs.
It ouputs the following
I believe there is likely a discrepancy between the model which was used to generate dependencies for documentation and the one that is available online hence the difference. I would raise the issue with
stanfordnlp
library maintainers directly via GitHub issues.