How can I find the grammatical relationships of a noun phrase using Stanford Parser or Stanford CoreNLP?
I am using Stanford CoreNLP to try to find the grammatical relationships of noun phrases.
Here's an example:
Consider the sentence "The fitness room was dirty".
I was able to identify "fitness room" as my target noun phrase. Now I'm looking for a way to find that the adjective "dirty" applies to "fitness room" and not just to "room".
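Example code:
import java.util.Collection;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.GrammaticalStructure;
import edu.stanford.nlp.trees.GrammaticalStructureFactory;
import edu.stanford.nlp.trees.PennTreebankLanguagePack;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations;
import edu.stanford.nlp.trees.TreebankLanguagePack;
import edu.stanford.nlp.trees.TypedDependency;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;
import edu.stanford.nlp.util.CoreMap;

private static void doSentenceTest() {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP stanford = new StanfordCoreNLP(props);
    // Tregex pattern that matches every NP node in a parse tree
    TregexPattern npPattern = TregexPattern.compile("@NP");
    String text = "The fitness room was dirty.";
    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);
    // run all Annotators on this text
    stanford.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        Tree sentenceTree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        TregexMatcher matcher = npPattern.matcher(sentenceTree);
        while (matcher.find()) {
            // this tree should contain "The fitness room"
            Tree nounPhraseTree = matcher.getMatch();
            // Question: how do I find that "dirty" has a relationship to the nounPhraseTree?
        }
        // Output dependency tree
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(sentenceTree);
        Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();
        System.out.println("typedDependencies: " + tdl);
    }
}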
I used Stanford CoreNLP to get the parse tree of a sentence. On this tree object I was able to extract noun phrases using TregexPattern and TregexMatcher. This gives me a subtree containing the actual phrase. What I would like to do is find the modifiers of that noun phrase in the original sentence.
The typedDependencies output gives the following:
typedDependencies: [det(room-3, The-1), nn(room-3, fitness-2), nsubj(dirty-5, room-3), cop(dirty-5, was-4), root(ROOT-0, dirty-5)]
Here I can see nsubj(dirty-5, room-3), but the governed word is only "room", not the full phrase "fitness room".
I hope I'm clear enough. Any help would be appreciated.
The typed dependencies do show that the adjective "dirty" applies to "fitness room":
det(room-3, The-1)
nn(room-3, fitness-2)
nsubj(dirty-5, room-3)
cop(dirty-5, was-4)
root(ROOT-0, dirty-5)
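The 'nn' relation is the noun compound modifier, indicating that "fitness" is a modifier of "room". Combined with nsubj(dirty-5, room-3), this means "dirty" describes the compound "fitness room", whose head word is "room".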
Details on the dependency relations can be found in the Stanford Dependencies manual.
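To make this concrete, here is a minimal sketch of how the full phrase could be rebuilt from the dependencies: find the nsubj of "dirty", then collect the nn dependents of that head word. It deliberately avoids the CoreNLP API so that it runs on its own; the Dep class and the compoundNoun and subjectsOf helpers are hypothetical names invented for this illustration, and the dependency list is hard-coded from the output above. In real code you would read the same fields from each TypedDependency (relation, governor, dependent). Matching on "compound" as well as "nn" is an assumption for newer CoreNLP versions that emit Universal Dependencies labels.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

class DependencyPhraseSketch {

    /** Hypothetical stand-in for one typed dependency, e.g. nn(room-3, fitness-2). */
    static final class Dep {
        final String relation;
        final String governor;
        final int governorIndex;
        final String dependent;
        final int dependentIndex;

        Dep(String relation, String governor, int governorIndex,
            String dependent, int dependentIndex) {
            this.relation = relation;
            this.governor = governor;
            this.governorIndex = governorIndex;
            this.dependent = dependent;
            this.dependentIndex = dependentIndex;
        }
    }

    /** Rebuilds the compound noun headed by the token at headIndex, in sentence order. */
    static String compoundNoun(List<Dep> deps, String headWord, int headIndex) {
        // TreeMap keeps the words sorted by their position in the sentence.
        TreeMap<Integer, String> words = new TreeMap<>();
        words.put(headIndex, headWord);
        for (Dep d : deps) {
            boolean isCompound = d.relation.equals("nn") || d.relation.equals("compound");
            if (isCompound && d.governorIndex == headIndex) {
                words.put(d.dependentIndex, d.dependent);
            }
        }
        return String.join(" ", words.values());
    }

    /** Returns the full compound noun for every nsubj of the given predicate word. */
    static List<String> subjectsOf(List<Dep> deps, String predicate) {
        List<String> result = new ArrayList<>();
        for (Dep d : deps) {
            if (d.relation.equals("nsubj") && d.governor.equals(predicate)) {
                result.add(compoundNoun(deps, d.dependent, d.dependentIndex));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hard-coded copy of the typed dependencies printed for the example sentence.
        List<Dep> deps = Arrays.asList(
            new Dep("det", "room", 3, "The", 1),
            new Dep("nn", "room", 3, "fitness", 2),
            new Dep("nsubj", "dirty", 5, "room", 3),
            new Dep("cop", "dirty", 5, "was", 4),
            new Dep("root", "ROOT", 0, "dirty", 5));
        System.out.println(subjectsOf(deps, "dirty")); // prints [fitness room]
    }
}
```

The sketch only follows one level of nn dependents and ignores determiners, which is enough for "fitness room"; longer or nested compounds would need the lookup to be applied recursively.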