Extract subtrees conditionally from Stanford CoreNLP parse trees
I am using Stanford CoreNLP to get parse trees, and I need to extract every node with a specific tag together with its children. For example: take every S that is a child of an SBAR, store each such S as a separate subtree, and remove that S and all of its children from the original tree. I can extract the S nodes under SBAR, but I am not sure how to prune the original tree so that only those S nodes are removed. I think I need to use a filter with prune, but I am not sure how to apply them conditionally. The filter in the code below rejects every S, but I want to remove an S only if it is a child of SBAR and has been added to my tree list.
Is there also a more efficient way to do this?
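import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.trees.TreeCoreAnnotations.TreeAnnotation;
import edu.stanford.nlp.util.CoreMap;
// Older CoreNLP filter interface; recent releases use java.util.function.Predicate instead.
import edu.stanford.nlp.util.Filter;

public class test {

    public static void main(String[] args) {
        String text = "I think that he wants a big Car";
        StanfordCoreNLP pipeline = getPipeline();
        getAnnotation(pipeline, text);
    }

    private static StanfordCoreNLP getPipeline() {
        // creates a StanfordCoreNLP object, with POS tagging, lemmatization, parsing
        Properties props = new Properties();
        props.put("annotators", "tokenize, ssplit, pos, lemma, parse");
        return new StanfordCoreNLP(props);
    }

    private static void getAnnotation(StanfordCoreNLP pipeline, String sentence) {
        Annotation document = new Annotation(sentence);
        pipeline.annotate(document);
        List<CoreMap> annotations = document.get(SentencesAnnotation.class);
        List<Tree> listTrees = new ArrayList<Tree>();
        Tree tree = null;
        Tree prunedTree;
        // Rejects every node labelled S, wherever it appears in the tree.
        Filter<Tree> f = new Filter<Tree>() {
            public boolean accept(Tree t) {
                return !t.label().value().equals("S");
            }
        };
        for (CoreMap annotation : annotations) {
            // this is the parse tree of the current sentence
            tree = annotation.get(TreeAnnotation.class);
            // iterating over a Tree visits the node itself and all of its descendants
            for (Tree subtree : tree) {
                if (subtree.label().value().equals("SBAR")) {
                    for (Tree sbartrees : subtree) {
                        if (sbartrees.label().value().equals("S")) {
                            listTrees.add(sbartrees);
                        }
                    }
                }
            }
        }
        // Prunes only the last sentence's tree, and drops every S with its descendants,
        // not just the S nodes found under SBAR.
        prunedTree = tree.prune(f);
    }
}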
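The conditional pruning described above can be sketched without CoreNLP on the classpath. The following is only an illustration under stated assumptions: `Node`, `pruneSUnderSbar` and `sampleTree` are hypothetical names, `Node` is a minimal stand-in and not the CoreNLP `Tree` class, and the sample tree is a hand-written bracketing of the example sentence rather than real parser output. The idea is to copy the tree recursively, skip any S whose direct parent is an SBAR, and collect the skipped subtrees in a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; Node is a stand-in for a parse-tree node, not CoreNLP's Tree.
class SbarPruneSketch {

    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }

        // Penn-style bracketing, e.g. (NP (DT a) (NN car))
        @Override
        public String toString() {
            if (children.isEmpty()) return label;
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    // Returns a pruned copy of node. Every S that is a direct child of an SBAR
    // is left out of the copy and added, whole, to extracted. A non-leaf node
    // that loses all of its children is dropped as well (null is returned).
    static Node pruneSUnderSbar(Node node, List<Node> extracted) {
        if (node.children.isEmpty()) return new Node(node.label); // leaf (word)
        Node copy = new Node(node.label);
        for (Node child : node.children) {
            if (node.label.equals("SBAR") && child.label.equals("S")) {
                extracted.add(child); // keep the subtree, omit it from the copy
                continue;
            }
            Node kept = pruneSUnderSbar(child, extracted);
            if (kept != null) copy.children.add(kept);
        }
        return copy.children.isEmpty() ? null : copy;
    }

    // Hand-written tree for "I think that he wants a big Car" (assumed shape).
    static Node sampleTree() {
        Node inner = new Node("S",
            new Node("NP", new Node("PRP", new Node("he"))),
            new Node("VP", new Node("VBZ", new Node("wants")),
                new Node("NP", new Node("DT", new Node("a")),
                    new Node("JJ", new Node("big")),
                    new Node("NN", new Node("Car")))));
        return new Node("ROOT", new Node("S",
            new Node("NP", new Node("PRP", new Node("I"))),
            new Node("VP", new Node("VBP", new Node("think")),
                new Node("SBAR", new Node("IN", new Node("that")), inner))));
    }

    public static void main(String[] args) {
        List<Node> extracted = new ArrayList<>();
        Node pruned = pruneSUnderSbar(sampleTree(), extracted);
        System.out.println(pruned);
        // (ROOT (S (NP (PRP I)) (VP (VBP think) (SBAR (IN that)))))
        System.out.println(extracted.get(0));
        // (S (NP (PRP he)) (VP (VBZ wants) (NP (DT a) (JJ big) (NN Car))))
    }
}
```

The root S survives because only the parent's label decides whether an S is cut. An extracted S is stored whole, so any SBAR nested inside it stays in the extracted subtree. With CoreNLP itself, the same parent check could probably be placed inside the filter passed to `tree.prune`, using `t.parent(tree)` to look up a node's parent; that variant is untested here.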