Problems iterating over certain chunks in NLTK 3

Question

Hi, I'm trying to use this code with NLTK 3. I managed to fix line 6 so that it works with NLTK version 3, but the for loop still prints nothing.

import nltk
sample = """ some random text content with names and countries etc"""
sentences = nltk.sent_tokenize(sample)
tokenized_sentences = [nltk.word_tokenize(sentence) for sentence in sentences]
tagged_sentences = [nltk.pos_tag(sentence) for sentence in tokenized_sentences]
chunked_sentences = nltk.chunk.ne_chunk_sents(tagged_sentences)  # Managed to fix this to work with version 3

for i in chunked_sentences:
    if hasattr(i, 'label'):
        if i.label() == 'NE':
            print(i)

Also, when I try to debug, I see this output:

for i in chunked_sentences:
    if hasattr(i, 'label') and i.label():
        print(i.label())
S
S
S
S
S
S
S
S

So how can I test for "NE"? Something is going wrong with NLTK 3 that I'm really not able to figure out. Please help.

python nltk named-entity-recognition

rzach 30 nov. 14 at 10:02

1 answer

Michael haas · Accepted Answer · 2014-11-30T10:41:59+0000

You seem to be iterating over sentences. I assume you want to iterate over the individual nodes contained in each sentence.

It should work like this:

for sentence in chunked_sentences:
    # each child is either a (word, tag) tuple or a labelled subtree
    for token in sentence:
        if hasattr(token, 'label') and token.label() == 'NE':
            print(token)

Edit: For future reference, what tipped me off that you are iterating over sentences is simply that the root node of a sentence tree is usually labelled "S".
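
To make the nested loop easier to try out without NLTK's trained models, here is a minimal sketch on a hand-built tree. The helper name named_entities and the sample sentence are made up for illustration, and the small Tree fallback is only a stand-in that mimics the two features of nltk.Tree used here (it is a list of children and has a label() method):

```python
try:
    from nltk import Tree  # the real class, if NLTK is installed
except ImportError:
    class Tree(list):
        """Minimal stand-in for nltk.Tree, for illustration only."""
        def __init__(self, label, children):
            list.__init__(self, children)
            self._label = label

        def label(self):
            return self._label


def named_entities(chunked_sentences, label="NE"):
    """Yield each direct child subtree of a sentence tree whose label matches."""
    for sentence in chunked_sentences:   # each item is a root tree labelled "S"
        for node in sentence:            # children: (word, tag) tuples or subtrees
            # plain tuples have no label(), so only subtrees pass this test
            if hasattr(node, "label") and node.label() == label:
                yield node


# Hand-built example shaped like binary NE chunker output (assumed data)
sentence = Tree("S", [
    Tree("NE", [("Alice", "NNP")]),
    ("visited", "VBD"),
    Tree("NE", [("France", "NNP")]),
    (".", "."),
])

found = list(named_entities([sentence]))
names = [" ".join(word for word, tag in ne) for ne in found]
print(names)
```

One more thing to check if the inner loop still prints nothing: as far as I know, the label "NE" is only produced when the chunker is called with binary=True, for example nltk.ne_chunk_sents(tagged_sentences, binary=True). With the default binary=False the subtrees carry type labels such as PERSON, GPE or ORGANIZATION, so the comparison would have to test for those instead.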
