Identifying the context of a word in a sentence

Question

I have created a classifier that assigns a class to the nouns, adjectives, and named entities in a given sentence. I used a large Wikipedia dataset for the classification.

For example:

Where was Abraham Lincoln born?

The classifier returns results in the form word - class:

Where - question
Abraham Lincoln - man, film, book (because the classifier finds Abraham Lincoln in all of these categories)
born - time

When was Titanic released?

When - question
Titanic - Song, Movie, Car, Game (Titanic is classified in all of these categories).

Is there a way to define the exact context for a word?

Please note:

Word sense disambiguation would not help here, because the sentence may not contain enough nearby words to disambiguate with.
The Lesk algorithm with WordNet synsets doesn't help either. For the word Bank, for example, the Lesk algorithm behaves like this:

======== TESTING simple_lesk ===========

TESTING simple_lesk () ...

Context: I went to the bank to deposit money

Sense: Synset ('depository_financial_institution.n.01')

Definition: A financial institution that accepts deposits and channels money into lending activities.

TESTING simple_lesk () with POS ...

Context: the riverbank was full of dead fish

Sense: Synset ('bank.n.01')

Definition: Sloping ground (especially slope around a body of water)

Here, for the word Bank, it suggests a financial institution and sloping land. In my case, however, I already get predictions of this kind: Titanic, for example, could be a movie or a game.
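To make the limitation concrete, here is a minimal sketch of the gloss-overlap idea that Lesk-style methods build on. It uses the two definitions quoted above; the stopword list and the function names are illustrative assumptions, and this is not pywsd's implementation (which, as far as I know, uses richer signatures than the bare definition).

```python
import re

# Illustrative stopword list (an assumption, not taken from any library).
STOPWORDS = {"a", "an", "the", "i", "to", "of", "was", "that", "and", "into"}

def content_words(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def overlap_scores(context, glosses):
    """Map each sense to the number of content words its gloss shares with the context."""
    ctx = content_words(context)
    return {sense: len(ctx & content_words(gloss)) for sense, gloss in glosses.items()}

# The two definitions from the output quoted above.
GLOSSES = {
    "depository_financial_institution.n.01":
        "A financial institution that accepts deposits and channels money "
        "into lending activities.",
    "bank.n.01":
        "Sloping ground (especially slope around a body of water)",
}

scores_money = overlap_scores("I went to the bank to deposit money", GLOSSES)
scores_river = overlap_scores("the riverbank was full of dead fish", GLOSSES)
print(scores_money)
print(scores_river)
```

In this sketch the first sentence shares a single word ("money") with the financial gloss, and the second shares none with either gloss, so plain overlap gives very little evidence when the context is as short as a question.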

I want to know whether there is a different approach, apart from the Lesk algorithm, the baseline algorithm, and traditional word sense disambiguation, that can help me determine which class is right for a particular keyword such as Titanic.


semantics nlp nltk data-mining

user123 14 nov. '14 at 15:00


1 answer

alvas · Answer 1 · 2014-11-14T22:12:40+0000

Thanks for using the pywsd examples. As far as WSD goes, there are many other variants, and I am coding them myself in my spare time. So if you want to see it improve, please join me in coding the open source tool =)

Meanwhile, you will find the following technologies more relevant to your task:

Knowledge base population ( http://www.nist.gov/tac/2014/KBP/ ), where tokens or text segments are assigned an entity and the task is to link them, or to solve a simplified question answering task.
Knowledge representation ( http://groups.csail.mit.edu/medg/ftp/psz/k-rep.html )
Knowledge extraction ( https://en.wikipedia.org/wiki/Knowledge_extraction )

The above technologies usually include several subtasks, for example:

Wikification ( http://nlp.cs.rpi.edu/kbp/2014/elreading.html )
Entity linking
Slot filling ( http://surdeanu.info/kbp2014/def.php )

Basically, you are asking for a tool that is an NP-complete AI system for language and text processing, so I don't think such a tool exists yet. Perhaps IBM Watson is one.

If you're looking for a field to look into, the field is out there; if you're looking for tools, wikification tools are probably the closest to what you need ( http://nlp.cs.rpi.edu/paper/WikificationProposal.pdf ).
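As a rough illustration of what an entity linking or wikification step does with the Titanic example, here is a toy sketch. Everything in it is hypothetical: the `Candidate` class, the keyword sets, and the prior values are made up for illustration and do not come from any real knowledge base or tool. The idea is to rank the candidate entries by how well they match the context and to fall back on a popularity prior when the context is too short to decide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    category: str         # class predicted by the classifier (e.g. "Movie")
    keywords: frozenset   # hypothetical words associated with the entry
    prior: float          # hypothetical popularity prior for the mention

def link(context_words, candidates):
    """Pick the candidate with the largest keyword overlap; break ties by prior."""
    return max(candidates,
               key=lambda c: (len(c.keywords & context_words), c.prior))

# Hypothetical candidate entries for the mention "Titanic".
TITANIC = [
    Candidate("Movie", frozenset({"released", "film", "director", "cast"}), 0.6),
    Candidate("Song",  frozenset({"released", "album", "singer"}), 0.1),
    Candidate("Game",  frozenset({"released", "play", "platform"}), 0.1),
    Candidate("Car",   frozenset({"engine", "model", "drive"}), 0.05),
]

# "When was Titanic released?" gives almost no context, so the prior decides.
short_question = link({"when", "was", "released"}, TITANIC)

# A question with one more cue word lets the context override the prior.
longer_question = link({"which", "platform", "was", "released", "on"}, TITANIC)

print(short_question.category, longer_question.category)
```

Real systems estimate such priors and context features from data such as Wikipedia link statistics; the numbers here only show the shape of the decision.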
