Determine whether a word is a subject or an object pronoun based on the context of the sentence.

Ideally this would use regex, in Python. I am making a simple chatbot and am currently having trouble answering phrases like "I love you" correctly (the grammar handler returns "You love I" when it should return "You love me").

Also, if you could think of some good phrases to throw at this grammar handler, that would be great. I would love the test data.

If there is a good list of transitive verbs out there (something like a "Top 100"), it may be acceptable to use that and special-case "transitive verb + you".



2 answers


What you want is a syntactic parser. This can be done with a rule-based system, as described by @Dr. Cameleon, or statistically. There are many implementations, one of which is the Stanford Parser. They will usually tell you the syntactic role of a word (for example, the subject in "You are here" or the object in "She is like you"). How you use this information to turn statements into questions is a whole different can of worms. For English, you can get a pretty simple rule-based system to work just fine.
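To illustrate what "a pretty simple rule-based system" might look like, here is a minimal sketch that guesses the role of "you" from its position next to a known transitive verb. The verb set and the function name `role_of_you` are hypothetical, and the default-to-subject rule is a crude assumption, not what a real parser does.

```python
import re

# Hypothetical, deliberately tiny verb list; a real system would need far more.
TRANSITIVE_VERBS = {"love", "loves", "like", "likes", "see", "sees", "hate", "hates"}

def role_of_you(sentence):
    """Return 'subject', 'object', or None for the first 'you' in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    if "you" not in words:
        return None
    you_index = words.index("you")
    # Rule: "you" directly after a known transitive verb is treated as the object.
    if you_index > 0 and words[you_index - 1] in TRANSITIVE_VERBS:
        return "object"
    # Otherwise fall back to "subject" (a crude default, easily fooled).
    return "subject"

for phrase in ["I love you", "You love me", "You are here"]:
    print(phrase, "->", role_of_you(phrase))
```

Phrases such as "I gave you the ball" or "Do you love me?" would make reasonable extra test data, since they break the simple adjacency rule.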





Well, what you are trying to implement is definitely very difficult.

Logic

As a start, I would first look at the grammar rules.

Main sentence structure:

  • SUBJECT + TRANSITIVE VERB + OBJECT
  • SUBJECT + INTRANSITIVE VERB

(Of course, we could also talk about formats like "Subject + Verb + Indirect Object + Direct Object", for example "I give you the ball", but that would get too complicated for now.)

Obviously this schema is VERY simplified, but for now let's stick with it.

Then let's assume (another oversimplification) that each part is a single word.

So basically you have the following sentence schema:

WORD WORD WORD

      

which can usually be matched using a regex like:

(\w+)\s+(\w+)(?:\s+(\w+))?

Explanation:

(\w+)            # first word (= subject)
\s+              # one or more whitespace characters
(\w+)            # second word (= verb)
(?:\s+(\w+))?    # optional: whitespace plus a third word (= object, if the verb is transitive)
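As a rough sketch of how this pattern could be used from Python, the snippet below applies an anchored variant of it with the standard `re` module. The anchors, the optional trailing punctuation, and the helper name `split_sentence` are my own additions for illustration; the optional group wraps both the whitespace and the third word so that two-word sentences also match.

```python
import re

# Anchored variant of the three-word pattern; the object (third word) is optional.
SENTENCE_RE = re.compile(r"^\s*(\w+)\s+(\w+)(?:\s+(\w+))?\s*[.!?]?\s*$")

def split_sentence(sentence):
    """Return (subject, verb, object_or_None), or None if the schema does not fit."""
    match = SENTENCE_RE.match(sentence)
    if match is None:
        return None
    return match.groups()

print(split_sentence("I love you"))           # three-word sentence
print(split_sentence("You run"))              # two-word sentence, no object
print(split_sentence("I give you the ball"))  # too long for this schema
```

Anything longer than three words is rejected here rather than half-matched, which keeps the oversimplified schema honest about what it cannot handle.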

      


Now, obviously, to formulate sentences like "You love me" rather than "You love I", your algorithm must also "understand" that:

  • The third part of the sentence has the role of Object
  • Since "I" is a personal pronoun in the nominative case (used only as a subject), we must use its accusative form (used as an object); for this purpose you may also need personal pronoun tables, for example:
  • I - my - me
  • You - your - you
  • He - his - him
  • etc...
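The pronoun-table idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: the table contents, the first/second person swap, and the function name `mirror` are hypothetical, and verb agreement ("I am" vs. "you are") is not handled at all.

```python
# Hypothetical pronoun table: (nominative, possessive, accusative).
PRONOUNS = [
    ("i", "my", "me"),
    ("you", "your", "you"),
    ("he", "his", "him"),
    ("she", "her", "her"),
    ("we", "our", "us"),
    ("they", "their", "them"),
]

SUBJECT_TO_OBJECT = {nom: acc for nom, _, acc in PRONOUNS}
OBJECT_TO_SUBJECT = {acc: nom for nom, _, acc in PRONOUNS}

# The chatbot answers from its own point of view, so first and second person swap.
SWAP_PERSON = {"i": "you", "you": "i"}

def mirror(subject, verb, obj):
    """Swap I/you and put each pronoun in the case its position requires."""
    new_subject = SWAP_PERSON.get(subject.lower(), subject.lower())
    # Normalise the object to its nominative form, swap person, then decline it again.
    obj_nominative = OBJECT_TO_SUBJECT.get(obj.lower(), obj.lower())
    obj_swapped = SWAP_PERSON.get(obj_nominative, obj_nominative)
    new_object = SUBJECT_TO_OBJECT.get(obj_swapped, obj_swapped)
    return f"{new_subject.capitalize()} {verb} {new_object}"

print(mirror("I", "love", "you"))
print(mirror("You", "love", "me"))
```

Because the object is declined by position rather than copied from the subject table, "I love you" comes back with "me" as the object instead of "I".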

Just a few ideas ... (purely out of my enthusiasm for linguistics :-))


Data

As for the word lists you are interested in, just a few examples:
