NLG - creating text descriptions with SimpleNLG

I am trying to generate product descriptions using NLG. For example, if I list the properties of a product (such as a mobile phone), like its OS, RAM, processor, display, battery, etc., it should output a readable description of the phone. I see there are some paid services (Quill, Wordsmith, etc.) that do the same thing. Then I came across an open source Java API for NLG, SimpleNLG. I can see how to create sentences by specifying the phrases and features of a sentence (such as tense, interrogative form, etc.), but I can't see any way to create a description from a set of properties.

Does anyone know how to create a text description from a list of words using SimpleNLG?

Are there any other tools / frameworks / APIs to accomplish this task (not limited to Java)?

1 answer


SimpleNLG is first and foremost a surface realiser. It requires well-structured input, but can then perform tasks such as changing the tense of a sentence. An explanation of the types of tasks a realiser can perform can be found at the link above.

Generating text like what you are describing will require additional components to handle document planning and microplanning. The exact boundaries between these components are blurry, but broadly speaking, document planning defines what you want to say, and the microplanner then performs tasks such as referring expression generation (choosing whether to say "it" rather than "the mobile phone") and aggregation, which is the merging of sentences. SimpleNLG has some support for aggregation.

It's also worth noting that this three-stage process (document planning, microplanning, surface realisation) isn't the only way to do NLG; it's just a common one.

There is no magic solution that I know of for taking information from an arbitrary domain and producing readable, meaningful text. In your mobile phone example, it would be trivial to string the properties together and produce something like:

The iPhone 7 has iOS11, 2GB of RAM, a 1960mAh Li-ion battery, and a retail price of $649 for the 32GB model.
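As a minimal sketch of how little is involved in producing a sentence like that, plain string interpolation is enough. The field names in the `phone` dictionary and the `one_sentence` helper are made up for illustration and are not part of SimpleNLG or any other library:

```python
# Hypothetical product record; the keys are illustrative assumptions.
phone = {
    "name": "iPhone 7",
    "os": "iOS11",
    "ram": "2GB",
    "battery": "1960mAh Li-ion",
    "price": "$649",
    "model": "32GB",
}


def one_sentence(p):
    """Interpolate every property into a single fixed sentence."""
    return (
        f"The {p['name']} has {p['os']}, {p['ram']} of RAM, "
        f"a {p['battery']} battery, and a retail price of {p['price']} "
        f"for the {p['model']} model."
    )


print(one_sentence(phone))
```

Every property is attached to the same verb ("has") and everything is forced into one sentence, regardless of how many properties there are.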



But that is just simple string concatenation or interpolation from your data. It doesn't take nuances into account, such as whether it would be better to say:

The iPhone 7 runs iOS11, has 2GB of RAM, and is powered by a 1960mAh Li-ion battery. It costs $649 at retail for the 32GB model.

In this second example, I adjusted the verbs (and therefore the noun phrases), used the referring expression "it", and split the long sentence in two (with some further changes due to the split). Making these changes requires knowledge (and therefore computational rules) of the words and how they are used in the domain. It gets nontrivial very quickly.

If your requirements are as simple as 5 or 6 pieces of information about a phone, you could probably do it pretty well without any NLG software: just create some kind of template and make sure all your data makes sense when slotted in. Once you go beyond mobile phones, however, for example to describing cars, you will need to do the same work again for the new domain.
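To make those three changes concrete, here is a rough sketch in plain Python, not SimpleNLG. The `Fact` class, `join_list`, and `describe` are hypothetical names, and the verb chosen for each property as well as the grouping of facts into sentences are supplied by hand, which is exactly the domain knowledge a real microplanner would have to encode as rules:

```python
from dataclasses import dataclass


@dataclass
class Fact:
    verb: str   # domain-specific verb phrase, chosen by hand
    value: str  # noun phrase that completes the verb


def join_list(items):
    """Aggregate phrases into one coordinated list."""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + ", and " + items[-1]


def describe(name, sentence_groups):
    """Build one sentence per group; use "It" after the first mention."""
    sentences = []
    for index, facts in enumerate(sentence_groups):
        subject = name if index == 0 else "It"
        predicates = [f"{fact.verb} {fact.value}" for fact in facts]
        sentences.append(f"{subject} {join_list(predicates)}.")
    return " ".join(sentences)


phone = [
    [
        Fact("runs", "iOS11"),
        Fact("has", "2GB of RAM"),
        Fact("is powered by", "a 1960mAh Li-ion battery"),
    ],
    [Fact("costs", "$649 at retail for the 32GB model")],
]

print(describe("The iPhone 7", phone))
```

Even this toy version hard-codes the pronoun, the sentence split, and the verb for each property; none of it carries over to another kind of product without being rewritten.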
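A template approach along those lines might look like the following sketch, which uses only `string.Template` from the Python standard library. The `TEMPLATES` table, the `render` function, and the field names are assumptions made for the example; the car template is there only to show that each new domain needs its own hand-written entry:

```python
from string import Template

# One hand-written template per domain (illustrative wording and fields).
TEMPLATES = {
    "phone": Template(
        "The $name runs $os, has $ram of RAM, "
        "and is powered by a $battery battery."
    ),
    "car": Template("The $name has a $engine engine and seats $seats."),
}


def render(domain, data):
    """Fill the domain's template, rejecting empty or missing fields."""
    template = TEMPLATES[domain]
    empty = [key for key, value in data.items()
             if value is None or str(value).strip() == ""]
    if empty:
        raise ValueError(f"empty values for: {', '.join(sorted(empty))}")
    try:
        return template.substitute(data)
    except KeyError as exc:
        raise ValueError(
            f"missing field for {domain!r} template: {exc.args[0]}"
        ) from exc


print(render("phone", {
    "name": "iPhone 7",
    "os": "iOS11",
    "ram": "2GB",
    "battery": "1960mAh Li-ion",
}))
```

The checks only catch empty or missing fields. Whether a value actually reads well in its slot (units, articles, plurals) still has to be verified per domain.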

It would be helpful to have a look at the blog by Ehud Reiter (the original author of SimpleNLG). There are also papers such as Albert Gatt's "Survey of the State of the Art in Natural Language Generation: Core tasks, applications and evaluation". Although the latter is a bit dense if you're only doing a little programming, it does provide an overview of what NLG is, what it can do, and what its current limitations are.
