Natural Language Generation
Natural language generation (NLG) is the natural language processing task of generating natural language from a machine representation system such as a knowledge base or a logical form. Psycholinguists prefer the term language production when such formal representations are interpreted as models for mental representations.
NLG may be viewed as the opposite of natural language understanding: whereas in natural language understanding the system needs to disambiguate the input sentence to produce the machine representation language, in NLG the system needs to make decisions about how to put a concept into words.
Generating text can be as simple as keeping a list of canned text that is copied and pasted, possibly linked with some glue text. The results may be satisfactory in simple domains such as horoscope machines or generators of personalised business letters. However, a sophisticated NLG system needs to include stages of planning and merging of information so that the generated text looks natural and does not become repetitive. The typical stages of natural language generation, as proposed by Dale and Reiter, are:
- Content determination: Deciding what information to mention in the text.
- Document structuring: Overall organisation of the information to convey.
- Aggregation: Merging of similar sentences to improve readability and naturalness.
- Lexical choice: Choosing the words that express the concepts.
- Referring expression generation: Creating referring expressions that identify objects and regions.
- Realisation: Creating the actual text, which should be correct according to the rules of syntax, morphology, and orthography.
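The stages above can be sketched as a minimal template-based pipeline. All field names, templates, and function names below are illustrative assumptions, not part of any specific NLG system; a real system would use far richer planning and a grammar-driven realiser.

```python
# Minimal sketch of a template-based NLG pipeline following the
# Dale & Reiter stages (content determination -> aggregation -> realisation).
# Data fields and templates are invented for illustration.

def content_determination(record):
    # Decide what to mention: here, simply drop empty fields.
    return {k: v for k, v in record.items() if v is not None}

def aggregation(facts):
    # Merge related facts into messages to avoid repetitive sentences.
    messages = []
    if "city" in facts and "temp_c" in facts:
        messages.append(("weather", facts["city"], facts["temp_c"]))
    if "wind_kph" in facts:
        messages.append(("wind", facts["wind_kph"]))
    return messages

def realisation(messages):
    # Map each message onto a syntactically well-formed sentence template.
    templates = {
        "weather": "In {0}, the temperature is {1} °C.",
        "wind": "Wind speed is {0} km/h.",
    }
    return " ".join(templates[m[0]].format(*m[1:]) for m in messages)

record = {"city": "Aberdeen", "temp_c": 12, "wind_kph": 30, "humidity": None}
print(realisation(aggregation(content_determination(record))))
# → In Aberdeen, the temperature is 12 °C. Wind speed is 30 km/h.
```

Even this toy version shows why the stages are separated: the same content-selection and aggregation logic can be reused with different templates, while the realiser alone is responsible for syntax and orthography.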
There are three basic techniques for evaluating NLG systems:
- Task-based (extrinsic) evaluation: give the generated text to a person, and assess how well it helps them perform a task (or otherwise achieves its communicative goal).
- Human ratings: give the generated text to a person, and ask them to rate the quality and usefulness of the text.
- Metrics: compare generated texts to texts written by people from the same input data, using an automatic metric such as BLEU.
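To make the metric-based approach concrete, here is a simplified slice of BLEU: modified unigram precision, where each candidate word's count is clipped by its count in the reference. This is only one component of full BLEU (which combines higher-order n-grams and a brevity penalty); the function name and example sentences are illustrative.

```python
from collections import Counter

def unigram_precision(candidate, reference):
    # Modified unigram precision (the BLEU-1 numerator/denominator):
    # clip each candidate word's count by its count in the reference,
    # so repeating a reference word cannot inflate the score.
    cand = candidate.split()
    ref_counts = Counter(reference.split())
    clipped = sum(min(c, ref_counts[w]) for w, c in Counter(cand).items())
    return clipped / len(cand)

ref = "the cat is on the mat"
print(unigram_precision("the the the the the the the", ref))  # → 2/7 ≈ 0.286
print(unigram_precision("the cat sat on the mat", ref))       # → 5/6 ≈ 0.833
```

The first example shows why clipping matters: a candidate consisting only of "the" scores low, because the reference contains "the" just twice.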
Summer School on Natural Language Generation, Summarisation, and Dialogue Systems
- Introduction to NLG: Ehud Reiter (University of Aberdeen, ARRIA NLG)
  http://nlgsummer.github.io/slides/Ehud_Reiter-Intro_to_NLG.pdf
- Content Determination: Gerard Casamayor (Universitat Pompeu Fabra)
  http://nlgsummer.github.io/slides/Gerard_Casamayor-Content_Determination.pdf
- Micro-planning: Albert Gatt (University of Malta)
  http://nlgsummer.github.io/slides/Albert_Gatt-Microplanning.pdf
- Surface Realisation: Albert Gatt (University of Malta)
  http://nlgsummer.github.io/slides/Albert_Gatt-Realisation.pdf
- Introduction to Summarisation: Advaith Siddharthan
  http://nlgsummer.github.io/slides/Advaith_Siddharthan-Introduction_to_Summarisation.pdf
- Introduction to Dialogue Systems: Paul Piwek (Open University)
  http://nlgsummer.github.io/slides/Paul_Piwek-Introduction_to_Dialogue_Systems.pdf
- Learning to Generate: Concept-to-Text Generation Using Machine Learning: Yannis Konstas (University of Edinburgh)
  http://nlgsummer.github.io/slides/Ioannis_Konstas-Learning_to_generate.pdf
- Evaluation: Ehud Reiter (University of Aberdeen, ARRIA NLG)
  http://nlgsummer.github.io/slides/Ehud_Reiter-NLG_evaluation.pdf
- Readability: Thomas François (Université catholique de Louvain)
  http://nlgsummer.github.io/slides/Thomas_Francois-Readability.pdf
- Cognitive Modelling: The Case of Reference: Kees van Deemter (University of Aberdeen)
  http://nlgsummer.github.io/slides/Kees_van_Deemter-Cognitive_Modelling.pdf
- The New Science of Information Delivery: Robert Dale (ARRIA NLG)
  http://nlgsummer.github.io/slides/Robert_Dale-The_Science_of_Information_Delivery.pdf
Related links
- https://en.wikipedia.org/wiki/Natural_language_generation