NLG (Natural Language Generation), a subfield of Artificial Intelligence, is a hot topic in technology news today. We hear a lot about AI that could soon replace writers and journalists, beginning the era of machine creativity. But what’s all this fuss about? In this article, we explain what NLG really is and show that it can bring a lot of benefits to businesses and consumers.
In a nutshell, NLG is a subfield of NLP (Natural Language Processing) that studies methods for automatically transforming structured data into human-readable text. In practice, there are two major types of NLG applications: template-based NLG and advanced NLG.
Template-based NLG is the simplest solution: it uses templates with canned text and placeholders into which data is inserted. Such systems rely heavily on hard-coded rules, which makes them less flexible than advanced NLG. Since template-based NLG tools have a limited number of templates and require special data representations, they cannot be easily reused across different projects and business use cases.
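The template-based approach can be illustrated with a minimal Python sketch. The template text, the field names, and the render_game_report function are hypothetical and invented for this example; they are not taken from any particular NLG product.

```python
from string import Template

# Hypothetical canned text; each $placeholder names a field of the input record.
GAME_TEMPLATE = Template("$winner beat $loser $winner_score-$loser_score on $day.")


def render_game_report(record):
    """Fill the canned template with the values of one structured record."""
    # substitute() raises KeyError when the record lacks a placeholder field,
    # which shows how strictly the template dictates the data representation.
    return GAME_TEMPLATE.substitute(record)


# Hypothetical structured input.
record = {
    "winner": "Lions",
    "loser": "Bears",
    "winner_score": 24,
    "loser_score": 17,
    "day": "Sunday",
}

print(render_game_report(record))  # Lions beat Bears 24-17 on Sunday.
```

Every sentence this sketch can produce has the same shape, and a record with different field names would need a new template. That is the inflexibility described above.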
Advanced NLG tools are more flexible thanks to the use of supervised and unsupervised Machine Learning (ML). Rather than tying structured data down to the Procrustean bed of templates, advanced NLG uses neural networks that learn morphological, lexical, and grammatical patterns from large corpora of written language. The soft probabilistic methods used in advanced NLG algorithms make it possible to predict the likelihood of one word appearing after another and to correct language errors, such as misspellings. The ML algorithms used in advanced NLG are also better at dealing with new words and expressions not included in the original training samples.
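The idea of predicting the likelihood of one word appearing after another can be sketched with a simple bigram count model. This is a deliberate simplification: it uses word-pair frequencies rather than the neural networks mentioned above, and the toy corpus and function names are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probability(counts, prev, nxt):
    """Estimate P(nxt | prev) as a relative frequency; 0.0 if prev is unseen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0


def most_likely_next(counts, prev):
    """Return the most frequent follower of prev, or None if prev is unseen."""
    followers = counts.get(prev)
    return followers.most_common(1)[0][0] if followers else None


# Toy corpus, invented for this sketch.
corpus = [
    "the team won the game",
    "the team lost the match",
    "the team won the title",
]

model = train_bigrams(corpus)
print(next_word_probability(model, "team", "won"))  # 2 of 3 followers of "team"
print(most_likely_next(model, "team"))              # won
```

A model this small returns nothing useful for words it has never seen, which is one reason practical systems are trained on large corpora and use richer models.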
Modern NLG service providers such as Narrative Science and Automated Insights prefer advanced NLG methods because they make it possible to build rich data-driven models that produce intelligent insights from data. These algorithms are much more skillful at making the right word choices and writing narratives that reflect the intentions and business needs of NLG users. As an added bonus, advanced NLG models can preprocess and analyze data, which makes them not just translators of structured data into text, but automatic analysts able to provide actionable insights.
Although NLG methods have been used since the 1970s, they gained real momentum only recently, thanks to the AI/ML revolution. Today, many startups offer cloud-based NLG services to businesses. NLG is also gaining traction in mass media and journalism. Major American newspapers are already experimenting with automatic storytelling. For example, in 2016 the Washington Post unveiled its automatic storytelling AI named Heliograf. Heliograf was used in the coverage of the Rio Olympics and the US presidential election in 2016.
Leveraging data mining techniques and ML models, the machine reporter can convert structured statistical data, diagrams, graphs, weather forecasts, and other data-rich content into descriptive reports that sound as though they were written by professional reporters. But isn’t this dangerous for journalism as a profession? Proponents of automatic storytellers say that they actually free up time for reporters to add analysis and real insights to stories rather than spending countless hours publishing news and descriptive reports[i].
NLG tools may be used in other innovative ways as well:
The benefits of NLG, however, go beyond journalism. There is a growing demand for NLG services among major companies. For example, Quill, an NLG system developed by Narrative Science, is used by companies such as Deloitte, Groupon, and Credit Suisse[iv]. These companies opt for NLG solutions for a reason.
The growing acceptance of NLG among businesses makes it a promising field to study. If you want to learn more about NLG, Byte Academy offers a Natural Language course that covers Natural Language Processing and Natural Language Generation.
Interested in learning more about Data Science? Check out our immersive and intro courses.
References:
[i] WashPost PR Blog (August 5, 2016). The Washington Post Experiments With Automated Storytelling to Help Power 2016 Rio Olympics Coverage. WashPost PR Blog. Retrieved from https://www.washingtonpost.com/pr/wp/2016/08/05/the-washington-post-experiments-with-automated-storytelling-to-help-power-2016-rio-olympics-coverage/?utm_term=.bf63b03c4aeb
[ii] Dayan, Zohar (2015). Hearst, USA Today Sports, & Viralnova Partner With Wibbitz For Video Strategy. Wibbitz Blog. Retrieved from http://blog.wibbitz.com/wibbitz-partners-hearst-usa-today-sports-group-and-viralnova-to-expand-video-strategy
[iii] Keohane, Joe (2017). What News-Writing Bots Mean for the Future of Journalism. Wired. Retrieved from https://www.wired.com/2017/02/robots-wrote-this-story/
[iv] Narrative Science. Turn Your Data Into Better Decisions With Quill. Retrieved from