Natural language generation (NLG) is sometimes referred to as a subfield of artificial intelligence and computational linguistics that focuses on the production of understandable texts in English or other understandable languages. In some examples, an NLG system is configured to transform raw input data that is expressed in a non-linguistic format into a format that can be expressed linguistically, such as through the use of natural language (e.g., the conversion from data to text). In some cases, the data is high-frequency numerical data. For example, raw input data may take the form of a value of a stock market index over time and, as such, the raw input data may include data that is suggestive of a time, a duration, a value and/or the like. Other examples may include the generation of textual weather forecasts based on numerical weather prediction data. In the stock market example, an NLG system may be configured to receive the raw input data and output text that linguistically describes the value of the stock market index; for example, “securities markets rose steadily through most of the morning, before sliding downhill late in the day.” Importantly, for use in an NLG system, data must be analysed and interpreted in a way in which the analysis and interpretation can be linguistically communicated. For example, data that indicates a rising value of a stock market index may be represented linguistically as rising, spiking or the like. A human may then make decisions based on how that human interprets rising versus spiking.
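By way of illustration only, the choice between words such as rising and spiking may be sketched as a mapping from an interpreted rate of change to a verb. The following is a minimal sketch, not a description of any particular NLG system: the function names `choose_verb` and `describe_index`, the input layout (pairs of hours since market open and index value), and the threshold values are all assumptions introduced for this example.

```python
from typing import List, Tuple

# Each sample is (hours since market open, index value). Hypothetical layout.
Sample = Tuple[float, float]


def choose_verb(pct_change: float, duration_h: float,
                spike_rate: float = 1.0, flat_band: float = 0.1) -> str:
    """Map an interpreted change onto a verb.

    spike_rate (percent per hour) and flat_band (percent) are assumed
    thresholds chosen for illustration, not values from any real system.
    """
    if abs(pct_change) < flat_band:
        return "held steady"
    rate = pct_change / duration_h  # percent per hour
    if rate >= spike_rate:
        return "spiked"
    if rate > 0:
        return "rose"
    if rate <= -spike_rate:
        return "plunged"
    return "fell"


def describe_index(samples: List[Sample]) -> str:
    """Turn a time-ordered series of index values into one sentence."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    duration_h = t1 - t0
    if duration_h <= 0:
        raise ValueError("samples must be ordered by increasing time")
    pct_change = (v1 - v0) / v0 * 100.0
    verb = choose_verb(pct_change, duration_h)
    if verb == "held steady":
        return f"The index held steady over {duration_h:g} h."
    return f"The index {verb} {abs(pct_change):.1f}% over {duration_h:g} h."


if __name__ == "__main__":
    print(describe_index([(0.0, 1000.0), (3.0, 1006.0)]))  # slow gain
    print(describe_index([(0.0, 1000.0), (1.0, 1030.0)]))  # fast gain
```

In this sketch the same direction of movement yields a different word depending on how quickly the value changed, which is the kind of distinction a reader may act on.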
Data that is input into an NLG system may be provided in, for example, a recurrent formal structure. The recurrent formal structure may comprise a plurality of individual fields and defined relationships between the plurality of individual fields. For example, the input data may be contained in a spreadsheet or database, presented in a tabulated log message or other defined structure, encoded in a ‘knowledge representation’ such as the Resource Description Framework (RDF) triples that make up the Semantic Web and/or the like. In some examples, the data may include numerical content, symbolic content or the like. Symbolic content may include, but is not limited to, alphanumeric and other non-numeric character sequences in any character encoding, used to represent arbitrary elements of information. In some examples, the output of the NLG system is text in a natural language (e.g., English, Japanese or Swahili), but may also be in the form of synthesized speech.
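As a hedged illustration of such a recurrent formal structure, the sketch below models one input record as a fixed set of named fields and re-expresses it as subject-predicate-object triples in the spirit of RDF. The record type `Reading`, its field names, the `reading:` subject prefix, and the helper functions are hypothetical and chosen only for this example; plain tuples are used rather than any RDF library.

```python
from dataclasses import dataclass, fields
from typing import List, Tuple, Union

Value = Union[int, float, str]
Triple = Tuple[str, str, Value]  # (subject, predicate, object)


@dataclass
class Reading:
    """One row of a hypothetical tabulated weather log."""
    station_id: str       # symbolic content
    timestamp: str        # symbolic content (ISO 8601 string)
    temperature_c: float  # numerical content
    condition: str        # symbolic content


def to_triples(record: Reading) -> List[Triple]:
    """Express each field as a triple that shares one subject identifier."""
    subject = f"reading:{record.station_id}:{record.timestamp}"
    return [(subject, f.name, getattr(record, f.name)) for f in fields(record)]


def is_numeric(value: Value) -> bool:
    """True for numerical content; strings count as symbolic content."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


if __name__ == "__main__":
    row = Reading("ABZ", "2024-01-01T09:00", 4.5, "overcast")
    for triple in to_triples(row):
        kind = "numeric" if is_numeric(triple[2]) else "symbolic"
        print(kind, triple)
```

Because every record repeats the same fields, a downstream component may rely on the field names and on the shared subject to relate the values to one another.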