When utilizing translation applications, a desirable goal is to achieve translation in real time. The prevalence of Internet of Things (IoT) devices, many with no traditional input/output hardware, has popularized applications that utilize audio as an input. Applications that include speech recognition and can accept voice input reduce the effort users spend on data entry and provide accessibility to users for whom manual entry is challenging, as well as to users in environments where it is. For audio entry to be efficient and useful, the workflows must be intuitive and user-friendly.

In the case of a translation application utilizing audio as input, as a speaker provides content, one or more programs executing on a processor would provide the translation in real time. In order to provide this functionality, the one or more programs would recognize the language of the content to be translated. Existing translation systems are not suited to this workflow: in order to translate content into a second language, they require pre-configurations that supply the language of the content. The pre-configurations are required because of the inherent difficulty of recognizing the language of content in real time (e.g., there are six thousand nine hundred and nine (6,909) distinct languages in the world, and the content provided may not be limited to a single language).

Even when the language of the content can be recognized, another challenge is determining, on the fly, into what language the content should be translated. For example, an intended audience for the translation may include individuals (users, applications, etc.) with different priorities regarding the language of the translated content. Thus, the desired translation may not be apparent if not known (or configured) in advance.
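To make the pre-configuration problem concrete, the following is a minimal sketch of source-language identification, the step that existing systems instead receive as a manual setting. All names and the tiny stopword lists are illustrative assumptions, not taken from the text above; a real system would have to discriminate among thousands of candidate languages rather than three.

```python
# Hypothetical sketch: a naive source-language detector based on stopword
# overlap. This stands in for the "pre-configured source language" that
# existing translation systems require; real-time systems must derive it
# from the content itself, across potentially 6,909 candidate languages.

STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in"},
    "es": {"el", "la", "y", "de", "que", "en"},
    "fr": {"le", "la", "et", "de", "que", "est"},
}

def detect_language(text: str) -> str:
    """Guess the language by counting stopword hits per candidate language."""
    tokens = set(text.lower().split())
    scores = {lang: len(tokens & words) for lang, words in STOPWORDS.items()}
    # Return the candidate language with the most stopword matches.
    return max(scores, key=scores.get)
```

For example, `detect_language("el perro y la casa")` scores Spanish highest. The sketch also hints at why real-time recognition is hard: short utterances yield few tokens, mixed-language content splits the scores, and the candidate set in practice is orders of magnitude larger than shown here.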