This disclosure relates to imaging devices and, in particular, to automatic language switching in imaging devices.
Some imaging devices may only support a single language format. For example, a printer may only support a raster data format that is specific to the printer's printing engine. In such a device, the imaging data is sent to the print data interpreter without prior checking. If the input is in the correct language format, the input is recognized and the job is output; otherwise, some or all of the input is not recognized and the job is either not output or is output incorrectly.
Other imaging devices may support multiple language formats. However, an explicit language switch that precedes the imaging data may be required. A command or other data, having a known or predetermined format, may be received by the imaging device to specify the language format of the subsequent imaging data. For example, many imaging devices support explicit language switching by supporting the Hewlett-Packard Printer Job Language (PJL) command @PJL ENTER LANGUAGE=<language>. If the specified language is supported by the device, the imaging data is then processed by the corresponding interpreter; otherwise, the input is rejected and the job is not output.
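The explicit switch described above can be illustrated with a minimal sketch. The function name explicit_language, the SUPPORTED_LANGUAGES table, and the tolerance for whitespace around the equals sign are assumptions made for illustration only; they are not taken from any particular device.

```python
import re

# Hypothetical set of interpreters available on the device (assumption).
SUPPORTED_LANGUAGES = {"PCL", "POSTSCRIPT"}

# Matches a command such as "@PJL ENTER LANGUAGE=PCL".
# Allowing whitespace around "=" is an illustrative choice.
_ENTER_LANGUAGE = re.compile(
    rb"@PJL\s+ENTER\s+LANGUAGE\s*=\s*(\w+)", re.IGNORECASE
)


def explicit_language(job: bytes):
    """Return the explicitly requested language, or None if the job is rejected."""
    match = _ENTER_LANGUAGE.search(job)
    if match is None:
        return None  # no explicit language switch precedes the data
    language = match.group(1).decode("ascii").upper()
    # Reject the job when the requested language has no interpreter.
    return language if language in SUPPORTED_LANGUAGES else None
```

In this sketch, a job that lacks the command is rejected in the same way as a job that names an unsupported language, which mirrors the incompatibility with generators that do not emit an explicit switch.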
Unfortunately, generators of imaging data that do not use an explicit language switch may not be compatible with imaging devices that require explicit language switching.
Alternatively, when an imaging device receives the imaging data without an explicit indication of the language format, the imaging device must sample the imaging data to determine the language format. This process is commonly referred to as automatic language switching.
In one method of automatic language switching, an initial byte sample of the input stream of the imaging data is pre-read and passed to a language recognition process. For each supported language, there is a language-specific recognition test, such as looking for a special signature in the byte sample. The method applies each language-specific recognition test in a predetermined sequential order until one test acknowledges recognition of the language format. If no test does, an error is generated. When the language is recognized, the input stream is then processed by the corresponding language interpreter.
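The fixed-order method can be sketched as follows. The three signature tests, the RECOGNIZERS list, and the returned test count are illustrative assumptions; an actual device may use different languages and more elaborate tests.

```python
# Illustrative signature tests in a fixed, predetermined order (assumptions).
RECOGNIZERS = [
    ("POSTSCRIPT", lambda sample: sample.startswith(b"%!")),
    ("PDF", lambda sample: sample.startswith(b"%PDF-")),
    ("JPEG", lambda sample: sample.startswith(b"\xff\xd8\xff")),
]


def recognize(sample: bytes):
    """Apply each test in order; return (language, number of tests run)."""
    tests_run = 0
    for language, test in RECOGNIZERS:
        tests_run += 1
        if test(sample):
            return language, tests_run
    # No test acknowledged the sample, so an error is generated.
    raise ValueError("unrecognized language format")
```

With this ordering, a sample in the last listed format always costs three tests, regardless of how often that format is received.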
Unfortunately, with each set of imaging data, the recognition tests are performed in the same order. Thus, for a given language format with a recognition test at the end of the list, every preceding recognition test must be performed each time imaging data in that language format is received, wasting time and resources.
In another method, an initial byte sample of the imaging data is pre-read and compared against a group of language recognizers with one language recognizer per supported language. Each language recognizer generates a probability that the language of the imaging data is the associated language format. The language format with the highest probability is used.
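A minimal sketch of the probability-based method follows. The scoring heuristics, their numeric values, and the names SCORERS and most_likely_language are invented for illustration and do not represent any actual recognizer.

```python
def score_postscript(sample: bytes) -> float:
    # Strong evidence: the sample starts with "%!"; weak evidence: an operator name.
    if sample.startswith(b"%!"):
        return 0.95
    return 0.4 if b"showpage" in sample else 0.0


def score_pcl(sample: bytes) -> float:
    # Illustrative heuristic: density of ESC bytes, capped at 1.0.
    escapes = sample.count(b"\x1b")
    return min(1.0, 10 * escapes / max(len(sample), 1))


def score_text(sample: bytes) -> float:
    # Illustrative heuristic: half the fraction of printable ASCII bytes.
    printable = sum(1 for b in sample if 32 <= b < 127 or b in (9, 10, 13))
    return 0.5 * printable / max(len(sample), 1)


# One hypothetical recognizer per supported language.
SCORERS = {"POSTSCRIPT": score_postscript, "PCL": score_pcl, "TEXT": score_text}


def most_likely_language(sample: bytes):
    """Run every recognizer and return (best language, all scores)."""
    scores = {language: scorer(sample) for language, scorer in SCORERS.items()}
    return max(scores, key=scores.get), scores
```

Every scorer runs on every sample, and the selected language is only the highest-scoring candidate rather than a definite determination.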
Unfortunately, when probabilities are used to select a language, the result is only a guess at the correct language, not a specific determination, and may be an incorrect selection. Furthermore, the test for each language must be performed to determine its associated probability.
In another method, an initial byte sample of the input stream is pre-read and passed to a language recognition process. The language recognition process uses a dynamically determined sequential order of language recognizers. The order of language recognizers is updated in response to how often the associated language format is found.
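The dynamically ordered method can be sketched as below. The class name AdaptiveRecognizer, the hit-count reordering policy, and the sample signature tests are assumptions for illustration; the order could equally be updated by some other frequency-based rule.

```python
class AdaptiveRecognizer:
    """Sequential recognizer whose test order follows how often each language is found."""

    def __init__(self, recognizers):
        self._order = list(recognizers)  # list of (language, test) pairs
        self._hits = {language: 0 for language, _ in self._order}

    def recognize(self, sample: bytes) -> str:
        for language, test in self._order:
            if test(sample):
                self._hits[language] += 1
                # Most frequently found languages move to the front.
                # The sort is stable, so ties keep their previous order.
                self._order.sort(key=lambda item: -self._hits[item[0]])
                return language
        raise ValueError("unrecognized language format")

    def order(self):
        return [language for language, _ in self._order]


# Illustrative signature tests (assumptions).
recognizer = AdaptiveRecognizer([
    ("POSTSCRIPT", lambda sample: sample.startswith(b"%!")),
    ("PDF", lambda sample: sample.startswith(b"%PDF-")),
    ("JPEG", lambda sample: sample.startswith(b"\xff\xd8\xff")),
])
```

Each test in this sketch still re-examines the sample from its first byte, so any signature prefix shared between languages is compared again by every recognizer that is tried.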
Unfortunately, each language recognizer reapplies its entire signature test. Thus, even if two language formats have identical initial portions of their respective signatures, a test for the identical portion may be performed multiple times.
Accordingly, there remains a need for an improved method and apparatus for determining a language format of data.