Text data can be stored in many different encodings. The encoding may differ because of locale (for example, English versus Turkish), because of the default settings of a particular architecture (for example, ASCII on a Windows system versus an EBCDIC (Extended Binary Coded Decimal Interchange Code) variant on a mainframe), or because of particular decisions about how to store the data. When data is transferred from a host system to a client system by a method such as FTP, it can be transferred either in binary form or in text form. In binary form, the data is left unmodified during the transfer. In text form, the host makes assumptions about the initial encoding of the data (based on file metadata or content, the host's default settings, or some other information) and about the encoding of the client, and the host converts the data from the assumed initial encoding to the client encoding. However, these assumptions can be wrong; for example, a mainframe may assume that a file is encoded as IBM-037 (an EBCDIC code page), even though it is actually ASCII. With a wrong assumption, the encoding conversion produces garbage, and the client is unable to read the transferred data. The choice between binary and text transfer is therefore an important one, and it can require manual trial and error to get a correct transfer.
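The effect of a wrong encoding assumption can be illustrated with a minimal Python sketch. It does not use FTP; the helper convert_as_text is a hypothetical stand-in for the conversion a host might perform in text mode, and Python's cp037 codec is used here as the IBM-037 code page.

```python
def convert_as_text(data: bytes, assumed_source: str, client_encoding: str) -> bytes:
    """Hypothetical stand-in for a text-mode transfer.

    The host decodes the stored bytes using the encoding it assumes,
    then re-encodes the text in the client's encoding.
    """
    text = data.decode(assumed_source)
    # Characters the client encoding cannot represent become "?".
    return text.encode(client_encoding, errors="replace")


# Case 1: the file is really ASCII, but the host assumes IBM-037.
original = "Hello, world".encode("ascii")
binary_result = original  # binary mode leaves the bytes untouched
wrong_text_result = convert_as_text(original, "cp037", "latin-1")

# Case 2: the file really is IBM-037, so the same assumption is correct.
ebcdic_file = "Hello, world".encode("cp037")
right_text_result = convert_as_text(ebcdic_file, "cp037", "ascii")

print(binary_result)
print(wrong_text_result)   # no longer the original text
print(right_text_result)
```

In the first case the text-mode result differs from the original bytes, so only the binary transfer is usable; in the second case the text-mode conversion is the one that yields readable ASCII.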
One way to reduce the manual effort is as follows. Step 1: transfer the data as text, and have the client attempt to decode the transferred data with a decoder. If the decoding succeeds, use that data and stop. However, the decoding can often produce errors, with the data failing to map to recognized characters; this indicates that the text transfer was incorrect. Step 2: if the text transfer is incorrect, fall back to transferring as binary and use the binary data. However, this method can fail with some decoders. For example, the decoders of some operating systems, and the decoders for some encodings, validate their input less strictly than others, so that decoding at step 1 sometimes succeeds even though the data has not been transferred in the correct mode.
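The two steps, and the way a weakly validating decoder defeats them, can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: fetch_text and fetch_binary are hypothetical callables standing in for the two transfer modes, and Python's strict decoding stands in for the client's decoder.

```python
from typing import Callable, Tuple


def transfer_with_fallback(
    fetch_text: Callable[[], bytes],
    fetch_binary: Callable[[], bytes],
    client_encoding: str,
) -> Tuple[str, bytes]:
    """Try a text-mode transfer first; fall back to binary if decoding fails.

    fetch_text and fetch_binary are hypothetical callables that return
    the bytes received in each transfer mode.
    """
    # Step 1: transfer as text and try to decode with the client encoding.
    candidate = fetch_text()
    try:
        candidate.decode(client_encoding, errors="strict")
    except UnicodeDecodeError:
        # Step 2: decoding failed, so use the binary transfer instead.
        return "binary", fetch_binary()
    return "text", candidate


# Sample data: the text-mode transfer produced garbage.
garbled = b"\xc3\x28\xff"   # not valid UTF-8
intact = b"Hello, world"    # what a binary transfer would deliver

strict_result = transfer_with_fallback(lambda: garbled, lambda: intact, "utf-8")
weak_result = transfer_with_fallback(lambda: garbled, lambda: intact, "latin-1")

print(strict_result)
print(weak_result)
```

With a UTF-8 client encoding the garbled bytes fail to decode and the sketch falls back to binary. With Latin-1, every byte value maps to some character, so step 1 "succeeds" and the garbled text is accepted, which is the failure described above.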