This invention relates to the field of network file transfer. In particular, the invention relates to automatically determining the file transfer mode.
File Transfer Protocol (FTP) is a standard network protocol used to transfer files from one host to another host over a Transmission Control Protocol (TCP) based network, such as the Internet. FTP is built on a client-server architecture and utilizes separate control and data connections between the client and server.
FTP clients allow data to be transferred in two modes: binary and text. The binary mode transfers bytes in their raw form. The text mode assumes that the data contains text characters and performs any required character conversion during puts and gets, as specified in the configuration of the FTP server.
One example is the z/OS operating system (z/OS is a trademark of International Business Machines Corporation), where text data is typically stored on disk using the Extended Binary Coded Decimal Interchange Code (EBCDIC) character set. Clients, such as personal computers, work with code pages based on the user's locale, such as American Standard Code for Information Interchange (ASCII). When an FTP put is performed in text mode, the FTP server converts the data from the client code page, such as ASCII, to the server's EBCDIC code page, and performs the reverse conversion on an FTP get. Files such as source code are typically handled as text: clients author and store the files in ASCII, z/OS stores them in EBCDIC, and text conversion takes place as the files are transferred across the wire. The default code page used for conversion is typically a configuration setting on the server, and each client can override it before a text transfer takes place.
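The text-mode conversion described above can be illustrated with a minimal sketch. It is not taken from any FTP server implementation. It assumes, for illustration only, that the client code page is ASCII and that the server's EBCDIC code page is cp037; the function names text_put and text_get are hypothetical.

```python
# Sketch of the character conversion a server might apply in text mode.
# Assumptions (not from the source): "ascii" stands in for the client code
# page and "cp037" for the server's EBCDIC code page. Real servers make
# both configurable.

CLIENT_CODEPAGE = "ascii"   # hypothetical client code page
SERVER_CODEPAGE = "cp037"   # hypothetical server EBCDIC code page


def text_put(client_bytes: bytes) -> bytes:
    """Convert client bytes to the server code page, as on a text-mode put."""
    return client_bytes.decode(CLIENT_CODEPAGE).encode(SERVER_CODEPAGE)


def text_get(server_bytes: bytes) -> bytes:
    """Convert server bytes back to the client code page, as on a text-mode get."""
    return server_bytes.decode(SERVER_CODEPAGE).encode(CLIENT_CODEPAGE)


source = b"int main(void) { return 0; }"   # source code authored in ASCII
stored = text_put(source)                  # bytes as held on the server
restored = text_get(stored)                # bytes as seen by the client again
```

In this sketch the bytes stored on the server differ from the bytes the client sent, yet a put followed by a get returns the original file, which is the behaviour wanted for files such as source code.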
Other files contain text but need to be stored in binary on the server. An example is Extensible Markup Language (XML) files that have a UTF-8 encoding and need to be processed by server-side Java (Java is a trademark of Sun Microsystems, Inc.) programs that have been written to read UTF-8 text. An example used by the CICS transaction server (Customer Information Control System; CICS is a trademark of International Business Machines Corporation) is the cics.xml file that is part of a CICS bundle. These kinds of files need to be transferred between client and server in binary mode, so that no character conversion takes place and all characters remain in their raw bytes.
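The following sketch suggests why such a file must travel in binary mode. It assumes, purely for illustration, a text-mode conversion from a latin-1 client code page to a cp037 EBCDIC server code page; the helper names and the sample document are hypothetical.

```python
# Sketch: a UTF-8 XML file sent in binary mode versus text mode.
# Assumptions (not from the source): the text-mode conversion treats the
# client bytes as latin-1 and re-encodes them as cp037 (EBCDIC).

document = '<?xml version="1.0" encoding="UTF-8"?><name>caf\u00e9</name>'
utf8_bytes = document.encode("utf-8")


def binary_put(data: bytes) -> bytes:
    """Binary mode: the bytes reach the server unchanged."""
    return data


def text_put(data: bytes) -> bytes:
    """Text mode: the bytes are converted to the server code page."""
    return data.decode("latin-1").encode("cp037")


def readable_as_utf8(data: bytes) -> bool:
    """Return True if a UTF-8 reader on the server could decode the bytes."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


stored_binary = binary_put(utf8_bytes)
stored_text = text_put(utf8_bytes)
```

Under these assumptions, the binary copy still decodes to the original document, while the text-mode copy is no longer valid UTF-8 and a server-side program written to read UTF-8 would fail on it.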
Known solutions to transfer files back and forth between client and server using the correct format may involve one or more of the following:
Having advance knowledge of the file type to be used based on the scenario in which it is being used;
Associating a file extension with a particular type (e.g. FileZilla, and most common FTP clients);
Letting the user specify which transfer type to use (e.g. Rational Developer for System z).
The disadvantages of these solutions are:
Advance knowledge of the file type based on the scenario does not work for a generic solution in which the user wants to browse server files and select one to edit based on its path, rather than its usage.
Associating a file extension with a type is not always sufficient, because the same file extension can denote text in some cases and binary in others. For example, cics.xml in a CICS bundle is binary, whereas atomservicedefinition.xml is text.
Letting the user specify which transfer type to use leaves more room for error, since the user can make a mistake and corrupt the data. The user must also know the file types and their usage, which raises the skill level required to use the client software.
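The limitation of extension-based association can be sketched as follows. The extension table and the function mode_by_extension are hypothetical and do not reproduce the behaviour of any particular FTP client; only the two file names and their required modes come from the example above.

```python
import os

# Hypothetical extension table of the kind an FTP client might keep.
EXTENSION_MODES = {".xml": "text", ".java": "text", ".jar": "binary"}


def mode_by_extension(path: str) -> str:
    """Pick a transfer mode from the file extension alone (binary if unknown)."""
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_MODES.get(extension, "binary")


# Modes the two example files actually require, per the text above.
REQUIRED_MODES = {
    "cics.xml": "binary",
    "atomservicedefinition.xml": "text",
}

# Files for which the extension-based choice disagrees with the required mode.
mismatches = [
    name
    for name, required in REQUIRED_MODES.items()
    if mode_by_extension(name) != required
]
```

Whichever mode the table assigns to .xml, one of the two files is transferred incorrectly; with the table above, the mismatch is cics.xml.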
There may also arise situations in which the usage of servers has grown from one locale to multiple locales, such as when a business that was US English only has expanded to include customers with other languages and therefore non-ASCII codepage documents. A hybrid mix of documents may be found on a file system where older ones, or ones used by older systems, require storage in the original codepage, in this example ASCII, whereas documents which are stored by users in other locales are held in their respective code pages, and documents which must be accessed by users from more than one locale are stored in a neutral format such as UTF-8.
Therefore, there is a need in the art to address the aforementioned problems.