1. Field of the Invention
The present invention relates to a method for transmitting arbitrarily large amounts of data over the Internet from a client machine to a server using only the HTTP GET method, such that data in excess of traditional URL length maximums enforced by web browsers and servers can be transmitted without using the HTTP POST method.
2. Description of Related Art
Two of the primary methods by which a client communicates with a server across the Internet using the Hypertext Transfer Protocol (HTTP) are the GET and POST methods. Conventionally, the GET method was intended to act as a simple request made by a client, such as a web browser, to a server for a resource, such as an HTML document. A request made by the GET method is conventionally idempotent and simply requests a resource from a server rather than submitting data or altering any data on the server. Methods have the property of “idempotence” if the result of multiple, identical requests is the same as for a single request. The POST method was intended to provide a way for a client, such as a web browser, to submit information to a web server across a network, such as the Internet. Conventionally, the POST method is used for requests that are not idempotent and alter data on the server in some way. Although the GET method is traditionally used for idempotent requests, it may also be used for requests that store or modify data on the server. Using the GET method only for idempotent requests is strictly a convention and not a technical requirement. Note that these methods are also used by Hypertext Transfer Protocol Secure (HTTPS). Although the http: and https: schemes use different underlying connections, they both use the HTTP protocol.
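The notion of idempotence described above can be illustrated with a minimal sketch. The class TinyServer and its methods are hypothetical stand-ins for a web server, not part of any real library; the example only shows that repeating a read gives the same result, while repeating a write changes the stored data each time.

```python
class TinyServer:
    """Hypothetical in-memory stand-in for a web server (illustration only)."""

    def __init__(self):
        self.documents = {"/index.html": "<html>hello</html>"}
        self.orders = []

    def get(self, path):
        # Conventional GET: returns a resource and leaves server data unchanged.
        return self.documents.get(path)

    def post_order(self, item):
        # Conventional POST: alters server data, so repeating it is not idempotent.
        self.orders.append(item)
        return len(self.orders)


server = TinyServer()
first_read = server.get("/index.html")
second_read = server.get("/index.html")   # same result as the first read
first_order = server.post_order("book")   # one order stored
second_order = server.post_order("book")  # a second, duplicate order stored
```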
In HTML, GET and POST are the two possible options for how the contents of a form submission are sent to a server. The chosen method is designated in the method attribute of the <form> tag. Whichever method is designated within the HTML <form> tag determines whether the client's web browser will send the data to the server using an HTTP GET or POST request. There are important technical differences between these two methods. When using the GET method, all of the form submission data is combined with the URL of the desired page to create a URL that contains both the path of the desired page and the name-value associations of the form contents. When using the POST method, all of the form submission data is included as the body of the request. Various reasons and conventions exist for choosing one method over the other. A simple request that does not alter the data on the server would conventionally use the GET method. A common example of this is using a search engine. Typically, the search value that the user enters is sent to the server using the GET method and is present in the URL of the resulting page. This is because the request does not alter any of the data on the server; it simply determines what portion of the data contained on the server will be returned to the client in order to be displayed. The POST method is instead used when the request may affect the data on the server rather than simply retrieving it. Common examples of this include sending an email, logging in to a system, and placing an order. Although the GET method is traditionally used by HTML forms for idempotent requests, it may also be used for requests that store or modify data on the server.
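The technical difference between the two form methods can be sketched as follows. This is an illustrative approximation of what a browser does, using Python's standard urllib.parse.urlencode; the endpoint https://example.com/search, the field names, and the helper functions are assumptions made for the example.

```python
from urllib.parse import urlencode

# Hypothetical form contents and endpoint, for illustration only.
FORM_FIELDS = {"q": "http get limits", "lang": "en"}
ACTION_URL = "https://example.com/search"


def build_get_request(action_url, fields):
    """Form with method="get": name-value pairs are appended to the URL."""
    return action_url + "?" + urlencode(fields), b""


def build_post_request(action_url, fields):
    """Form with method="post": name-value pairs travel in the request body."""
    return action_url, urlencode(fields).encode("ascii")


get_url, get_body = build_get_request(ACTION_URL, FORM_FIELDS)
post_url, post_body = build_post_request(ACTION_URL, FORM_FIELDS)
```

In the GET case the URL grows with the form data and the body is empty; in the POST case the URL stays fixed and the same encoded data is carried in the body.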
In addition to the conventional differences and the technical differences between the GET and POST methods, there is one extremely important difference in practice. Although the HTTP specifications do not place any restrictions on the amount of data transmitted by either GET or POST, some web servers and web browsers enforce a maximum URL length. Even though this limitation is not part of the specification for HTTP, it has greatly impacted the Internet community because the most widely used and prominent web browsers enforce such a limit. For example, Microsoft's Internet Explorer, which has been the predominant web browser, enforces a maximum URL length of 2,083 characters. The implication of this limit is that, because the GET method transmits data to the server by encoding it in the URL, fewer than 2,083 bytes of data can be transmitted in a single GET request. Because the POST method transmits data as the body of the request, it does not suffer from any such limitation, artificial or otherwise. As a matter of practice, then, the POST method is the only option available in conventional systems whenever a client needs to send an arbitrarily large amount of data to a server.
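The practical effect of the URL length limit can be shown with a short sketch. The 2,083-character figure is the Internet Explorer limit cited above; the endpoint, the field name payload, and the helper get_url_fits are hypothetical.

```python
from urllib.parse import urlencode

MAX_URL_LENGTH = 2083  # Internet Explorer limit cited in the text


def get_url_fits(action_url, fields, limit=MAX_URL_LENGTH):
    """Return True if the GET URL built from the form fields is within the limit."""
    url = action_url + "?" + urlencode(fields)
    return len(url) <= limit


# Hypothetical endpoint and payload sizes, for illustration only.
ENDPOINT = "https://example.com/decrypt"
small_ok = get_url_fits(ENDPOINT, {"payload": "A" * 100})
large_ok = get_url_fits(ENDPOINT, {"payload": "A" * 5000})
```

Note that the limit applies to the whole URL, so the path and the field names also consume part of the budget available for data.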
One example of an application that makes use of HTML forms is an application designed to send encrypted emails. A mechanism for doing this is to send an HTML email that is a form asking for the password to allow the message to be decrypted, and to embed in this email the payload that is the encrypted message as a form element (typically hidden). Then, when the user presses the Submit button after entering the password, a JavaScript routine, also embedded in the message, intercepts the form submission, decrypts the hidden payload, and displays the decrypted message.
This application has the limitation that it relies on JavaScript in order to decrypt and display the message. It is possible for a particular user to have disabled JavaScript or to be using a web browser or other HTML viewer that does not support JavaScript. If this is the case, then the application as described will not work. Moreover, some mail clients, notably Yahoo! Mail, either strip JavaScript from an email or modify the script so that it is no longer functional. In these cases, a solution that does not depend on JavaScript is required.
In order to work around these issues, the described application can implement its HTML form such that if JavaScript does not work, then it will simply function like a standard HTML form and send the form data to a designated URL, in this case a web server. The encrypted message that is the hidden payload may be arbitrarily large, so the POST method must be used. Thus, the encrypted message in the form of a hidden input field, the user's password or other form of credentials, and perhaps some other form data are sent to the server by the POST method. The server can then check the credentials, decrypt the message, and return the correctly decrypted message across a secure Internet connection to the user. This method works and securely decrypts the message in instances where JavaScript is not available.
The above-described application is effective in most situations, but some mail clients, notably Microsoft's Outlook Web Access, remove all large form fields from an email. Because the encrypted payload is encoded as a form field and is typically large, the encrypted content is removed by such mail clients. Once the encrypted content has been removed from the message, there is no way to retrieve it. The described application thus does not work with such mail clients.
In order to work with mail clients that remove all large form fields, the described application must encode the encrypted payload in another manner. One way to do this is to use many small fields instead of one large field. The server can then aggregate all of the individual fields to recreate the payload, and then decrypt and return the message. As long as all of the fields are small enough not to be removed by the mail client, this method would be adequate. However, this solution is made ineffective by some mail clients, notably Microsoft's Outlook Web Access, that change all forms inside an email message to use the GET method instead of POST. As discussed above, this effectively limits the amount of data that can be transmitted from the client's computer to the server through the form contained in the email. Even if small fields are used, all of the data is encoded into a single URL that would in many cases surpass the size limitations enforced by some servers and web browsers. Regardless of how the form is set up, there is currently no way to guarantee an ability to send an arbitrary amount of data to a server using the GET method. Moreover, there is currently no way for a single form to send several GET requests to the server, and so the data cannot be segmented in that manner.
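The many-small-fields workaround, and the reason it fails once a mail client rewrites the form to use GET, can be sketched as follows. The field naming scheme (part0, part1, ...), the 256-character chunk size, the endpoint, and both helper functions are assumptions made for illustration and are not taken from any described system.

```python
from urllib.parse import urlencode

MAX_URL_LENGTH = 2083  # browser limit cited earlier in this section


def split_into_fields(payload, chunk_size, prefix="part"):
    """Split the payload into numbered small fields: part0, part1, ..."""
    return {
        f"{prefix}{index}": payload[start:start + chunk_size]
        for index, start in enumerate(range(0, len(payload), chunk_size))
    }


def reassemble(fields, prefix="part"):
    """Server side: join the numbered fields back together in index order."""
    numbered = [
        (int(name[len(prefix):]), value)
        for name, value in fields.items()
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    ]
    return "".join(value for _, value in sorted(numbered))


# Hypothetical 6,000-character encrypted payload, for illustration only.
payload = "abc" * 2000
fields = split_into_fields(payload, chunk_size=256)
restored = reassemble(fields)

# If the form is forced to GET, every small field still lands in one URL.
get_url = "https://example.com/decrypt?" + urlencode(fields)
exceeds_limit = len(get_url) > MAX_URL_LENGTH
```

Splitting keeps each field small enough to survive field-size filtering, and the server can restore the payload exactly, but the single GET URL still carries the entire payload and therefore exceeds the length limit.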