The techniques described herein generally relate to web services and more specifically to determining vulnerabilities and compliance of web services.
Today, millions of computers are connected via a heterogeneous network that is often referred to as the World Wide Web (WWW) or the Internet. The well-known Hypertext Markup Language (HTML) standard is often used by computers on the Internet to exchange as well as display information. HTML contains provisions for specifying what information is displayed as well as how the information is presented. However, as more and more users have adopted the Internet to conduct day-to-day operations and use HTML to exchange and display information associated with these operations, the limitations of HTML have become apparent.
While HTML is very good at displaying information associated with consumer-oriented applications, such as shopping carts, it lacks the necessary richness to dynamically describe information in detail and in various formats for application-to-application and machine-to-machine communication. The well-known Extensible Markup Language (XML) provides the capability to richly describe information in a flexible way where content, structure and data format are represented independently for efficient machine readability and exchange. XML is designed to give meaningful structure to data and the ability to dynamically add rules on how the data is to be interpreted by another party.
An XML document is said to be “well-formed” when it abides by constraints defined by properties carried by the document. These constraints may include, for example, (1) having exactly one root element, (2) requiring that every start tag contained in the document has a matching end tag and (3) requiring that no tag within the document overlaps with another tag. XML documents typically start with a declaration of the XML version and the type of encoding being used to encode the document. For example, a document may begin with the declaration “<?xml version='1.0' encoding='utf-8'?>”, which indicates that XML version 1.0 is being used with 8-bit Unicode Transformation Format (utf-8) type encoding.
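The three constraints listed above can be exercised with a short sketch. It assumes Python and its standard `xml.etree.ElementTree` parser, neither of which is named in the text; the helper name `is_well_formed` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: bytes) -> bool:
    """Return True if the parser accepts the document as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        # Raised for any well-formedness violation.
        return False


# Illustrative documents only; none of them come from a real service.
good = b"<?xml version='1.0' encoding='utf-8'?><Patent> XML </Patent>"
two_roots = b"<Patent/><Patent/>"                    # violates constraint (1)
missing_end_tag = b"<Patent> XML"                    # violates constraint (2)
overlapping = b"<Patent><Type></Patent></Type>"      # violates constraint (3)

results = [is_well_formed(d) for d in (good, two_roots, missing_end_tag, overlapping)]
```

A check of this kind only establishes well-formedness; it says nothing about whether the content is what a receiver expects.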
An XML document may contain one or more elements. In accordance with XML, an element comprises a start tag, an end tag and data contained between the start tag and end tag. For example, an element named “Patent” may be coded in XML as:
                <Patent> XML </Patent>
Here, “Patent” is a tag (name) associated with the element and the content of the element is the text “XML”. “<Patent>” is the start tag of the element and “</Patent>” is the end tag of the element.
An attribute in XML comprises a simple name-value pair where the value may be in single or double quotes. For example, consider the following exemplary XML code:
                <Patent Type="Network Security"> XML </Patent>
Here, the element “Patent” has a name-value attribute whose name is “Type” and value is “Network Security”.
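As a rough illustration of how a consuming program sees the element and attribute just described, the following sketch parses the same markup. It assumes Python's standard `xml.etree.ElementTree` module; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The exemplary element from the text, written with straight quotes.
element = ET.fromstring('<Patent Type="Network Security"> XML </Patent>')

tag = element.tag                  # name carried by the start and end tags
content = element.text.strip()     # data between the start tag and end tag
patent_type = element.get("Type")  # value of the attribute named "Type"
```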
When XML documents travel from a sender computer to a receiver computer it is important that both computers have the same expectations about the content. That is, both computers should have the same expectations about the content in terms of the way the sender describes the content so it will be understood by the receiver. Schemas are often used to set out and define these expectations. One commonly used schema for the XML language is the XML Schema.
XML Schema is a well-known World Wide Web Consortium (W3C) recommendation that defines a set of rules to which an XML document must conform in order to be considered “valid”. A receiver expecting XML documents to be sent to it that conform to the XML Schema may use the schema to validate the content of the documents. Even if a document is well-formed it may still contain errors and those errors may cause problems for the receiver. Since XML Schema describes the structure of an XML document it provides an additional check for both sender and receiver to validate the document. XML Schema is also commonly referred to as the XML Schema Definition (XSD).
One of the greatest strengths of XSD is its support for data types. This support enables the validation of data as to its correctness, describes permissible data and defines restrictions on the values that an element can take through the use of facets. Strong data-typing and restrictions are essential for building secure applications that are not vulnerable to buffer-overflow and Denial-of-Service (DoS) type attacks.
An application that accepts an XML document typically validates the XML document against an XSD either during or before the document is consumed by the application. XML document validation serves as a filter to ensure that the XML document adheres to a structural and data-type format expected from the consuming application.
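The filtering role of validation can be sketched as follows. This is not an XSD validator: Python's standard library does not include one, so the sketch hand-codes two restrictions loosely modelled on the XSD maxLength and pattern facets. The element names, limits and the helper `find_violations` are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical restrictions standing in for XSD facets (assumptions, not a schema).
RULES = {
    "Title": {"max_length": 40, "pattern": re.compile(r"[A-Za-z0-9 ]+")},
    "Year": {"max_length": 4, "pattern": re.compile(r"[0-9]{4}")},
}


def find_violations(document: bytes) -> list:
    """Return a list of problems; an empty list means the document passed."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError as exc:
        return [f"not well-formed: {exc}"]
    problems = []
    for child in root:
        rule = RULES.get(child.tag)
        if rule is None:
            problems.append(f"unexpected element: {child.tag}")
            continue
        value = (child.text or "").strip()
        if len(value) > rule["max_length"]:
            # An oversized value is rejected before the application consumes it.
            problems.append(f"{child.tag}: longer than {rule['max_length']} characters")
        elif not rule["pattern"].fullmatch(value):
            problems.append(f"{child.tag}: value does not match the allowed pattern")
    return problems


accepted = b"<Patent><Title>Network Security</Title><Year>2006</Year></Patent>"
rejected = b"<Patent><Title>" + b"A" * 5000 + b"</Title><Year>20x6</Year></Patent>"
```

Here `find_violations(accepted)` returns an empty list, while `rejected` is flagged for both the oversized title and the malformed year. In practice an XSD-aware validator would take the place of the hand-written rules.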
The Simple Object Access Protocol (SOAP) is a protocol that may be used to exchange XML-based messages using, e.g., the Hypertext Transfer Protocol (HTTP). A SOAP message comprises an envelope that contains an optional header and a required body. The purpose of the header is to provide information to an application as to how to process the message. Information such as routing, user name tokens and signatures can reside in the header. The body contains the actual message to be delivered to the target application for processing. The body may consist of anything that can be expressed in XML. Additionally, XML and non-XML attachments (e.g., images, Portable Document Format (PDF) files, word processing documents) may be attached to the SOAP message.
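The envelope layout described above, with an optional header and a required body, might be assembled as in the following sketch. It assumes Python's standard `xml.etree.ElementTree` module (Python 3.8 or later) and the SOAP 1.1 envelope namespace; the function `build_envelope` and the payload and header elements are hypothetical.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace (the SOAP version is an assumption of this sketch).
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP_NS)


def build_envelope(payload: ET.Element, header_blocks=()) -> bytes:
    """Wrap a payload in an envelope; the header is emitted only when needed."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    if header_blocks:
        header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
        header.extend(header_blocks)
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    body.append(payload)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# Illustrative payload and header block; neither is defined by SOAP itself.
payload = ET.Element("Patent")
payload.text = "XML"
token = ET.Element("UserNameToken")

body_only = build_envelope(payload)
with_header = build_envelope(payload, [token])
```

The resulting bytes are independent of the transport, so the same message could be handed to whichever protocol carries it.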
SOAP and XML messages are flexible in that they are protocol, hardware, and operating system independent. This flexibility enables applications to readily communicate with one another. Protocol independence here means that SOAP and XML messages may be exchanged between two entities using various protocols. For example, SOAP and XML messages may be exchanged using HTTP, secured HTTP (HTTPS), Java Message Service (JMS), File Transfer Protocol (FTP) and the Simple Mail Transfer Protocol (SMTP). Scanning for web services vulnerabilities, compliance exceptions and exploits typically requires generating SOAP-based and XML-based tests that are focused on the message, regardless of what protocol is being used to communicate with the application.
SOAP messages, their content, the application end-point where they are received and the structure of the response are not specified by the SOAP specification. Moreover, the SOAP specification does not provide a description of the SOAP message exchange with an application.
The Web Services Description Language (WSDL) is a language standard that is typically used to describe web services for an application. WSDL is an XML format for describing network services as end-points that operate on, e.g., SOAP or XML messages. The operations and messages associated with the web services are described along with data types of the SOAP messages through an XSD included in the WSDL file. The messages and operations may then be bound to various concrete protocols, such as Hypertext Transfer Protocol (HTTP), Secure HTTP (HTTPS), Simple Mail Transfer Protocol (SMTP), File Transfer Protocol (FTP) and Java Message Service (JMS).
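Because a WSDL file is itself XML, the operations it describes can be read with an ordinary XML parser. The sketch below assumes Python's standard `xml.etree.ElementTree` module and the WSDL 1.1 namespace; the sample service, its names, the `example.com` target namespace and the helper `list_operations` are hypothetical.

```python
import xml.etree.ElementTree as ET

# WSDL 1.1 namespace (the WSDL version is an assumption of this sketch).
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# A deliberately minimal, made-up description with no types or bindings.
SAMPLE_WSDL = b"""<?xml version='1.0' encoding='utf-8'?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.com/patent"
             name="PatentService"
             targetNamespace="http://example.com/patent">
  <message name="GetPatentRequest"/>
  <message name="GetPatentResponse"/>
  <portType name="PatentPortType">
    <operation name="GetPatent">
      <input message="tns:GetPatentRequest"/>
      <output message="tns:GetPatentResponse"/>
    </operation>
  </portType>
</definitions>
"""


def list_operations(wsdl_document: bytes) -> dict:
    """Map each portType name to the names of the operations it declares."""
    root = ET.fromstring(wsdl_document)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [op.get("name") for op in port_type.findall("wsdl:operation", WSDL_NS)]
        operations[port_type.get("name")] = names
    return operations


operations = list_operations(SAMPLE_WSDL)
```

A listing like this gives a test generator the set of operations to target, independent of the protocol to which they are bound.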
Web services describe a standardized way of integrating applications using e.g., the XML, SOAP, WSDL and Universal Description, Discovery and Integration (UDDI) open standards. In a typical arrangement, XML is used to tag data for an application, SOAP is used to transfer the data to the application, WSDL is used to describe services that are available for the application and UDDI is used for listing what described services are available.
The Web Services Interoperability (WS-I) organization is an open industry organization chartered to promote web services interoperability across platforms, operating systems and programming languages. The WS-I Basic Profile specification developed by the WS-I organization provides rules and restrictions that may be used to ensure interoperability of SOAP and WSDL across a variety of platforms. The WS-I Basic Profile specification puts a boundary around the WSDL, SOAP and HTTP specifications so as to ensure that WS-I Basic Profile compliant WSDL and SOAP messages over HTTP interoperate with one another regardless of the operating system, application, and programming language that may be used.
The WS-I Basic Security Profile specification provides rules and restrictions that may be used to ensure security interoperability of SOAP messaging across a variety of platforms. Further, a number of additional standards such as WS-Security 1.1, Security Assertion Markup Language (SAML), WS-Policy, WS-Trust, WS-RM, Web Services Distributed Management (WSDM), Business Process Execution Language (BPEL) and general WS-* specifications provide rich security, identity, management, process orchestration, and monitoring functionality for deploying complex web services in a Service Oriented Architecture (SOA).