1. Field of the Invention
The invention is generally directed to information technology with features of network switching, routing, proxy, and database technologies, and, more particularly, to the extraction of semantic data via a network tap that provides a (possibly incomplete) copy of traffic between two network agents, with no substantial modifications to the existing network or application infrastructure.
2. Description of the Related Art
Over the last few decades, structured database technology has become a critical component in many corporate technology initiatives. With the success of the Internet, the use of database technology has exploded in many consumer and business-to-business applications. With the popularity of database architectures, new risks and challenges have arisen. Such risks and challenges include complex and difficult-to-identify performance issues and subtle gaps in security that can allow confidential data to be accessed by unauthorized users. Accordingly, what is needed are new, improved mechanisms for identifying these performance issues and closing these security gaps.
A large fraction of database applications use a database server that stores and indexes structured data. Clients access the database server to store, update, and query the structured data. The clients may communicate with the database server using standard networking technology, such as Transmission Control Protocol (TCP), Internet Protocol (IP), Ethernet, and the like, using various physical or virtual media. While standard protocols are generally used for the lower levels of communications with the database server, higher-level protocols are often specific to a vendor and/or client-server architecture, and may not be fully specified. Vendors may not be technically able to publish these specifications, or may choose not to publish these specifications for other reasons.
Below the application and/or database layer, a sequenced byte protocol, such as TCP or Sequenced Packet Exchange (SPX), is generally used to ensure delivery of messages between client and server systems in the face of potentially unreliable lower-level transport mechanisms. These protocols may exchange multiple packets to deliver a single byte of data. The transmission and/or reception of such packets may be asynchronous, such that the order of the packets is not necessarily the same as the order of the byte stream required by the application or database layer. These protocols are designed to work when packets are lost or corrupted between two network agents, such as a client system and server system.
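By way of illustration only, the relationship between packet arrival order and byte-stream order may be sketched as follows. This is a minimal sketch, not any particular protocol implementation: the function name reassemble and the (sequence number, payload) representation are hypothetical, and sequence numbers are assumed to start at zero and never wrap, which is a simplification of actual TCP behavior.

```python
def reassemble(segments):
    """Rebuild a byte stream from (sequence_number, payload) pairs.

    Hypothetical helper for illustration: sequence numbers are assumed to
    start at 0 and to count bytes, with no wraparound.
    """
    stream = bytearray()
    next_seq = 0      # sequence number of the next byte the stream needs
    pending = {}      # segments that arrived ahead of their turn

    for seq, payload in segments:
        pending[seq] = payload
        # Release every buffered segment that is now contiguous.
        while next_seq in pending:
            chunk = pending.pop(next_seq)
            stream.extend(chunk)
            next_seq += len(chunk)
    return bytes(stream)


# Segments arrive out of order; the byte stream is still recovered in order.
arrived = [(3, b"ECT"), (6, b" 1"), (0, b"SEL")]
print(reassemble(arrived))
```

In this sketch, a segment that arrives early is simply held until the bytes preceding it have been delivered, which is the property the application or database layer relies upon.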
Many network sessions may be established between a server (e.g., database server) and one or more client systems. Generally, each session operates asynchronously with respect to the other sessions, and the data and control information from a plurality of sessions may overlap temporally. In addition, multiple encapsulation technologies and physical layer technologies may be used between a server and its clients.
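For illustration, the temporal overlap of sessions may be pictured as an interleaved packet sequence that must be separated per session before any one conversation can be interpreted. The following is a minimal sketch under stated assumptions: the function name demultiplex, the (source, destination, payload) tuple layout, and the example addresses and ports are all hypothetical, and encapsulation layers are ignored.

```python
from collections import defaultdict


def demultiplex(packets):
    """Group interleaved packets by session.

    Each packet is a hypothetical (source, destination, payload) tuple,
    where source and destination are (address, port) pairs. Both directions
    of a conversation map to the same session key.
    """
    sessions = defaultdict(list)
    for src, dst, payload in packets:
        # Order the endpoints so the key is independent of direction.
        key = (min(src, dst), max(src, dst))
        sessions[key].append((src, payload))
    return dict(sessions)


# Two client sessions to one server, overlapping in time (example values).
server = ("10.0.0.9", 1521)
packets = [
    (("10.0.0.2", 40001), server, b"request A"),
    (("10.0.0.3", 40002), server, b"request B"),
    (server, ("10.0.0.2", 40001), b"response A"),
]
for key, items in demultiplex(packets).items():
    print(key, [payload for _, payload in items])
```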
There are a number of network-tapping technologies that can be used to extract a copy of the packet stream flowing between two or more network agents. However, a network tap attempting to observe an exchange will not witness an exact copy of the traffic as seen by either network agent. Rather, the network tap will receive a unique third-party view of the packets, which may comprise a subset or superset of the packets seen by the network agents.
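As a purely illustrative sketch of such a third-party view, the following counts segments that the tap observed more than once (a superset) and byte ranges that it never observed (a subset). The function name summarize_tap_capture and the (sequence number, payload) representation are hypothetical, and sequence numbers are assumed to count bytes from a known starting value without wrapping.

```python
def summarize_tap_capture(segments, start_seq=0):
    """Report duplicates and gaps in a tap's view of one direction of a session.

    Hypothetical helper for illustration. Returns (duplicates, gaps), where
    gaps is a list of (first_missing_seq, next_seen_seq) byte ranges.
    """
    seen = {}
    duplicates = 0
    for seq, payload in segments:
        if seq in seen:
            duplicates += 1          # observed more than once by the tap
        else:
            seen[seq] = payload

    gaps = []
    expected = start_seq
    for seq in sorted(seen):
        if seq > expected:
            gaps.append((expected, seq))   # bytes the tap never observed
        expected = max(expected, seq + len(seen[seq]))
    return duplicates, gaps


# The tap saw one segment twice and missed the bytes at offsets 3 to 5.
captured = [(0, b"SEL"), (0, b"SEL"), (6, b" 1")]
print(summarize_tap_capture(captured))
```

Note that this sketch can only detect a gap that is followed by later observed data; bytes lost at the end of a capture leave no such evidence.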
While many uncertainties as to the encapsulation, session multiplexing, order, and validity of request data may be resolved using data embedded in the underlying protocols and transports, these mechanisms are designed to operate at either end of a network conversation (i.e., at the network agent). Furthermore, this embedded data is not able to fully resolve uncertainties in the actual content of a specific network conversation. In addition, in commonly used network architectures, the packet stream captured by a network tap is frequently damaged in some way. Moreover, the application protocols (e.g., Oracle's client-server protocol) are often not publicly specified. Thus, conventionally, it is impossible to derive full details of operations between a server and its clients using a network tap.