This disclosure relates generally to network forensics, and in particular to utilizing a parallel pattern matching processing unit to perform pre-classification of packets in a packet stream for real time classification and recording of network traffic.
The field of network forensics involves, among other things, various methods of discovering and analyzing the contents of packetized data transmitted over a network. Identifying particular forms of data (e.g., a Moving Picture Experts Group (MPEG) file, a voice over Internet protocol (VoIP) session, etc.), as well as the content of a particular form of data (e.g., the actual audio file encoded pursuant to the MPEG standard, the audio related to the VoIP session, etc.), transmitted over a network can be a time consuming and computationally intensive task. Such identification may be particularly time consuming and computationally intensive given the rate and volume of packets that may be transmitted over a network.
If packets are recorded for subsequent examination or searching (as is practiced in network metric, security and forensic applications), then identifying a particular form of data and extracting the contents of the data may involve first searching an entire database of packets, possibly tens, hundreds, or more terabytes of data, to identify any data possibly conforming to the search request. Such a search may not be conducive to practical, real time discovery and analysis of data types and contents of interest.
Packets may be analyzed and indexed in a database as they are being recorded. By forming indices based on packet characteristics, metadata, and locations where the packets are recorded, identifying and reporting on a particular instance of data and extracting the contents of the data may be performed by searching the indices instead of the entire database of packets. This approach may reduce the time and computation required to search. However, this approach may also increase the time and computation required to record the packets.
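By way of illustration only, the trade-off between indexed recording and full-database searching may be sketched as follows. This is a minimal in-memory sketch under stated assumptions, not an implementation described by this disclosure: the class name `PacketStore`, its methods, and the metadata fields `protocol` and `kind` are all hypothetical, and a list stands in for the packet database.

```python
from collections import defaultdict


class PacketStore:
    """Hypothetical stand-in for a packet recorder that indexes as it records."""

    def __init__(self):
        self.records = []               # recorded packets, in arrival order
        self.index = defaultdict(list)  # (field, value) -> list of record locations

    def record(self, payload, metadata):
        """Record one packet and index it by each metadata field (extra work per packet)."""
        location = len(self.records)
        self.records.append(payload)
        for field, value in metadata.items():
            self.index[(field, value)].append(location)
        return location

    def search_index(self, field, value):
        """Look up matching packets through the index, without scanning every record."""
        return [self.records[i] for i in self.index.get((field, value), [])]

    def search_full(self, predicate):
        """Scan the entire store, as would be needed when no index exists."""
        return [p for p in self.records if predicate(p)]


# Example usage with made-up payloads and metadata.
store = PacketStore()
store.record(b"rtp-audio-1", {"protocol": "udp", "kind": "voip"})
store.record(b"http-get", {"protocol": "tcp", "kind": "web"})
store.record(b"rtp-audio-2", {"protocol": "udp", "kind": "voip"})

voip_packets = store.search_index("kind", "voip")
```

In this sketch, `search_index` touches only the locations listed for the requested key, while `search_full` must visit every recorded packet. The per-packet loop in `record` is the additional recording cost noted above.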
Recording packets of a network with a high rate and volume of such packets can be a time consuming and computationally intensive task. Real time analyzing and indexing while recording packets may not be feasible given the time and computational resources needed. Even if such real time analyzing and indexing is feasible, the amount of analysis that is possible may be limited.