1. Field of the Invention
The present invention relates to a method and apparatus for lawful interception in general, and to intercepting web based messaging communication in particular.
2. Discussion of the Related Art
Lawful interception (LI) is generally aimed at capturing and analyzing as many relevant communications of a target as possible. A target can be a person, a group of persons, an institution and the like, known to the organization and possibly posing a hazard to the organization or to society. The communications preferably include incoming and outgoing communications performed by or among one or more targets. Intercepted communications traditionally included mainly analog and digital voice communications. However, as larger parts of current communications are diverted to electronic channels in general, and to web based messaging communication (WBMC) in particular, the ability to automatically detect, capture and analyze such interactions becomes critical for law enforcement institutions and agencies. WBMC refers to all forms of communication, currently known or that will become known in the future, between two or more users aimed at transmitting messages or information, which are carried out via the World Wide Web (WWW), including but not limited to web-mail, newsgroups, instant messaging, chats, forums and others. WBMC interception is considered to be one of the more important sources for LI in data networks or IP networks.
Web-based communications passively captured by a law enforcement agency generally consist mostly of generally available web pages which are of no particular interest to the agency. The agency is mainly interested in those pages that represent web based messaging communications. However, automatically identifying web pages as WBMC and analyzing them poses a challenge. WBMC can assume multiple forms, as mentioned above. In addition, every such form can employ different formats and structures. For example, two sites providing mail services can have a completely different look and feel. Additionally, each service enables a user to send messages or information to a specific user or to an open community, and to receive messages or information directed either specifically to the user or to the open community, wherein the formats for sending and for receiving messages are typically different.
Therefore, an efficient and flexible LI capability, which automatically recognizes and analyzes multiple forms of WBMC, cannot be implemented with current technologies, due to the large variety of WBMC applications, formats and protocols, many of which are proprietary.
Adding to the complexity is the fact that new applications and updates to existing applications are continuously released, making LI tools developed to cope with known applications insufficient in practice or even useless.
Yet another complexity stems from the constant and frequent changes in available WBMC services, including adding, removing, or modifying such services, or merely changing their internet addresses, expressed as Uniform Resource Locators (URLs). Thus, it is very difficult to identify a messaging communication among all the intercepted web-based communications.
Yet further complexity is caused by the different protocols, and combinations thereof, used to transmit mail messages. For example, attachments to web mail are typically sent via file download/upload mechanisms, while the message header is built locally on the receiving side by JavaScript, and the message body is HTML.
There is therefore a need in the art for a method and apparatus for enabling efficient interception and analysis of WBMC. The method and apparatus should be able to cope with constantly changing applications, URLs, formats and other parameters associated with WBMC services.