Discovery is a process by which two parties in a legal proceeding exchange documents and exhibits according to specific rules of procedure. In a typical legal proceeding, a party (“requesting party”) may, pursuant to procedural rules, send a document request to another party (“responding party”) to compel the responding party to produce documents that contain any of many categories of subject matter. The responding party reviews potential documents, identifies documents containing any of the enumerated categories of subject matter, and produces them for the requesting party. Historically, the responding party reviewed paper documents, copied responsive documents, and produced them for the requesting party. Information technologies have led companies to generate a large volume of electronic documents, and thus it has become necessary to use an Internet review platform for the review of documents. In a typical document review, the representing law firm or the client retains a data company to provide data hosting service and retains contract attorneys (“the reviewers”) from an employment agency to review documents on terminal computers (“client computers”). The reviewers can access the server of the review platform and download documents one by one for review. Document review is also frequently conducted for internal investigations.
A. Needs for Translations
If a case containing foreign language documents is to be decided in a forum using another language (“target language”), some documents must be translated into the target language for several purposes. The first use is case preparation. In the forum country, a majority of attorneys may be unable to read the source language and thus cannot read the original documents. They have to rely upon translations to prepare cases. When a case has more than a million documents, the cost for this component is very high. This need can be greatly reduced by retaining attorneys who can read and speak the source language. If the whole litigation team is able to understand the foreign language, they would not need to have every important document translated for their use.
The second common use is using translations as discovery materials. In the United States, both parties need to produce relevant documents accompanied by translations. The scope and number of translations depend upon whether the counsel for the opposing party can read the source language although they may be subject to parties' agreement. If the required documents are in a foreign language, the party may be required to produce translations.
Another use is as supporting evidence in pretrial procedures. If there is a dispute over the scope of discovery, a party might have the dispute decided by motion. In such a motion, the parties will attach translations to exhibits so that the hearing officers can look at the documents in deciding the motion. Exhibits and translations may be used for deciding other side issues such as a temporary restraining order, a special appearance for challenging service of process, motions on all kinds of pretrial issues, a motion for summary judgment, and a motion for directed verdict or judgment notwithstanding the verdict. For a large case involving complex facts, a summary judgment motion may be backed up with exhibits consisting of thousands of documents and translations.
Translations are used to assist counsel in negotiating a settlement. In settlement negotiations, the two sides argue why the settlement should be in their favor. To support their positions, they produce original documents together with translations. Counsel who does not understand the source language has to rely upon the attached translations in evaluating settlement proposals. Evidently, the party that can produce favorable exhibits is in a better position to bargain for a favorable settlement.
Another use is aiding the hearing officers or jury in deciding the case. Most judges do not understand foreign languages, but they need documentary evidence when they decide cases. Documents and translations are also used during the trial on the merits. The documents are admitted as evidence. The hearing officers review the exhibits and decide the case. If a case has millions of documents, the party may need to translate those that are important to the issues in dispute.
If a party loses the case and takes an appeal, the party needs to prepare an appeal brief together with an appendix. The appendix contains exhibits comprising mainly original documents and translations. The appendix may require additional documents that are not in the original trial exhibits. After the appeal is decided, the case may be affirmed or remanded. If the case is remanded for deciding additional issues, additional exhibits with translations may be required.
B. Complex and Unique Document Compositions
Client companies make different products and sell different services. Thus their documents contain completely different substance. Despite their differences, their documents contain (1) information on a large number of projects, services, and processes, (2) unusual codes or casual names of products, services, and materials, (3) a large number of players such as employees, customers, attorneys and consultants, and other parties, (4) technical subjects of varying complexity, (5) jargon, abbreviations, and acronyms, (6) assumptions understood only by those who were involved in the underlying transactions, (7) incomplete person names, place names, and discussion topics that can be understood only by those in the discussion group, (8) protected compressed and zipped files, (9) trade secrets protected by passwords, and (10) substance in one or more foreign languages. For any or all of these reasons, document review is not an easy task.
Corporate documents contain a large number of duplicates. Duplicate documents arise from document distribution practices, archiving, file backups, drive backups, media backups, and server backups. A document may be distributed to several, tens, or hundreds of employees. The same documents may be amended and again sent to a large number of employees. Each of the documents in an individual employee's possession may again be backed up in many ways. Certain documents may have thousands of copies while others may have only tens to hundreds of copies. The large number of documents is primarily responsible for the high cost.
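As an illustration only, the following minimal Python sketch shows how byte-identical copies, such as those produced by backups, could be grouped by a content hash. It is an assumption made for this example and not a method described in this document; the function name `group_exact_duplicates` and the in-memory mapping of document IDs to file bytes are hypothetical. Copies that were amended, even slightly, produce a different hash and are not caught by this approach.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(documents):
    """Group document IDs whose raw bytes are identical.

    `documents` maps a hypothetical document ID to the file's bytes.
    Returns a list of ID groups, one group per distinct content.
    """
    groups = defaultdict(list)
    for doc_id, content in documents.items():
        # Identical bytes always yield the same SHA-256 digest.
        digest = hashlib.sha256(content).hexdigest()
        groups[digest].append(doc_id)
    return [sorted(ids) for ids in groups.values()]


if __name__ == "__main__":
    # Made-up sample pool: two identical copies and one amended copy.
    pool = {
        "mail-001": b"Q3 pricing memo, final",
        "backup-417": b"Q3 pricing memo, final",
        "mail-002": b"Q3 pricing memo, final (amended)",
    }
    for group in group_exact_duplicates(pool):
        print(group)
```

In this sketch the amended copy lands in its own group, which mirrors why amended and redistributed documents keep the review pool large even after exact copies are removed.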
Due to the large number of software applications for creating documents and complex file histories, some documents cannot be properly processed for review. Documents cannot be opened due to (1) lack of a supporting application, (2) association with a wrong application, (3) missing necessary components, (4) being linked to an unavailable file, (5) incorrect encoding of foreign language text, (6) corrupted file structure, (7) infection by a virus, and (8) partial loss of information or damage to the file. It is easy to name potential causes, but often difficult to ascertain whether a document has a real technical problem. When a great number of documents cannot be opened, the consequences are severe. The only possible solution is to find the original documents. Documents incorrectly marked as having a technical problem may be routed back to reviewers for another round of review. Two or three rounds of attempts can incur a great deal of cost.
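One of the listed causes, incorrect text encoding, can be probed mechanically to a limited degree. The following Python sketch is an assumption for illustration, not part of any review platform described here: the function name `first_working_encoding` and the candidate list are hypothetical, and the order of candidates matters because the first clean decode wins.

```python
def first_working_encoding(raw, candidates=("utf-8", "gb18030", "shift_jis")):
    """Return the first candidate encoding that decodes `raw` without error.

    A clean decode does not prove the encoding is right: several
    encodings can accept the same bytes, so the result is only a hint.
    Returns None when every candidate fails.
    """
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None


if __name__ == "__main__":
    # Made-up sample: the same Chinese word stored under two encodings.
    print(first_working_encoding("中文".encode("utf-8")))
    print(first_working_encoding("中文".encode("gb18030")))
```

A check of this kind can only suggest that a document is readable under some encoding. It cannot confirm that the decoded text is what the author wrote, which is consistent with the difficulty of ascertaining whether a document has a real technical problem.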
Encoding problems in foreign language documents add another layer of complication. Many large corporations do business worldwide. Their corporate documents are written in different languages, depending upon the geographic region where the documents are created and who the intended readers are. Some documents are written in foreign languages, others contain foreign language text between lines, and yet others contain English translations. Some documents may be written in more than one language with internal cross-references. Such documents are very difficult to review. In practice, they go through several rounds of review. For obvious reasons, however, this kind of document cannot be properly reviewed in several rounds, one for each of the languages. If such documents are important, they are translated into the target language.
Password protection of documents adds further complications. Password-protected documents often appear among the documents of software companies and technology companies. This class of documents can significantly reduce review speed. It is often difficult or even impossible to find the right passwords. In many cases, the reviewers treat such documents as trash or technical documents. The parties in civil litigation may reach an agreement on how to treat those documents. Companies now use zip files to send documents by email. A zip file may contain tens to hundreds of files. Some zip files contain database dump files, a large number of forms and templates, all files for a complete project, and routine spreadsheets. An attempt to deal with the password problem can consume a great deal of time. An operation from file selection and downloading to unzipping the file can waste as much as 10 minutes per document. If a reviewer is still unable to open a document, the reviewer waits for help or repeatedly tries the same operations. The time wasted on this problem is difficult to assess. Documents routed to a wrong destination will be routed back and forth without final resolution.
C. Litigation Dynamics
Document production, which includes translation, is further complicated by unpredictable but routine changes inherent in litigation. All current review models lack the ability to deal with changes. For a small case handled by a single lawyer, any change to any aspect of a review production is already a headache. In a massive document review project, any change means a huge cost and a great deal of delay.
Constant and routine changes in litigation are in a head-on clash with the constraints of the review model. In many cases, even if the client can pay a huge cost, there is simply no time to make the required changes. Litigation in the adversary system is by nature a contentious game, and the purpose of making changes is to increase the chance of winning and reduce the chance of losing. However, everything else in the document production model works against any change. One of the biggest impeding factors is the large number of documents. Naturally, all law firms need to change review instructions concerning review standards, request definitions (specification definitions), coding rules, and methods of handling documents. In reality, discovery is a trial and error process that is characterized by changes, adjustments, fixes, quality checks, corrective reviews, and special reviews. In situations where a change cannot be applied to only a portion of the documents due to practical difficulty, the review team has to review all documents. This requires a great deal of review time. In other situations, a change may affect only a subset of documents in the review pool.
One of the many complicating factors is the number of players. For any review, the players may include client employees, litigation attorneys, project managers, document processors, staff recruiters, document reviewers, and technical consultants. One single misunderstanding by any of the players may result in an error that might require a massive corrective review. Another complicating factor is the huge amount of case information. When a change is proposed, it is impossible to foresee how the proposed change will affect documents through its direct effects or its unforeseen interactions with one or more case facts.
Finally, many changes, even though they are purely litigation decisions, cannot be successfully implemented without the support of review platforms. When a proposed task is to find and review a set of priority documents in order to meet a deadline, one question is whether the review platform can competently identify the set of documents. Platform search capability, algorithm designs, file formats, file types, file conditions, file processing histories, and the way information is organized in the documents all affect the chance of success. Even the work habits of the reviewers may be a differentiating factor. Some reviewers may be able to successfully make a change while others may give up. Although experience may be the most valuable basis for predicting the chance of success, no one can guarantee any type of outcome in a system with too many variables. A very sound change plan may be easily defeated by an unexpected factor. If all factors could be considered independently, the problem might not be daunting. In many cases, however, a change may be impeded by a battery of main factors such as review software characteristics, Internet connection characteristics, review computer characteristics, server characteristics, file characteristics, file processing histories, reviewers' working habits, and the sizes of affected documents. Each main factor may comprise tens to hundreds of sub-level factors, and they may be intertwined with each other. This explains how a law firm can actually spend tens of millions of dollars in review fees on a typical review project.
D. Common Translation Problems and Reasons for High Costs
Translation is conducted while a plurality of reviewers review documents. The background of document review has been described in published patent documents.
When a reviewer runs across a document and determines that it is important enough to be translated, the reviewer puts a note in a comment field, places a check mark in a translation flag check box, or sends a message to a litigation attorney. The litigation attorney decides which documents should be translated into the target language. The reviewer may translate a small document immediately or conduct the translation later. A log in an Excel file or Word file may be used to track all translations.
There are many reasons for high document review costs. When documents contain foreign languages, the billing time for translations accounts for a much larger share of discovery costs. There are several causes of high translation costs. One common cause is the large number of duplicate documents that cannot be eliminated. For example, a particular document is created by distributing a draft to a group of members for review, and the drafter receives each of the edited documents for inclusion in the final version. This process can be repeated many times over several months to several years. This practice may result in hundreds to thousands of copies that are not identical. All those copies may be assigned to a large number of reviewers. While each reviewer can identify the most inclusive copy within his or her small review range or folder, there is no easy way for the whole review team to figure out which one should be translated. Tens of reviewers might tag the same documents for translation. The project manager or litigation attorney, especially one who cannot read the source language, cannot determine whether those documents are identical based upon the review notes. If some of the documents have been translated, the attorney cannot tell whether the translated documents are identical because all translations look somewhat different. This is what the inventor refers to as “term multiplicity” and “structure multiplicity”. When there are several litigation attorneys and a group of reviewers, they can routinely spend a great deal of time figuring out which documents have been translated and which need to be translated.
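The difficulty of deciding which of many non-identical copies should be translated can be made concrete with a small sketch. The Python code below is a simplified illustration under stated assumptions and is not the method of the present invention: it uses the standard-library `difflib.SequenceMatcher` to cluster similar texts, and the function names, the 0.8 similarity threshold, and the rule of choosing the longest copy as a proxy for the most inclusive copy are all hypothetical.

```python
from difflib import SequenceMatcher


def group_near_duplicates(texts, threshold=0.8):
    """Greedily cluster documents similar to a group's first member.

    `texts` maps a hypothetical document ID to its extracted text.
    `threshold` is an assumed similarity cutoff, not a recommended value.
    """
    groups = []  # each group is a list of document IDs
    for doc_id, text in texts.items():
        for group in groups:
            representative = texts[group[0]]
            similarity = SequenceMatcher(None, representative, text).ratio()
            if similarity >= threshold:
                group.append(doc_id)
                break
        else:
            # No existing group was similar enough; start a new one.
            groups.append([doc_id])
    return groups


def pick_translation_candidate(group, texts):
    """Pick the longest copy as a rough proxy for the most inclusive version."""
    return max(group, key=lambda doc_id: len(texts[doc_id]))


if __name__ == "__main__":
    # Made-up sample texts standing in for tagged documents.
    tagged = {
        "v1": "The supplier shall deliver the goods by March.",
        "v2": "The supplier shall deliver the goods by March. Penalty applies.",
        "x": "Lunch menu for Friday",
    }
    for group in group_near_duplicates(tagged):
        print(group, "->", pick_translation_candidate(group, tagged))
```

A comparison of this kind works on the source text only, so it does not help an attorney who is comparing two finished translations of the same original, which may differ in wording and structure.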
Good translations require a great deal of time. Translation cost in terms of dollar amount per word can vary many fold, depending upon the nature of the original documents, accuracy requirements, the translator's knowledge, and translation philosophy. A decent translation of a document requires more knowledge than what is required to paraphrase text in the document and far more than what is required to code the document. Translation of legal documents is never a job that anyone can do. When accuracy is not required and the subject of the document is very general, machine translation such as Google or Bing translation may be good enough, and the reviewer just takes a look at the machine-translated text to ensure that it does not add harmful text. If accuracy is not required and the reviewer has a good background, the reviewer can translate the document at nearly typing speed and still achieve moderate accuracy. When the document is highly technical and the reviewer does not have the relevant knowledge, the reviewer has to struggle and still cannot deliver the required accuracy. When the document contains highly technical matters and also requires the highest accuracy, the document may demand the highest skill levels and require a great deal of time. If the original document is highly technical, is concerned with critical issues, and contains some problems (such as bad handwriting, casual notes, missing pages, and errors and omissions), translating it becomes a research project. The translation may require several rounds of amendments. A challenge by the opposing party may require further amendment. When the document pool contains a large number of hot, technical, and troublesome documents, and there is no good method for managing the workflow for all attorneys and reviewers, translation costs can be further increased.
One of several additional factors affecting translation costs is the translation method. Translation methods include verbatim translation and translation by meaning. Within each of the methods, there are still various subtle differences in translation methodology. Verbatim translation can be performed much faster because the translator does not need to fully understand the content. It can be done by sentence-by-sentence replacement according to different grammar rules. The risk is that the resulting translation may lose intended meanings. Translation by meaning is much more difficult, especially in litigation settings. If the translator does not have knowledge of the underlying technology, the translator has to learn the subject matter during translation. If the subject matter is very complicated, this learning process can take much more time, and the translator can still make mistakes that a skillful translator would immediately recognize. There are also situations where subject matters are so complex that they are not for laypersons to translate. If the translator knows the underlying technologies, she can learn the subject matter much more quickly and has a better sense of how to avoid mistakes. The right translator is in a much better position to ascertain implied assumptions, incorporated external facts, original mistakes and omissions, and uncommon expressions in original documents.
One factor that can also contribute to the high costs is the total lack of tools for translation. When a foreign language is very large, no one can know everything about it. For example, Chinese is a very large language that has evolved for more than four thousand years. It has such a long history that it requires “translation” for readers in different eras. It has a huge number of character combinations. It is so large that it requires “translation” for readers in different technical fields. On top of that are rich cultural, social, political, historical, and geographical elements. Even personal names, location names, and company names alone can slow the workflow nearly to a halt. No one knows the official counterparts of all company names, location names, and personal names. No single person can ever know everything about such a language. The translator may run into something that requires a small research effort. If separate public Internet access is not provided, the translator may have to use phonetic translation rather than officially recognized counterparts, and may have to spend much more time fixing inconsistencies and errors after learning the better translations. If no tool is provided and the translator is unable to understand something, the translator has to consider whether it is related to an implied assumption, an incorporated external fact, an original mistake, or an unusual expression. Such attempts may help the translator find the right answer.
The tools for foreign language data entry affect translation efficiency. Data entry may be necessary for finding the right foreign documents, conducting backward translation for verification, and creating text in translations. The Windows data entry method has limited functions, and most reviewers do not like it. Each reviewer may be good at using one particular method. Their productivity in typing foreign language text may differ by hundreds of times. The data entry methods for certain Asian languages can affect data entry efficiency dramatically. If backward translation is necessary, a reviewer may be able to perform very well by using one data entry method, but cannot work at all by using another method.
The number of foreign languages in a case also dramatically affects review productivity. Some documents may be written in two or more alternating languages. Such documents will need many rounds of review. Strictly speaking, a competent review cannot be performed by multiple independent reviews because this review model is unable to ascertain combination effects. Each of the sections may be non-relevant when it is reviewed separately, but their combination may present a significant issue. Requiring one reviewer to review multiple languages is also troublesome because there is no guarantee that the reviewer is really competent to review all of the languages. If a document in many languages is the product of one single author, it is questionable whether the author is able to convey objective meanings in all the languages.
The amount of case information always affects translation costs because each of the reviewers must learn it. When the amount of information is doubled, the time spent on the learning process by all reviewers is also doubled. The large amount of case information, numerous file types, and common technical problems may be intertwined to further increase translation costs. Poor review plans, lack of background knowledge, insufficient experience, incomplete and confusing review instructions, and missing support applications on review computers are among other factors that may contribute to high costs.
Great effort has been made in the review industry to reduce discovery costs. Certain search and file elimination methods may disrupt what the inventor calls cross-document verbal context and transaction context, and make some critical documents unavailable. Such search methods will make translation tasks more difficult or force translators to make a best guess. Some computer search methods can reduce the number of documents by as much as 80%. This may reduce the number of documents to be translated. The reduced size of the document pool can reduce the total production cost, but it may reduce the accuracy of translations if it affects the verbal and transaction context or makes some translations incomprehensible.
E. Relevant Experience and Learning Process
In a typical review, reviewers start by learning basic case information. The learning process for experienced reviewers is different from that for inexperienced reviewers. All reviewers have to learn basic case facts, review instructions, and review software. Experienced reviewers can go through this process faster because they do not need to learn every detail. They only need to learn case facts and the unique or different aspects of the review procedure, background law, substantive instructions, review platform, tag structure, and coding conventions. In a second request review, experienced reviewers may already know most of the two dozen requests. They only need to learn the unique and distinctive requests, and they are familiar with most of the concepts such as market shares, sale prices, cost savings, cost and benefit analysis, and most antitrust-sensitive issues. They also know the basics of conducting responsiveness and privilege review, and thus do not need to spend time learning everything and developing new skills for applying requests to documents. They may know shortcuts for conducting relevancy analysis and privilege analysis. They are far less likely to make fatal errors at a reasonable review speed. In comparison, new reviewers have too many new things to learn. New things include case facts, review procedure, background law, review instructions, review platform features, tag structures, coding conventions, analytical methods, and handling of platform problems. They need to develop basic skills for conducting legal analysis, applying document definitions to documents, and performing complex analysis. They may make a coding error as a result of using a wrong approach in conducting legal analysis or failing to recognize important facts.
Reviewers cannot reach their full potential in every review. One reason is that they cannot master everything. Their workflow may be interrupted because they have to address less frequently encountered facts, terms, expressions, things, personal names, and place names. If a company has used two thousand attorneys, a reviewer may remember the one hundred names that appear frequently. The reviewer is unable to remember the remaining one thousand nine hundred attorney names. Whenever the reviewer encounters one of those unfamiliar attorney names, the reviewer needs to check it against a names list or figure it out from context. In addition, reviewers have to sporadically deal with issues such as illegible documents, handwritten notes, foreign languages, compressed files, missing passwords, large spreadsheets, database files, and defective encoding. This explains why their performance curves level off.
Experienced reviewers have their own peculiar “liabilities.” Due to insufficient review guidelines, experienced reviewers may import into the current project the meanings they learned elsewhere for special terms such as responsive, significance, privilege, and technical issues. Importation of different interpretation rules can directly compromise the review objective. Tagging logic and coding conventions differ from site to site, and written review manuals seldom provide sufficient details to alert the reviewers to their unique coding logic. Review manuals may contain many interpretive gaps. Experienced reviewers may fill the gaps with what they know. They might port into the current case their prior procedures, substantive definitions, interpretation rules, coding rules, and tag configurations. As a result, they might code documents contrary to site requirements. These errors are generally not the kind of errors that can pose risks to the client's cause.
In some review projects run by new associates, quality control data often reveal that experienced reviewers perform worse than new reviewers. There are several reasons for this noted “poor performance.”
The first reason is the difference in interpretation philosophy. Experienced reviewers tend to read requests more narrowly and pay more attention to substance. Thus, they exclude more documents in a document production for an opposing party. New reviewers and new associates tend to read definitions more broadly and pay more attention to the requests' literal meanings than to their substance. Experienced reviewers, especially those with a solid litigation background, may exclude documents that merely mention buzzwords without real substance. They might exclude hundreds of types of documents. Read literally, the requests squarely read on those documents. However, the documents are not the kinds of documents the request drafters would need. If one of the documents were coded as privileged, the substance in the document would be insufficient to fill a defensible log entry. By using this literal relevancy standard, the manager would regard many coding decisions as errors.
Over-inclusion of non-responsive documents is a prevalent problem under the current review models. The Department of Justice has returned documents on the ground that the production contained too many irrelevant documents. An incident like this clearly suggests that relevancy should be determined based upon document substance at least to some extent. Because of their different interpretation philosophies, new reviewers can achieve better consistency while experienced reviewers may achieve low consistency. This also explains why high school students can achieve high consistency when they are asked to code documents according to a list of definitions in a few simple steps. Young students can perform better in doing most simple manual tasks. When quality control staff also take the literal approach, experienced reviewers will be in the minority.
The second reason for devaluing review experience is that the current review model is unable to utilize the reviewers' experience and knowledge. For a corporate client conducting business in multiple industries, its manufactured products touch many fields, and so do its technologies. Corporate documents may include executives' elegant speeches, counsel's sophisticated legal analysis, sales staff's routine marketing materials, all kinds of complex secured transaction files, informal personal email, various legal instruments, hard-to-understand financial records, R&D experiment reports, and quality control test data. The backgrounds of document reviewers are as diverse as the corporate documents. The reviewers may have majored in literature, history, business administration, secured transactions, accounting, life science, physical sciences, chemical engineering, mechanical engineering, software and information technology, electrical engineering, and medicine. Under the current assignment methods, documents are processed by custodian. The same or similar documents are assigned to many reviewers randomly, just like lottery balls blown out of a drawing vent to land in review folders or ranges. Most documents in the reviewers' folders are not relevant to their experience and knowledge. In addition, they review documents out of context and thus cannot understand the special, implied, omitted, and misspelled terms that appear in abundance. Naturally, every reviewer codes documents by best guess. What they are actually doing is “classifying” documents based upon what they can understand from the documents. In this kind of cursory review, experience may be wasted.
For translation of documents for litigation purposes, the length of translation experience may have little relevance. If a reviewer lacks the technical knowledge and technical strength necessary for understanding a particular document, he is unable to produce an accurate translation.
F. Translation Performance
All foreign language cases can be classified into three types on the basis of their requirements for review accuracy: (1) low or no requirement, (2) moderate requirement, and (3) very high requirement. In certain matters, document production may be a formality matter. In some merger cases where the final combined market share is still far below 50%, a document review may be a matter of process unless there are other antitrust issues. If the documents do not contain other risky subjects, high school students and even computer algorithms could do the job. In this type of cases, translation would not matter unless it is so far away and that it hits some hot buttons. A majority of cases do require reasonable accuracy. In this class of cases, final disposition depends upon their documentary evidence. The parties win with evidence, and lose for evidence. Translations act as the critical middleman for passing case facts from original documents to the hearing officers. When both sides do not have solid evidence to back up their claims and defenses, they dispose of the case by the usual settlement. The final settlement price most probably depends upon their relative strengths of documentary evidence including translation quality. The third class of cases requires very high review accuracy. In this class of cases, the stake may be millions to billions dollars of punitive damages, triple civil damages, twenty years jail times, and even person's or company's right to exist. Those cases include securities class action, product liability action, high-profile patent infringement action, criminal prosecution, and violation of sensitive statutes such as Foreign Corrupt Practices Act and Export Control Law. When the middleman tell a wrong story or distort facts, the results can be easily imagined. The method of present invention is primarily intended for the last two classes of cases.
On some review sites, helpful information is posted on a blackboard or clipboard for sharing. This effort is intended to identify coding and translation problems. Discussion meetings may be conducted on a daily or weekly basis. This method is, however, ineffective and inconvenient. Oral communication is ill-suited to discussing subtle coding and translation issues, and cannot be used to share complex facts between reviewers. Some review sites provide a questions-and-answers forum, where the reviewers submit questions and project managers provide answers one or several days later. Sharing information on a Windows shared drive has also been practiced since the birth of the operating system itself. However, this method presents several problems. First, the arrangement does not allow multiple reviewers to write information to the same file at the same time, and the operating system may lock the file when one reviewer opens it. To avoid this problem, each of the reviewers is allocated a time slot to enter questions. This arrangement can waste a great deal of administrative time in scheduling and working around the allocated time windows. Second, such a method cannot be standardized to implement powerful functions. Different cases require totally different ways of organizing and sharing case information. Finally, there is no suitable way to ensure that all information posted is accurate and reliable. Posting a piece of wrong information for sharing may cause other reviewers to make wrong coding decisions. As a result, only project managers and litigation attorneys can answer such questions. Law firms do not want to use such a method to share elementary facts that may control coding decisions in many related documents. A questions-and-answers forum could be implemented by email, email attachments, web pages, or web page attachments. However, such a forum is seldom used, for similar reasons: it cannot be used to share elementary facts in real time, and there is no proper way to ensure data accuracy.
Translations for discovery projects inherently discourage the use of new technologies. A translation memory system, which stores translated materials for subsequent use, cannot be used due to its very high deployment cost, the risk of sharing or leaking confidential and sensitive information, and the well-known problem of recycled errors. When translation is performed by off-site vendors, translations are conducted in a context-deficient environment, inevitably resulting in massive distortions and disastrous mistakes.
G. Prior Art Translation Tasks Tracking Method
One important aspect of work flow management is tracking which documents have been translated and which documents should be translated. In a single super-lawyer model, one attorney reviews all documents and then decides which ones should be translated. This model presents no tracking issue. However, in a complex presentation model, where there are several litigation attorneys and tens to hundreds of reviewers and/or translators, managing the work flow is never an easy task. Two common factors make this seemingly straightforward task very difficult. First, most litigation attorneys are unable to read documents in the source language. Second, due to what the inventor calls “translation term multiplicity” and “structure multiplicity,” the attorneys are unable to determine whether two original documents are similar or identical by looking at their translations. An original document may end up with different translations even if the same translator did all of the translations. When different attorneys and a group of reviewers and/or translators work at different locations, the process of determining which documents have already been translated consumes a great deal of time. While law firms might use some simple tools to keep track of translations, tracking remains a trial-and-error process. It is inevitable that many similar or identical documents are translated many times, while some important documents do not get attention. Even when a project is concluded successfully, it imposes a huge burden on the managing attorneys and a huge bill on the client.
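The tracking difficulty described above can be illustrated with a minimal sketch. It is not the method of the present invention and is not drawn from any prior art tool; it only shows, under stated assumptions, why a tracking log would have to be keyed on the source-language text rather than on the translations. The names `source_key` and `TranslationLog`, the choice of SHA-256, and the sample Japanese sentence are all hypothetical and chosen for illustration only. An exact-match fingerprint of this kind catches only identical source documents, not merely similar ones.

```python
import hashlib
import unicodedata
from typing import Dict, Optional


def source_key(source_text: str) -> str:
    """Fingerprint the source-language text (hypothetical helper)."""
    # Normalize the Unicode form and collapse whitespace so that trivial
    # formatting differences do not yield different fingerprints.
    normalized = unicodedata.normalize("NFKC", source_text)
    normalized = " ".join(normalized.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class TranslationLog:
    """Hypothetical in-memory log keyed on source text, not on translations."""

    def __init__(self) -> None:
        self._translations: Dict[str, str] = {}

    def lookup(self, source_text: str) -> Optional[str]:
        # Return the recorded translation, or None if none was recorded.
        return self._translations.get(source_key(source_text))

    def record(self, source_text: str, translation: str) -> None:
        # Keep the first recorded translation for a given source document.
        self._translations.setdefault(source_key(source_text), translation)


log = TranslationLog()
doc_a = "契約は来月更新される予定です。"
doc_b = "契約は来月更新される予定です。 "  # same text, trailing whitespace

first = "The contract is scheduled to be renewed next month."
second = "The agreement will be renewed next month."  # equally valid rendering

log.record(doc_a, first)

# Comparing the two translations suggests two different documents...
print(first == second)
# ...while the source fingerprint shows doc_b was already translated.
print(log.lookup(doc_b))
```

In this sketch the two English renderings differ word for word even though they come from the same source sentence, which is the situation the attorneys face when they can read only the translations.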
Any translation is subject to commonly known problems, including imperfections, approximations, unavoidable distortions, and human errors. These problems may have a sufficiently serious impact on the outcome of litigation. There are no known methods for addressing translation term multiplicity and structure multiplicity.