Historically, a key problem in protecting copyrighted works online has been that, even if the user is forced to read something using a particular program, the letters on the screen can always be captured by the user "screenscraping" and then reusing the material without the publisher's knowledge. As a result, publishers are often afraid of providing online information for fear that it will be stolen and reused. Computer scientists give talks with phrases like "the nationwide file system," and publishers fear that this means that everyone in the country, or perhaps the world, will be able to access one electronic copy of their material, and as a result, the first sale will be the only one.
Further, publishers remember the collapse of the software game market in the early 1980s due to illegal copying, and are aware of the software piracy problems in many countries. The Software Publishers Association estimates that piracy costs the software industry $15 billion per year. See, for example, Elizabeth Corcoran, In Hot Pursuit of Software Pirates, Washington Post, Aug. 23, 1995, at F1. In some countries, more than 95% of the copies of software in use are illegally made. Worse yet, others are re-exporting illegal copies, cutting into the software market in places that have made great progress at eliminating locally made illegal copies. Book publishers are afraid of the same thing happening to them and are looking for a technological fix.
Traditionally, the large databases have avoided the worst of these problems by their method of operation. Users of a system like NEXIS or Dialog may well be able to capture a screenful of output, representing a single article or abstract; but they are not likely to be able to find somebody else who wants that particular item and is willing to pay for it. Thus, piracy of such a system, which provides only the tiniest fraction of its content on each interaction, is not an insuperable problem.
Book publishers, however, who would like to deliver an entire book online, do not feel that same confidence. They know that an undergraduate who has purchased the right to a textbook for one of his classes, for example, has easy access to his fellow students who also need the same textbook, and they fear unauthorized duplication. Statistics on library theft, for example, would not reassure them as to the honesty of current undergraduates. These concerns for online display are also applicable to display provided by a compact disc read-only memory (CD-ROM).
The techniques of "digital watermarking" or "steganography" are often recommended to deal with this problem. These methods involve the concealment of a special code within a digital object; this code does not interfere with reading or viewing the object, but can be used to track copies. See N. Komatsu and H. Tominaga, A proposal on digital watermark in document image communication and its application to realizing a signature, Electronics and Communications in Japan, Part 1 (Communications), vol. 73, no. 5, 1990, at 22-23.
With watermarks, each copy sold is labeled with a different identification number, and illegal copies can thus be tracked back to the original purchaser so that legal remedies may be sought against that purchaser. However, these codes may be easily removed, and they may be hard to insert. See K. Matsui and K. Tanaka, Video-Steganography: How to Secretly Embed a Signature in a Picture, Technological Strategies for Protecting Intellectual Property in the Networked Multimedia Environment, January 1994, at 187-206. Unfortunately, even if the proposed law against protection-breaking software passes, it is going to be hard to outlaw low-pass spatial Fourier transforms, which would attack many of the proposed picture labeling schemes. Further, it is easier to find spare places to put extra bits in a picture (where low-order detail can be adjusted with little notice) than in text, and thus the watermark method has little use with text.
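The labeling scheme and its fragility can be illustrated with a minimal sketch. It assumes a toy "picture" that is a flat list of 8-bit gray values, a 16-bit purchaser number, and a three-sample moving average standing in for a spatial low-pass filter (a crude substitute, not a Fourier-domain filter). The function names `embed_id`, `extract_id`, and `box_blur` are hypothetical and do not come from any of the cited schemes.

```python
def embed_id(pixels, purchaser_id, n_bits=16):
    """Hide purchaser_id in the lowest bit of the first n_bits pixels."""
    if len(pixels) < n_bits:
        raise ValueError("image too small for the identifier")
    marked = list(pixels)
    for i in range(n_bits):
        bit = (purchaser_id >> i) & 1
        # Clear the low-order bit, then set it to the identifier bit.
        marked[i] = (marked[i] & ~1) | bit
    return marked


def extract_id(pixels, n_bits=16):
    """Read the identifier back from the low-order bits."""
    value = 0
    for i in range(n_bits):
        value |= (pixels[i] & 1) << i
    return value


def box_blur(pixels):
    """Crude low-pass filter: average each pixel with its neighbors."""
    n = len(pixels)
    out = []
    for i in range(n):
        window = pixels[max(0, i - 1):min(n, i + 2)]
        out.append(sum(window) // len(window))
    return out


# Toy image: 64 pixels of uniform mid-gray (an assumption for illustration).
image = [128] * 64
marked = embed_id(image, 0xBEEF)

# The mark changes each pixel by at most one gray level.
largest_change = max(abs(a - b) for a, b in zip(image, marked))

recovered = extract_id(marked)            # identifier read from the marked copy
smoothed = extract_id(box_blur(marked))   # identifier read after filtering
```

In this sketch the identifier survives an exact copy but is no longer recoverable after the smoothing pass, even though the smoothed picture differs from the original by no more than one gray level per pixel. That is the sense in which ordinary image-processing operations defeat simple low-order-bit labels.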
The most serious difficulty with digital watermarking for those who wish to display text is that it is particularly hard to find a way to put such codes in ASCII text. For this reason there have been suggestions that publishers wishing to send ordinary text should code it as bitmaps and send those despite their greater bulk. See M. Lesk, How Can Digital Information Be Protected, Research Challenges in the Economics of Information, CNRS, July 1993. Perhaps the most imaginative solution is that of J. T. Brassil, et al., at Bell Laboratories, which used such techniques as adjusting the space between letters and words in PostScript to hide extra codes. If the user copies the exact bitmap, this will permit tracking of the copy. See J. T. Brassil, S. Low, N. Maxemchuk, and L. O'Gorman, Marking of Document Images with Code-words to Deter Illicit Dissemination, Proc. INFOCOM 94 Conference on Computer Communications, 1994, at 1278-87.
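The idea of hiding a code in spacing can be sketched in plain text, under loudly stated assumptions: the sketch below is not the Brassil et al. method, which shifts letters and words by small amounts in a rendered page image, but a coarse analogue in which one space between words stands for a 0 bit and two spaces stand for a 1 bit. The names `embed_bits` and `extract_bits` are hypothetical.

```python
import re


def embed_bits(text, bits):
    """Encode bits in the gaps between words: one space = 0, two = 1."""
    words = text.split()
    if len(bits) > len(words) - 1:
        raise ValueError("not enough word gaps to hold the code")
    out = [words[0]]
    for i, word in enumerate(words[1:]):
        gap = "  " if i < len(bits) and bits[i] else " "
        out.append(gap + word)
    return "".join(out)


def extract_bits(marked, n_bits):
    """Recover the first n_bits by measuring each run of spaces."""
    gaps = re.findall(r" +", marked)
    return [1 if len(gap) > 1 else 0 for gap in gaps[:n_bits]]


sentence = "the quick brown fox jumps over the lazy dog"
code = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_bits(sentence, code)
exact_copy_bits = extract_bits(marked, len(code))

# Re-flowing the text (or retyping it) collapses every gap to one space.
normalized = " ".join(marked.split())
normalized_bits = extract_bits(normalized, len(code))
```

An exact copy of the marked text yields the original code, while the whitespace-normalized copy yields all zeros and is identical to the unmarked sentence. This mirrors the limitation noted above: the mark permits tracking only if the user copies the exact representation.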
Digital watermarking also does not actually prevent copying; it merely makes it possible to track the source of an illegal copy back to the first purchaser. However, since the Business Software Alliance estimates that 90% of illegal software use in the United States is unorganized and individual, any legal action against the copiers may well be expensive and unrewarding. See Steve Lohr, Pirates Are Circling The Good Ship Windows 95, New York Times, Aug. 24, 1995, at D6. It is for this reason, of course, that the proposed White Paper on "Intellectual Property and the National Information Infrastructure (NII)" suggests that online service providers be responsible for policing copyright violation; however, it remains to be seen whether this will be enacted. For the moment, legal techniques are expensive when used against a myriad of individuals, many of whom may have small financial resources.
Another technique used in the software industry to control illegal use of software is the "dongle," a special-purpose hardware device that must be present on the machine involved for the software to work. These devices are relatively cheap and prevent software from running on a different processor. However, these devices meet some consumer resistance since they prevent someone from moving their work quickly from one machine to another (as is of course their intent) and also just seem to produce a higher level of hassle for users than software alone.
Although transmitting bitmaps and controlling the software and/or hardware that is used to display them help prevent people from copying the material they access, these methods do not provide a complete solution. The user can always just take a screen dump, capture the bits off the screen, and run these bits through an optical character recognition (OCR) program. The publisher will then be thrown back on legal remedies for copyright violation.