Modern multi-tasking computers provide a variety of user interfaces for controlling multiple application programs and system functions which operate simultaneously. Some of the most widely used multi-tasking computer systems are personal computers (“PC”) running a multi-tasking operating system (“OS”) such as International Business Machines' (“IBM”) OS/2™ or AIX™, Microsoft Windows™, and Apple Computer's MacOS™. Other operating systems may be used with personal computers as well as with larger, enterprise-class computers, such as UNIX, Sun Microsystems' Solaris™, Hewlett Packard's HP-UX™, and the “open source” LINUX. Smaller computing platforms such as hand-held computers, personal digital assistants (“PDA”), and advanced wireless telephones may run operating systems targeted for such hardware, including Palm Computing's PalmOS™ and Microsoft's Windows CE™. Additionally, there are many “proprietary” and less widely used computing platforms and operating systems which also allow users to control and run multiple programs and system functions simultaneously.
Many of these systems will use tabs, icons, windows, frames, pages and special key combinations to allow a user to switch between user interfaces (“UI”) for each program and system function which is being executed, or to start or stop the execution of a program or system function. For example, in a personal computer running MS Windows™, the user may first start a web browser program using any of several methods (e.g. double clicking an icon on the desktop, selecting the program from a Start Programs list, operating a “hot key”, etc.), and then may start a document editor program using similar methods. Each program establishes a user interface such as its own “window”. The user can then control a program by selecting its window using one of several available methods, such as selecting a button or icon on a command bar, activating a “task list” and selecting a program, etc. As a result, a user can start and run many programs simultaneously, periodically switching between their user interfaces to accomplish work or entertainment tasks as needed. Other computing systems provide similar basic user control capabilities, albeit with a variety of user controls to switch between programs and system functions.
Users often wish to copy or transfer information or “content” from one program or system function to another. For example, a user may be preparing an invoice for a client using a word processor program, but may also be simultaneously using a database or spreadsheet program to perform various calculations. Using “copy and paste” functions of the application programs and the operating system, the user may select information from a source program (e.g. the spreadsheet), and “paste” it into the destination program (e.g. the invoice being edited). Such a process is so commonplace in computer users' daily lives that it is rote, although each user may know several sequences of actions for several computers which he or she commonly uses (e.g. one process on his home PC, another on his PDA, and another on his networked terminal at work). These memorized methods may typically include several steps of clicking on icons, dropping down lists, highlighting information, and using navigation controls within program UI's.
For example, turning to FIG. 1, a “windows” style user interface is depicted to illustrate a process of “copying” information from a web browser program to a word processor file via a “clipboard” memory. In this system, each program provides a window (2, 3, 104) which can be closed (9, 9′) to end the program, maximized (8, 8′) to view the full UI for that program, or minimized (7, 7′) to leave the program running but deactivate the UI (e.g. clear the UI window from the screen). In this example, these controls are located in a command bar (4, 4′) along the top of the UI window, but many other variations are known in the art.
Each UI window also typically has navigation controls such as left panning (15, 15′), right panning (13, 13′), and horizontal scroll (14, 14′) controls, as well as up panning (10, 10′), down panning (12, 12′), and vertical scroll (11, 11′) controls, for viewing areas of information and content not completely viewable in the UI. Information, icons, text, graphics, etc., are shown or displayed within (16, 18) the UI window according to the scroll and panning control settings. More recently, the term “content” (16, 18) has been used to collectively refer to all types of information which may be displayed or presented in a user interface, including but not limited to text, graphics, still images, animated images, video, audio, and hyperlinks.
Now suppose for the purpose of our example, the user has started a word processing program which provides a first UI window (2), and a web browser which provides a second UI window (3). Also suppose that the user is researching information on the Internet using the web browser while authoring a paper which is being edited simultaneously using the word processor.
In this example, the user has found information (19) at a hypothetical web address (17) that he or she wants to “quote” in the paper. So, the user must first move the cursor (104) in the word processor to select an insertion point for the information, then must switch to the web browser UI, select the text (19) in the source content, operate a “copy” command in the web browser UI which copies (101) the content into a buffer (100) such as a “clipboard”, switch back to the word processor UI, and operate a “paste” or “insert” command, which results in the copied content (19) being inserted into the destination document at the point of insertion (103). The user can repeat this process for many different program UI's (106).
In some software and hardware configurations, the copy buffer may be provided within a suite of application programs which are “tightly coupled” or related. Such suites cooperate with each other in ways not possible with software programs provided by differing suppliers. In many cases, however, the operating system provides a buffer function which is generally accessible by all programs, such as the clipboard in the MS Windows™ operating system.
Also, in some situations, the original content with its original format may not be acceptable to the destination program, and as such, a specialized paste or insertion function (105) may be provided by the destination program or operating system which converts the content to a form usable by the destination program. For example, text copied from a web page may include color, size, font, style, and hyperlink reference information embedded in the base Hyper Text Markup Language (“HTML”) of the source web page. However, not all word processors are able to interpret all of these special codes and identifiers, so a “paste as plain text” option may be provided by a converter or translator (105) function.
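As a minimal sketch of what such a converter (105) might do, the following Python fragment strips HTML markup from copied content using the standard library's `html.parser` module. A real “paste as plain text” implementation would also need to handle scripts, whitespace normalization, and malformed markup; this sketch is illustrative only.

```python
from html.parser import HTMLParser

class PlainTextExtractor(HTMLParser):
    """Collects only the character data, discarding markup such as
    color, font, style, and hyperlink tags."""
    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

    def text(self):
        return "".join(self.parts)

def paste_as_plain_text(html_fragment):
    # Strip HTML markup from copied content before insertion
    # into a destination that cannot interpret it.
    extractor = PlainTextExtractor()
    extractor.feed(html_fragment)
    return extractor.text()
```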
So, to illustrate the complexity and tedious nature of such ordinary operations, we present the steps in full to accomplish this example scenario of simply transferring a block of formatted text from a web page to a word processor program, starting from a point where the user is editing the destination document in the word processor:
        (a) navigate to the insertion point in the destination document using the word processor UI window controls (e.g. multiple clicks on scroll, panning or page up/page down keys);
        (b) optionally select text or content in the destination document which is to be replaced;
        (c) switch to the web browser UI window (e.g. click on an icon in a task bar, activate a task list and pick a running web browser program, etc.);
        (d) navigate in the web browser UI window to find the text or content desired to be transferred into the document (e.g. use panning, scrolling, or page up/page down keys);
        (e) select the source content or text (e.g. click-and-drag over the content to highlight it);
        (f) transfer the content to a copy buffer (e.g. click on the “Edit” command, select the “copy” option, or type Alt-E, Alt-C);
        (g) switch back to the word processor UI window (e.g. click on an icon in a task bar, activate a task list and pick the running word processor program, etc.); and
        (h) operate a “paste” command in the word processor UI window (e.g. click on the “Edit” command, select the “paste” option, or type Alt-E, Alt-P).
Each of these operations may actually require several steps (clicking, scrolling, selecting, typing, etc.), so this minimal process may represent 7 to 25 actual user actions. This process must be repeated for each block of text or content to be transferred from multiple program UI windows (106), and additional steps may be necessary to achieve a “special paste”, as described above. Also, if the same text or content is to be inserted into the destination document or file at multiple locations, the last operation of this process (h), along with some navigation actions, must be performed by the user for each additional location.
As a result, consolidating information from multiple sources of information may be extremely tedious, frustrating, and tiresome using the currently available methods and apparatuses provided in such computing systems. Some systems may provide notably more “user friendly” or intuitive methods, while other systems are much more difficult and “clunky” to use.
Turning now to FIG. 2, this type of process is generalized. Starting at a point or time (21) when the user is actively working with the destination program UI, the user must navigate (22) within the present document, file, or other computer resource to a point where the content insertion is to be made, including selecting any content which is to be replaced. Then, the user must switch (23) to the UI of the first source of information, navigate (24) to the first source content to be transferred, select that content, and operate (25) a copy or cut control in the first source UI.
Next, the user must switch (26) back to the destination UI, and operate (27) an insert or paste command in that UI. If (28) the user wants to insert or paste that content into multiple destinations, the user must navigate (29) to each destination and operate (27) the paste or insert command in the destination program UI, until all insertions have been made for that source information.
If (200) the user desires to transfer information from other points in the same source, or from other sources, the user must repeatedly switch (201) to a source UI, navigate to a source content point, select source information, operate (25) a copy or cut operation, switch (26) back to the destination UI, and paste or insert (27) the content, until all information has been transferred.
Implied in this generalization of the process, but not shown in detail, are multiple user actions for each general step. Additionally, conversion of the content may be necessary, which requires further user actions (e.g. the “paste as plain text” example).
So, it is possible that in the course of authoring a paper using a word processor and information from several sources, the user may have to perform hundreds of tedious actions, commands, selections, navigation operations, etc.
In this paradigm, certain conventions have evolved which only moderately simplify or reduce the burden of such operations. For example, performing a “cut” operation usually deletes the selected source content from the source file and places a copy of it into the transfer buffer, sometimes overwriting the current contents of the transfer buffer. A “copy” operation typically leaves the selected information unchanged in the source and only places a copy of the information in the transfer buffer. Additionally, in the destination UI, a “paste” or “insert” command may copy the contents of the transfer buffer to a selected point in the destination document or file, leaving a copy in the transfer buffer for additional pastes or insertions.
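These conventional semantics can be modeled in a few lines. The following Python sketch is purely illustrative of the cut/copy/paste conventions described above; it does not reflect any particular operating system's actual clipboard API.

```python
class TransferBuffer:
    """Minimal model of conventional clipboard semantics."""
    def __init__(self):
        self.contents = None

    def cut(self, source, start, end):
        # "Cut" removes the selection from the source and
        # overwrites the buffer with it.
        self.contents = source[start:end]
        return source[:start] + source[end:]

    def copy(self, source, start, end):
        # "Copy" places the selection in the buffer but
        # leaves the source unchanged.
        self.contents = source[start:end]
        return source

    def paste(self, destination, point):
        # "Paste" inserts the buffer contents at the selected
        # point, leaving a copy in the buffer for further pastes.
        return destination[:point] + self.contents + destination[point:]
```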
In some computer programs, a “paste special”, “import from clipboard”, or similar command may be available with several conversion options to perform a minimal conversion process on each transfer. However, even though the user may be performing the same “paste special” command over and over, the typical UI does not memorize or “learn” this process, so the user is forced to respond to a number of redundant options and dialogs on each paste operation.
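By contrast, a UI that did “learn” the repeated choice would only need to record the last conversion selected. The following Python sketch is hypothetical (the class and converter names are invented for illustration) and shows the difference: after the first “paste special”, subsequent pastes reuse the remembered option without re-prompting.

```python
class PasteSpecial:
    """Hypothetical 'paste special' helper that remembers the last
    conversion chosen so the user is not re-prompted each time."""
    def __init__(self, converters):
        self.converters = converters  # maps option name -> callable
        self.last_choice = None

    def paste(self, content, choice=None):
        # An explicit choice updates the remembered option;
        # otherwise the previous option is reused silently.
        if choice is not None:
            self.last_choice = choice
        if self.last_choice is None:
            raise ValueError("no conversion option selected yet")
        return self.converters[self.last_choice](content)
```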
The same user interface conventions are followed by many computer systems not only for content or information within a computer resource such as text and graphics within a file, but also for resources (e.g. files, shortcuts, icons, mappings, etc.) within a computing environment (e.g. file system, directories, folders, etc.). For example, when working with a MS Windows™ operating system and running the Windows Explorer program, a user may select a file, directory or folder to move, execute an “Edit—Cut” command sequence, navigate to another directory or drive, and execute an “Edit—Paste” command to move the selected resource to the new destination. Similarly, by selecting the source resource, executing a copy command, and then executing a paste command to one or more destinations, the original resource is not changed but copies of it are deposited at the destination points. Further, by selecting and copying a source resource, then selecting a destination resource, replacement of the destination resource may be accomplished.
However, as the source information may or may not be completely compatible with the destination environment, the user, even when using the related invention, may be required to perform certain tedious and inconvenient operations to transfer the information usefully.
For example, consider a situation where the source information is a graphically rich section of a web page, including text and color photographs. Further assume for the purposes of this example that the color photographs are stored in the source document as Joint Photographic Experts Group (“JPEG”) format data objects. If the user attempts to transfer this selected information (e.g. text+photograph) to a destination which does not support JPEG photographs, such as a text-only editor, a problem may arise that cannot be completely handled by the related invention—e.g. what to do with the photograph. In some other cases, the destination editor may be able to handle images in other formats, such as Graphics Interchange Format (“GIF”) images, but not JPEG images.
The user may, after realizing this and with sufficient technical expertise, find a way to export the image and save it to a separate file, use another tool to convert the JPEG image to a GIF image, and then use the related invention to transfer the GIF image to the destination document. This process, however, requires the user to have the expertise necessary to make such an export and conversion, as well as to have the extra tool to perform the conversion. Further, if this is a task to be repeated often, such as cutting and pasting a considerable amount of information in this manner, it becomes tedious, tiresome, and error-prone.
Additionally, in some situations, it is desirable to translate the initial spoken language of the source to another spoken language. For example, if a user is creating a Spanish-language newspaper and desires to quote a source which is provided in English, present-day systems require the user to perform all of the cut-and-paste steps just described, as well as to somehow perform a language translation on the English source text. The translation can be performed manually, or the user may have to retype the English text into a translation program, select an option to translate the text to Spanish, and then retype or cut-and-paste the Spanish text into the desired destination file or document. To achieve a computer-assisted translation such as this, the user may use one of many well-known programs or online services, such as AltaVista's online BabelFish™ phrase translator. However, to use these existing tools, the user must treat the translation program (or web site) as yet another source and/or destination, and must perform many user operations to effect cutting and pasting from source to translator, and from translator to destination. If the user is translating several phrases from several sources, and inserting the translated text into a plurality of destination points, this process can be very cumbersome and tedious.
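The round trip through a separate translator could be avoided if translation occurred during the transfer itself. The following Python sketch illustrates the idea with a toy English-to-Spanish phrase table; the table and function names are invented for illustration, and a real system would invoke a translation program or online service rather than a fixed dictionary.

```python
# Toy English-to-Spanish phrase table (illustrative only); a real
# system would call a translation service instead.
PHRASES_EN_ES = {
    "good morning": "buenos días",
    "the library": "la biblioteca",
}

def translate_on_paste(buffer_text, phrase_table):
    """Translate clipboard text during transfer so the user need not
    cut-and-paste through a separate translator program.
    Unknown phrases pass through unchanged."""
    return phrase_table.get(buffer_text.lower(), buffer_text)
```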
Therefore, there existed a need in the art for a system and method to provide configurable automatic source-to-destination spoken language translation for transferring information from one or more source environments in one or more initial spoken languages to one or more destination environments in a selected spoken language.
One of the related patent applications disclosed a solution for automatic translation of text during transfer between computer programs, but this related invention did not address embedded text or encoded text within a graphic file such as an image or icon.
For the purposes of this disclosure, we will define “encoded text” as text which is associated with a graphical object, but which is represented in a non-optical or non-visual form, such as the text which is part of common “vector graphics” images such as Windows Meta Files (“WMF”), PostScript™ data, Hewlett-Packard Graphics Language (“HPGL”)™, Adobe Portable Document Format (“PDF”)™, as well as various proprietary and other “open” graphics languages and formats. In each of these formats, some text strings are “encoded” by storing the text string along with descriptors regarding the font, size, color, effects (underline, bold, italics, etc.), and location of the text string. This text, with proper knowledge of the encoding format, can be extracted from the graphical object without the need for Optical Character Recognition.
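To illustrate such non-optical extraction, the following Python sketch pulls encoded text strings out of a PostScript-style fragment. It is a minimal sketch only: it handles just the simple PostScript “(string) show” operator, ignores escape sequences and nested parentheses, and each real format (PDF, WMF, HPGL) would require its own parser.

```python
import re

# Matches the PostScript text-drawing idiom "(string) show";
# escape sequences and nested parentheses are not handled.
SHOW_RE = re.compile(r"\(([^()\\]*)\)\s*show")

def extract_encoded_text(postscript_fragment):
    """Extract encoded text strings from a PostScript-style fragment
    without any optical recognition of rendered pixels."""
    return SHOW_RE.findall(postscript_fragment)
```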
For the purposes of this disclosure, we will define “embedded text” as text which is incorporated into a digital image, and which is not represented by “text data” (e.g. not encoded text), but is represented as digital images of text, such as bit maps which appear to be text to a user. Embedded text must be found within such a graphical object using human recognition, or through Optical Character Recognition techniques.
It is common for vector graphics objects (PDF, WMF, HPGL, PostScript, etc.) to contain both embedded text and encoded text.
Oftentimes, a graphic on a web page or other source contains multiple embedded text phrases. By using an existing image editor, such as Adobe PhotoShop™, and a translation tool such as a translation dictionary, a user must perform several user operations to effect editing, cutting and pasting from source to editor, editor to translator, translator back to editor, and finally editor to destination. In addition, if the user is translating multiple phrases from several image sources, and inserting the translated text into a plurality of destination points, this process becomes very monotonous and tiresome.
For example, as shown in FIG. 8, a graphic file contains apparent text words within the image itself (80). The image file is typically digitized in one of various formats such as JPG, TIF, PNG, and BMP. These types of graphic files can be obtained from various sources such as web browsers, text editors or image editor programs. In this example image of a city library on a street corner, there are five locations or regions in which embedded text can be found:
        (1) on the wall, where there is a sign which denotes a “City Library” (81);
        (2) on the wall, where there is a sign showing a street address of “900 Main St.” (82);
        (3) on a pole, a street sign which reads “Main St.” (83);
        (4) on a left door which is marked “out” (84); and
        (5) on a right door which is marked “in” (85).
This embedded text, however, is not actually text data; rather, it is comprised of regions of pixels which appear to a viewer to be text. Thus, using the presently available technology, a user wishing to transfer this image from an English document to a French document, for example, would have to perform the following general actions, each of which involves multiple steps:
        (a) manually examine the image to find all of the embedded text;
        (b) for each embedded text word or phrase, perform a natural language translation, such as translating “Main St.” to “Rue Main”;
        (c) open the image file or object in an image editor such as Adobe PhotoShop™;
        (d) perform multiple operations to replace or overlay the embedded text areas with the translated text words and phrases;
        (e) export or save the modified image to an image file or object; and
        (f) insert the modified image file or object into the new document.
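Steps (b) and (d) above can be sketched as a simple data transformation, assuming the embedded text regions have already been located (by hand or by Optical Character Recognition). In the following Python sketch, the bounding-box coordinates and the phrase table are invented for illustration; a real system would obtain regions from an OCR engine and translations from a translation service.

```python
# Illustrative English-to-French phrase table drawn from the
# FIG. 8 example; a real system would use a translation service.
PHRASES_EN_FR = {
    "City Library": "Bibliothèque Municipale",
    "Main St.": "Rue Main",
    "out": "sortie",
    "in": "entrée",
}

def translate_regions(regions, phrase_table):
    """regions: list of (bounding_box, text) pairs found in the image.
    Returns the same regions with each phrase translated, ready to be
    overlaid onto the image by an editor. Unknown phrases pass
    through unchanged."""
    return [(box, phrase_table.get(text, text)) for box, text in regions]
```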
These steps require several distinct skill sets, such as being capable of performing language translations and being capable of using an image editor program, and can be quite time consuming.
Similarly, to properly translate the encoded text portions of a graphical object, one would need to use an editor which is capable of editing the graphical object (e.g. a PostScript editor, a PDF editor, etc.), and would need to perform the natural language (“NLS”) translation on the encoded text in order to produce a “translated” graphical object. Again, this requires expertise in several domains, including using the editor software as well as being literate in the two languages (e.g. the source NLS and the destination NLS).
Therefore, there is a need in the art for a system and method to provide configurable automatic source-to-destination spoken language translation for transferring information from one or more source environments in one or more initial spoken languages to one or more destination environments in a selected spoken language for embedded text and encoded text within graphic image files, elements and objects.