For decades, personal computers have stored information using a hierarchical file/folder metaphor, in which pieces of information, or files, are categorized through their placement in folders (also referred to as “directories”) with concise names. Some of the earliest computer operating systems that have long since fallen out of use, such as CP/M, used file names that were restricted to a syntactical format involving at most eight characters, followed by a period (or “dot”), followed by at most three more characters (the “extension”) intended to represent the type of the file. For example, a file called COMMON.DOC was (and to some extent still is) often referred to as having a “.DOC” file type, somewhat opaquely indicating that it is intended to be used by Microsoft Word. Unlike a traditional filing cabinet, this conventional, limiting hierarchical system nonetheless makes it possible to store a folder within another folder recursively, essentially ad infinitum. Yet many limitations exist: among them, it is not possible to store the exact same file in two or more “folders” at once, which might be helpful for categorization purposes, without making a duplicate, unsynchronized copy of that file.
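The eight-character name and three-character extension format described above can be illustrated with a short Python sketch. The permitted character set (uppercase letters, digits, and underscore) and the function names are assumptions made for illustration only; historical systems each defined their own rules.

```python
import re

# Assumption: names use only uppercase letters, digits, and underscore.
# Up to eight characters, optionally followed by a dot and up to three more.
EIGHT_DOT_THREE = re.compile(r"[A-Z0-9_]{1,8}(\.[A-Z0-9_]{1,3})?")


def is_eight_dot_three(name: str) -> bool:
    """Return True if the whole name fits the assumed 8.3 pattern."""
    return EIGHT_DOT_THREE.fullmatch(name) is not None


def file_type(name: str) -> str:
    """Return the extension, including the dot, or an empty string if none."""
    _base, dot, extension = name.rpartition(".")
    return dot + extension if dot else ""


print(is_eight_dot_three("COMMON.DOC"))
print(is_eight_dot_three("March-31-client-X-Johns-sales-revision-3.docx"))
print(file_type("COMMON.DOC"))
```

In this sketch the only hint about a file's purpose is the suffix returned by `file_type`, which is the limitation the passage describes.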
The problem of addressing files has traditionally been handled through the use of special characters to denote logical relationships. On UNIX operating systems, the forward-slash character (“/”) denotes that the subsequent string of characters names a file or folder contained within the folder or storage device named by the preceding string of characters. A typical UNIX file path therefore might be “/etc/httpd/conf/httpd.conf”, with / representing the root node or storage device. On MS-DOS and Windows systems, this character has traditionally been a backward-slash (“\”), giving rise to file paths such as “C:\WINDOWS\SYSTEM32\KERNEL32.DLL”, with C:\ representing the root node or storage device. Apple Macintosh systems prior to Mac OS X used a colon (“:”) as the separator, with file paths such as “Macintosh HD:Documents:My Letter.doc”. Mac OS X, whose kernel incorporates code derived from FreeBSD UNIX, now uses UNIX file paths.
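As an illustration of these separator conventions, the following Python sketch uses the standard library's `pathlib` module to split the two example paths into their components. It is offered only as a demonstration of how a path encodes a single chain of containment; the variable names are illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

# The two example paths from the text, parsed with each system's rules.
unix_path = PurePosixPath("/etc/httpd/conf/httpd.conf")
windows_path = PureWindowsPath(r"C:\WINDOWS\SYSTEM32\KERNEL32.DLL")

# Each path is one ordered chain: root, then folder within folder, then file.
unix_parts = unix_path.parts
windows_parts = windows_path.parts

print(unix_parts)           # ('/', 'etc', 'httpd', 'conf', 'httpd.conf')
print(windows_parts)        # ('C:\\', 'WINDOWS', 'SYSTEM32', 'KERNEL32.DLL')
print(unix_path.parent)     # /etc/httpd/conf
print(windows_path.anchor)  # C:\
```

Because every file has exactly one `parent`, a path of this kind can place a file in only one folder at a time.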
The implication that digital information must obey physical laws such as conservation of matter (meaning that a file, like a chair, might only be in one place or another at any given point in time) has had unintended consequences that have afflicted millions of computer users. Unintentionally renaming or moving a “system file,” i.e., a file integral to the proper functioning of core system software, might render a computer completely inoperable and require hours of maintenance to fix. Another consequence of mutually exclusive file addressing involves naming conventions: naming a revision of a legal document pertaining to multiple issues might require an awkward and overly verbose file name such as “March-31-client-X-Johns-sales-revision-3-with-changes-about-cameras.docx” so that it is likely to appear in future searches for data pertaining to sales figures, John, and cameras. In other words, crucial system files are identified to operating systems not by their crucial nature but by an arbitrary string, corresponding to a particular location, that is hard-coded into a piece of software; nor are user files identifiable to users by any defined attributes other than clumsy names. Yet another consequence of the over-dependence upon special file names is that changing the data at such a hard-coded address (i.e., file name) makes it possible to insert malicious software in cases where the computer does not stop functioning outright.
“Tags,” also sometimes referred to as “labels,” have recently gained a foothold on the internet as an improved method of categorizing e-mail messages, blog posts, and other disparate types of information stored on remote servers. Nonetheless, tags are not found at the lower operating system or file system levels, where files accessible through an operating system kernel are organized on a physical data storage medium, or disk. Nor are tags found within one program or web site typically usable on another. They exist for all practical purposes only in independent silos.
The organization of large numbers of files on a physical storage medium presents challenges that have serious implications for software development and use. For example, many computer software applications make use of dynamically linked libraries, or collections of commonly used software functions, that can undergo numerous revisions by programmers throughout the lifespan of a given computer system. Some libraries, which frequently have multiple versions, must also be offered in 32-bit and 64-bit editions simultaneously to account for different types of microprocessor architectures available in modern computers. File paths are the sole identifier that modern filesystems use to distinguish between types of files, and only one file with a given name can exist in a given folder. Though each revision and type of software library must be treated as completely distinct in order for other inter-linked programs to function properly, confusion frequently reigns (often causing software crashes) since completely distinct versions and/or editions of libraries can have confusingly similar, and sometimes identical, file names.
Compounding the failure to distinguish between microprocessor platforms and revision numbers, modern filesystems are unable to quickly find files made by the same person or employer, let alone those that meet more abstract search criteria, such as files made by any company located in the greater London area. The simple fact that there are only two possible types of data (namely, files and folders) on popular filesystems ranging from FAT32 to NTFS to ZFS to HFS to CDFS explains why such limitations persist even on what are otherwise considered to be “cutting-edge” filesystems.
The application layer that associates certain file name suffixes with specific programs is itself extremely error-prone and confusing. For example, OpenOffice, the open-source equivalent to Microsoft Office, can read and edit .DOC, .XLS, and .PPT files—extensions that originated with Microsoft software—but if OpenOffice is installed after Microsoft Office, it may re-assign itself to those file types, leaving Microsoft out of the equation on that particular computer. While this may be desired behavior in some instances, it may also be highly undesired at times. Today, there is no concept of a word processing document according to an operating system; only that of a .DOC extension that must either correspond to one software application or another.
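The re-assignment problem described above can be sketched in Python as a table that maps each extension to exactly one application. The `associations` table and the `register` function are hypothetical and do not represent the registry of any actual operating system; the sketch only shows why a later installation silently displaces an earlier one.

```python
# Hypothetical association table: one application per extension, nothing more.
associations = {}


def register(extension: str, application: str) -> None:
    """Claim an extension for an application, replacing any earlier claim."""
    associations[extension.upper()] = application


# Installation order decides the outcome; the earlier claim is simply lost.
register(".doc", "Microsoft Word")
register(".doc", "OpenOffice Writer")

print(associations)
```

The table has no entry meaning “word processing document”; it records only that a given suffix currently belongs to one program.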
Using traditional file system hierarchies, files and folders also fail to persist across multiple computer devices, including multiple devices that belong to the same individual. In general, this means that information cannot easily be made available to groups of people without using a software application running on top of the operating system as an intermediary distribution mechanism (e.g., Microsoft SharePoint, Lotus Notes, web browsers, etc.). For a given individual, this major drawback of today's operating system software also necessitates regular data transfer and file/folder synchronization, which is increasingly time-consuming and error-prone as more devices become involved.
Tags fill another gap left by traditional file systems, involving data that does not need (from the perspective of an average user) a proper name, such as a particular photograph in a set of many family photographs taken in rapid succession. Photographs stored on traditional storage media are usually assigned meaningless names such as DSC_1398.JPG, which convey no meaning to the photographer or viewer whatsoever, aside from the largely irrelevant fact that the photograph is the 1,398th image to be captured by that camera since its counter was last reset.
A need exists for a new type of operating system that can better organize real-world data, e.g. contact records, events, financial documents, medical records, e-mail messages, personal letters, legal notices, software code, photographs, video, sound, music, and abstract data sets, among others, and then relate those pieces of data to each other through the use of tags. Until such a system exists in the commercial marketplace, there will also be a need for a transitional technology that moves large volumes of data from existent operating systems—those based upon arbitrary, independent hierarchies of folders—to new systems that employ intuitive labels not required to be mutually exclusive, and that can easily persist across devices and applications.
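The idea of labels that are not required to be mutually exclusive can be illustrated with a minimal Python sketch. The `TagIndex` class, its method names, and the sample item names are all hypothetical and are not taken from the invention; the sketch only shows that a single item can carry several tags at once and be found through any combination of them.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index mapping each tag to a set of items."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)

    def tag(self, item, *tags):
        """Attach any number of tags to one item; tags are case-insensitive."""
        for label in tags:
            self._items_by_tag[label.lower()].add(item)

    def find(self, *tags):
        """Return the items that carry every one of the given tags."""
        sets = [self._items_by_tag.get(label.lower(), set()) for label in tags]
        return set.intersection(*sets) if sets else set()


index = TagIndex()
index.tag("revision-3.docx", "client X", "John", "sales", "cameras")
index.tag("DSC_1398.JPG", "family", "photograph")

sales_and_cameras = index.find("sales", "cameras")
family_items = index.find("family")
print(sales_and_cameras)
print(family_items)
```

Under these assumptions the document needs no verbose file name to be found under “sales,” “John,” and “cameras,” and the photograph needs no meaningful name at all.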
Tags have often been used on web sites. However, they have not been employed as the main channel of accessing information in any major operating system, let alone a channel that makes inherent use of the internet, due to several limitations in the existing implementations, which are addressed by this invention.
A need further exists for a new method and system for associating data with tools, such as software applications, that are capable of working with that data. For example, while it is easily possible to write an essay without the use of a computer by using a pencil on the left side of a piece of 8.5″×11″ paper, and then to paint a picture with a paintbrush and colored paints on the right side of that same page, accomplishing a similar feat through the use of a computer is surprisingly far more difficult. The latter process is a cumbersome exercise involving the creation of two to three files in two to three different software applications: the essay, in a word processor; the painting, in a drawing program; and the “page” itself, in a desktop publishing application capable of uniting the other files according to particular layout specifications. Therefore, while the technology employed for data storage and organization today is extremely functional and widespread, it still lacks the ability to quickly assemble and categorize information, and later, to help the user find that information. There is therefore a need in the operating system field, and the file system field specifically, to create a new and useful method and system for storing, categorizing and distributing relationships between data. This invention provides such a new and useful method and system.