There are numerous different types of databases. One characteristic defining most such databases, however, is the use of a structured method of storing the data. A flat table type of database, for example, is typically stored as a data stream with markers that designate columns and rows. When such databases are displayed, the rows typically correspond to records, and the columns correspond to fields.
Spreadsheets can be thought of as a species of flat table databases, in which the display of the database is used as the sole, or at least a primary, data entry interface. Indeed, it is known to readily exchange data between spreadsheets such as Microsoft™ Excel™ and databases such as Microsoft™ Access™. One difference is that spreadsheets typically include the column identification information in cells of the first row, whereas databases typically include the column identification information in a header that does not appear to a user to be mixed in with the data.
With a heightened need to provide simplified connectivity among different platforms and different types of databases, a new form of database has emerged that relies on metatags, rather than logical structure, to identify data. The term “metatag” is used herein to mean an identifier of a type of data content that may be used repeatedly in a document to identify multiple occurrences of data having the particular type of data content. Thus, a metatag may be employed to identify data having data content relating to “price”, “description”, or “product number”, and may even use those exact words as the metatag names. An electronic copy of a letter to a customer may then be “tagged” by blocking a price within the letter, hitting a special activation code, and then typing the word “price”. Corresponding data within the letter can be tagged in a similar manner to identify description and product number. It is also known to nest the metatagged data, such as by blocking together the price, description, and product number, and identifying all three items of data with a metatag such as “product”. XML (Extensible Markup Language), a standard published by the World Wide Web Consortium, is an exemplary language that employs metatags to store data. XML is described in numerous publications, including McLaughlin, Brett, Java & XML, 2nd Edition: Solutions to Real-World Problems, O'Reilly & Associates; September, 2001, ISBN: 0596001975, and Harold, Elliotte Rusty and Means, W. Scott, XML in a Nutshell: A Desktop Quick Reference (Nutshell Handbook), O'Reilly & Associates; Jan. 15, 2001, ISBN: 0596000588, both of which are incorporated by reference herein.
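The tagging and nesting just described can be illustrated with a minimal sketch in Python, using the standard library's xml.etree.ElementTree parser. The letter text and its values are hypothetical; only the metatag names come from the passage above, and “product number” is written as product_number because XML names cannot contain spaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical letter; the tagged items are interspersed among untagged text,
# and the three items are nested together inside a "product" metatag.
LETTER = (
    "<letter>Dear customer, thank you for your inquiry. We can offer "
    "<product><description>steel bracket</description>, part "
    "<product_number>A-100</product_number>, at "
    "<price>4.50</price></product> each. Best regards.</letter>"
)

def tagged_items(xml_text):
    """Return (metatag, data) pairs for every leaf element, in document order."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag, el.text)
        for el in root.iter()
        if el is not root and len(el) == 0  # skip the root and nesting tags
    ]

items = tagged_items(LETTER)
for tag, data in items:
    print(f"{tag}: {data}")
```

The untagged sentences of the letter are ignored by tagged_items, which reflects the point that metatags identify data independently of where the data sits in the document.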
A major advantage of metatagged data (also referred to as “tagged” data for simplicity herein) is that such data can be properly identified within substantially any type of document, from an image file to a text file, regardless of the structure of the file. In such instances the tagged data items are readily interspersed among non-tagged data, and are set off by delimiters that correlate a metatag with its associated data. Another advantage is that data storage space is not wasted on cells for which there is no data. In an ordinary flat database having 20 records and 7 fields, for example, a database system may allocate space for 20*7=140 cells. If only 50 of those cells have data, then the remaining 90 cells, or about 64% of the storage space, are wasted. With a metatagged file, there are no unused cells because there are no cells at all.
Conversely, if the metatags are excessively repeated within a document, the use of metatags can also produce inefficiency. If there are 1000 occurrences of each of three data types, it would be much more efficient to store the data in a flat database having 1000 records of 3 fields than in a metatagged structure. The flat database would store the field names only once, but the metatagged structure would store one or another of the metatag names 3000 times.
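The repetition can be made concrete with a short Python sketch, offered only as an illustration under stated assumptions. The 1000 records and their values are hypothetical, the flat table is represented as comma-separated text, and the metatagged structure is represented with XML-style opening and closing tags.

```python
import csv
import io

FIELDS = ["price", "description", "product_number"]

# Hypothetical data: 1000 records, one value for each of the three fields.
records = [[f"{i}.00", f"item {i}", f"P{i:04d}"] for i in range(1000)]

def as_flat_table(rows):
    """Serialize rows as a flat table; the field names appear once, in a header."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(FIELDS)
    writer.writerows(rows)
    return buf.getvalue()

def as_tagged(rows):
    """Serialize rows as tagged data; every value carries its own metatag."""
    parts = []
    for row in rows:
        for name, value in zip(FIELDS, row):
            parts.append(f"<{name}>{value}</{name}>")
    return "".join(parts)

flat = as_flat_table(records)
tagged = as_tagged(records)

# Count how often a field name is stored in each representation.
flat_name_count = sum(flat.count(name) for name in FIELDS)
tagged_open_count = sum(tagged.count(f"<{name}>") for name in FIELDS)

print(flat_name_count, tagged_open_count)
print(len(flat), len(tagged))
```

In this sketch the closing tags repeat each name a further time, so XML-style tagging stores the names even more often than the count of opening tags alone suggests.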
There is also the problem of knowing which metatags can or should be used to tag data within a document. This is potentially an ongoing problem for those involved in metatagging documents, precisely because metatags are not limited to a fixed set of names, and there are as yet no generally accepted metatag naming conventions. This is not typically a problem for other types of databases because the person setting up the database has already set a relatively fixed list of possible data fields, and those entering data are limited to those pre-set fields. Moreover, the literal naming of the fields is usually irrelevant to a typical user.
There is still the further problem that tagged data in metatagged files can be difficult to visualize. For example, spreadsheet data is very readily visualized in the well-known column and row format, in which each column effectively stores data for a different type of data content. Not only can data in adjacent cells be visually compared, but mathematical functions on data in one row are readily ported to other rows, and mathematical functions on data in one column are readily ported to other columns. All of these things can be very difficult in metatagged documents because data for any given type of content (data that would be listed in the same column of a spreadsheet) can be located all over a document.
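The difficulty can be seen in a small Python sketch, given here as an illustration only and not as a description of any particular system. The document text and values are hypothetical. Before a column-style operation such as a sum can be applied, the scattered items of each content type first have to be gathered from wherever they occur.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document in which prices and descriptions are scattered
# through running text, in no fixed order.
DOC = (
    "<doc>First we quote <price>4.50</price> for the "
    "<description>bracket</description>. Later, the "
    "<description>hinge</description> is listed at <price>7.25</price>.</doc>"
)

def columns_by_tag(xml_text):
    """Group leaf data by metatag, approximating spreadsheet columns."""
    columns = defaultdict(list)
    root = ET.fromstring(xml_text)
    for el in root.iter():
        if el is not root and len(el) == 0:
            columns[el.tag].append(el.text)
    return dict(columns)

cols = columns_by_tag(DOC)

# Only after grouping can a column-style function be applied.
total = sum(float(p) for p in cols["price"])
print(cols)
print(total)
```

Note that the grouping preserves document order within each content type, but nothing in the tagged text itself guarantees that the first price belongs with the first description, which is part of why such data is hard to view as rows.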
It is thus interesting that spreadsheets and metatagged files have advantages and disadvantages that are to a large extent complementary. Either this fact does not seem to have been appreciated by others, or they have not developed a solution that takes proper advantage of such complementarity. Thus, there is a need for methods and embodiments that advantageously combine features of spreadsheets for use with metatags.