A. In General
Complex forms of data are emerging in many application domains, notably in multimedia (for example, image, video, and audio), but also in specialized application areas such as medical care (X-rays, MRI images, EKG traces), geographical systems (maps, seismic data, satellite images), and finance (time series data). The new forms of data are sometimes referred to as "unstructured" data. In reality, the new data types are distinguished from traditional scalar numeric and character data by their multiple attributes, complex internal structure, and specialized behavior. For example, a video clip, as perceived by an application program, is not just a large binary field; it has attributes such as duration, format, number of frames, subject description, and information relating to legal ownership. It may also have associated index information, allowing search, positioning, and retrieval based on the content of individual frames. The video clip may have unique functions that could be used by an application program to retrieve selected portions of the clip, search for subject matter based on a combination of description and frame content, display, zoom, and edit the content, and return the clip to storage. These functions are clearly specific to the video clip data type, and it would make no sense to perform them on some other data type, such as a static image.
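The video clip example can be sketched in code to show what "multiple attributes and specialized behavior" means in practice. The following Python fragment is illustrative only: the class name VideoClip, its field names, and the excerpt function are hypothetical and are not taken from any product, and a single byte stands in for a whole frame so that the example stays small.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VideoClip:
    """Hypothetical complex type: a payload plus descriptive attributes."""
    data: bytes              # raw payload; one byte stands in for one frame
    media_format: str        # for example "MPEG-1"
    frames_per_second: int
    subject: str             # subject description
    owner: str               # legal ownership information

    @property
    def frame_count(self) -> int:
        return len(self.data)

    @property
    def duration_seconds(self) -> float:
        return self.frame_count / self.frames_per_second

    def excerpt(self, first_frame: int, last_frame: int) -> "VideoClip":
        """Return a new clip holding frames first_frame..last_frame inclusive."""
        if not 0 <= first_frame <= last_frame < self.frame_count:
            raise ValueError("frame range outside clip")
        return replace(self, data=self.data[first_frame:last_frame + 1])


clip = VideoClip(
    data=bytes(range(60)),
    media_format="MPEG-1",
    frames_per_second=30,
    subject="product demonstration",
    owner="Example Corp",
)
part = clip.excerpt(15, 44)  # a type-specific operation: select a portion
```

The excerpt operation is meaningful only for a clip; a static image type would carry a different set of attributes and functions.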
Complex data types need to be stored in corporate databases, either in support of new applications, or to extend existing applications. In many cases, an application will require more than one new complex type in the same database or table. For example, a consolidated health care patient records system would probably require a database to contain many of the specialized medical data types mentioned above, as well as traditional data describing personal information, insurance details, etc.
In the database, complex data types must be capable of being searched, accessed and manipulated through the standard SQL language, without breaking the familiar table paradigm. Moreover, SQL support must be provided for the advanced technologies that are evolving to perform searches on the contents of large objects ("BLOBs"). Text search is a well known example, but algorithms also exist today for identifying images based on color, texture and shapes (for example, Query By Image Content (QBIC) and Ultimedia Query (UMQ)), as do the pattern matching techniques applied by IBM and Excalibur Technologies Corporation to fingerprint searching in the DB2™ Fingerprint Extender. Research is also proceeding into query by content of video and audio data. It must be possible to perform combined database queries that specify such content search criteria as well as criteria for traditional data search.
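A combined query of the kind described above can be sketched as follows. This is a minimal illustration under stated assumptions: it uses Python's sqlite3 module rather than DB2, the table patient_scans and the function mean_intensity are hypothetical, and the "content search" is a toy average of byte values standing in for a real technique such as QBIC.

```python
import sqlite3


def mean_intensity(blob: bytes) -> float:
    """Toy content feature: average byte value of the stored object."""
    return sum(blob) / len(blob) if blob else 0.0


conn = sqlite3.connect(":memory:")
# Make the content function callable from SQL (hypothetical function name).
conn.create_function("mean_intensity", 1, mean_intensity)

conn.execute("CREATE TABLE patient_scans (patient TEXT, insurer TEXT, scan BLOB)")
conn.executemany(
    "INSERT INTO patient_scans VALUES (?, ?, ?)",
    [
        ("Adams", "Acme Health", bytes([200, 210, 220])),
        ("Baker", "Acme Health", bytes([10, 20, 30])),
        ("Clark", "Other Mutual", bytes([240, 250, 255])),
    ],
)

# One query mixes a traditional criterion (insurer) with a content criterion.
rows = conn.execute(
    "SELECT patient FROM patient_scans "
    "WHERE insurer = ? AND mean_intensity(scan) > ? ORDER BY patient",
    ("Acme Health", 128),
).fetchall()
matching_patients = [row[0] for row in rows]
```

The point of the sketch is that the content criterion appears in the WHERE clause alongside the ordinary column comparison, so the table paradigm is preserved.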
When the need for a complex data type is encountered in an application development project, a user might consider implementing the new type as an integrated part of the application. The attributes and structure of the data type might be stored in a special table; the handling of this table, and the special operations for the data type might be built into the application itself. Isochronous data such as video clips might be stored on a file server and the file names managed by the application program. However, such an approach has considerable disadvantages. The custom design and implementation of support for complex data every time it arises in an application would result in a large redundancy of effort and would add cost to advanced development projects. Also, the complexity of applications would increase because of the details of the data types that are visible in the application. Complex data types tend to require specialized skills for their definition and use, and by creating them as an integral part of an application, those skills will be needed in the application development team. The overall effect (and probably the most important disadvantage) is that a "roll-your-own" approach to handling these data types will ultimately slow the rate at which new advanced functions can be incorporated into corporate applications. The situation would be analogous to that which existed perhaps 30 or 40 years ago, when users tended to develop their own customized DBMS.
What is needed is a means of separating the specification and implementation of the structure and behavior of new complex data types from the development of the client applications that use them. An open architecture is required that allows the users themselves to create new data types, for use in many applications. The architecture should allow the installation of a new data type in a data base environment as needed, and independently of database product releases. It should mask the complexity of the data types from the applications. Effectively, it should bring complex data types under the umbrella of the data base system for all purposes of application access, security, and administration. In this way, user application development productivity may be improved, development complexity reduced, and the delivery of advanced function accelerated.
The challenge is to provide this open architecture that enables the development of specialized data types by users. The architecture must address the need to integrate the new data function seamlessly with existing RDB functions. The strategy must encourage the commercial development of new relational data types of broad applicability, both across industries, and within specialized industry domains.
This invention comprises Relational Extenders, which are an architecture and a set of products designed to help RDB users handle emerging new complex data types in advanced applications, improve application development productivity, and reduce development complexity. Relational Extenders define and implement new complex data types in RDBs. The Relational Extender paradigm is essentially to "extend corporate relational data with new data types". Relational Extenders encapsulate the attributes, structure and behavior of new data types and store them in a column of a RDB table, such that they can be processed through the SQL language as natural additions to the standard set of RDB data types. Relational Extenders are separate from the database system itself, in the sense that they can be developed, installed and used independently of full database product releases, but they are also part of the database in that they appear to application developers as seamless extensions to SQL and the RDB product.
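The idea of a new data type that lives in a table column, yet is defined separately from the database product, can be illustrated with a small sketch. It is not the Relational Extender implementation: it relies on the adapter and converter hooks of Python's sqlite3 module as a stand-in, and the type AudioNote, its fields, and the table visits are hypothetical.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class AudioNote:
    """Hypothetical complex type with attributes and one behavior."""
    speaker: str
    seconds: float
    transcript: str

    def mentions(self, term: str) -> bool:
        return term.lower() in self.transcript.lower()


# Teach the database layer how to store and rebuild the type. This is done
# outside the database engine, without changing the engine itself.
sqlite3.register_adapter(
    AudioNote, lambda note: json.dumps(asdict(note)).encode("utf-8")
)
sqlite3.register_converter(
    "AUDIONOTE", lambda raw: AudioNote(**json.loads(raw))
)

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE visits (patient TEXT, note AUDIONOTE)")
conn.execute(
    "INSERT INTO visits VALUES (?, ?)",
    ("Adams", AudioNote("Dr. Lee", 12.5, "Follow-up X-ray recommended")),
)

# The column value comes back as the complex type, not as raw bytes.
patient, note = conn.execute("SELECT patient, note FROM visits").fetchone()
```

The application sees an AudioNote in an ordinary column of an ordinary table, while the storage details stay hidden behind the registered hooks.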
B. The Impact of Multimedia
Deployment of multimedia applications in the commercial environment is in its beginning stages. Today multimedia is primarily deployed stand-alone on isolated workstations, but it is expanding to departmental LAN environments. In the 1994-1995 time frame, customers in the commercial market are actively piloting departmental multimedia solutions with local or remote client access. As the LAN solutions mature, they will evolve to enterprise LAN solutions in the 1996-1997 time frame.
Studies show that multimedia deployment in the commercial environment falls into five general application classes: (1) Presentations/Training/Kiosks, (2) Multimedia extensions to corporate data, (3) Query applications, (4) Commercial video on demand, and (5) Document management. Data management plays a role in all of these application areas because the multimedia data needs to be organized, preserved, managed, updated, searched and delivered to a potentially large number of users.
1. Presentations, Training, Kiosk:
This class encompasses applications which are normally developed with authoring tools. Some examples are: corporate dissemination of information to employees, travel agency type desktop sales demonstrations, and factory floor training modules. To date, the majority of multimedia applications are in this class and execute on stand-alone workstations; however, customers recognize the value of moving to client-server configurations for greater flexibility and functionality and to better leverage the use of multimedia.
2. Multimedia Extensions to Corporate Data:
This class consists of operational applications which utilize traditional corporate data in RDBs in combination with unstructured data (text, image, video, audio, etc.). These applications will initially be extensions of existing applications, but the ability to link corporate data with multimedia data will open up new application areas. One example of this application class would be a health care provider that wants to consolidate patient information, including image based test results and audio annotations by physicians, in a RDB. Another example is the addition of images or video clips to a retail catalog sales application for interactive shopping. Most commercial environments have traditional database applications today which can be extended with multimedia content in order to interact better with application users or consumers.
3. Query Applications:
Query applications are a subset of the other application classes. This class encompasses the case where the data being queried is a combination of traditional data and multimedia data. Examples include ad-hoc queries for decision support in retail buying or stock market analysis. Queries may also be used to determine what training modules are available or to find appropriate images to incorporate into a new application.
4. Commercial Video on Demand:
Video on demand (VOD) can be divided into two market categories: the commercial market and the consumer market. For purposes of this proposal, we are only addressing video on demand in the commercial market. In this market, end users access videos via a workstation at their desk, rather than a set-top box and television. An example of commercial VOD is the dissemination of news clips to brokers in offices across a wide geographic area. Commercial VOD applications will be deployed in cases where the ability to deliver current information quickly is essential or where the quantity of information is so large that duplication of the information is impractical.
5. Document Management:
Document management applications are used in businesses where the requirement is to manage documents and records as images of the hardcopy originals and to group them by customer. A typical user is an insurance company which uses a document management application to process policies and claims.
Data management plays an important role in the application areas described above. As collections of multimedia objects become large, efficient data management capabilities are required to organize, preserve, manage, update, search and deliver these objects to a potentially large number of users. Customers therefore need ways of managing multimedia objects as business assets. Multimedia encompasses a wide range of data types which have not previously been associated with corporate data. Providing additional data types in the database makes it easier to relate multimedia data to traditional data. When multimedia data is managed by the database, customers can efficiently search on the attributes of the data, and new technology for searching the content of the multimedia objects can also be deployed via the database.