This invention relates to the field of network analysis tools and other tools that use a relational database to store and retrieve hierarchical data in a generic, schema-agnostic fashion.
The effective management of large networks requires the use of tools that use one or more models of the network and each component within the network. To model the performance of a network, simulators and other tools effectively emulate the behavior of each of the devices in the network to determine how the system will perform as message traffic is propagated across the network, or to facilitate the diagnosis of the network as problems are reported. To properly emulate the behavior of each device, the configuration parameters, default settings, and so on associated with each device must be captured and included within the models, resulting in a very large collection of data being required even for relatively small networks.
The data that is captured and used to build network models is usually richly typed and hierarchical in nature. Storing such a network model in a relational database is an ongoing challenge in any implementation. In some implementations, the base software for data collection and management represents this richly typed hierarchical data in correspondingly complex database schemas. A disadvantage of such an approach, however, becomes apparent when new network routing and switching protocols, configurations, device types, and characteristics are invented and supported by different network equipment manufacturers. Often, a new capability or characteristic will require a change to the structure of the database schema and, depending upon the robustness of the original structure, may reveal inherent incompatibilities with the new features and require a substantial restructuring. Even when the schema is robust, the changing and increasingly complex information in an actual network generally requires a continual update of the database schema and its table structure for each new release of the modeling software.
Techniques have been developed that use a composite data structure that allows for scalable and schema-agnostic storage and retrieval of richly typed hierarchical data. The hierarchical data is stored and retrieved using a single data structure, thereby allowing the software implementation and the database schema to be substantially immune to the necessity of remodeling increasing richness and complexity in the actual network data. Of particular note, the use of a single data structure allows the software designer to optimize read and write access to the database, and avoids having to reassess and/or rewrite the code for future releases.
A generic way of representing hierarchical data is via the use of a Composite Pattern and Binary Large Object (BLOB) data types.
As illustrated in FIG. 1, the basic unit of the data structure described in the Composite Pattern 100 consists of a Component class 110, a Composite class 120, and a Leaf class 130, wherein the Leaf and Composite classes are subclasses of the Component class. An instance of the Composite class may contain instances of other Components, while an instance of the Leaf class cannot contain instances of Component classes. Because each Composite 120 of a Component 110 may contain one or more (lower-level) Components, the Composite Pattern 100 is well suited as a basic building block that is capable of representing any hierarchical data model of any depth. Since most information in a network model is richly typed and hierarchical in nature, the Composite Pattern 100 is often used to represent this data for use by application programs 180.
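The class relationships described above can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation of FIG. 1: the field names (`name`, `value`, `children`), the `add` method, the `depth` helper, and the example router data are all assumptions introduced here.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Component:
    """Base class (Component 110): anything that can appear in the hierarchy."""
    name: str


@dataclass
class Leaf(Component):
    """Leaf 130: holds a value and cannot contain other Components."""
    value: object = None


@dataclass
class Composite(Component):
    """Composite 120: may contain any number of lower-level Components."""
    children: list[Component] = field(default_factory=list)

    def add(self, child: Component) -> Component:
        self.children.append(child)
        return child


def depth(node: Component) -> int:
    """Number of levels in the hierarchy rooted at this node."""
    if isinstance(node, Composite) and node.children:
        return 1 + max(depth(c) for c in node.children)
    return 1


# Hypothetical network fragment: a router containing one interface.
router = Composite("router1")
iface = router.add(Composite("eth0"))
iface.add(Leaf("mtu", 1500))
router.add(Leaf("hostname", "core-rtr"))
```

Because a Composite holds Components rather than a specific subclass, the same three classes can represent a hierarchy of any depth without new types being defined.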
Most modern relational database implementations support a data type called the Binary Large Object (or BLOB). A BLOB is a built-in data type that stores its data in a single cell of a database table. For all intents and purposes, programmatic access to a BLOB is akin to accessing an integer or string value in a cell of the database table. BLOBs are typically used to store large amounts of data such as images or documents. BLOBs can also be used to store data structures from programming languages, as long as there is a way to convert the programming language entity to and from a binary data stream. The conversion of programming language data structure instances into binary byte streams is called serialization, and the reverse process is called deserialization. To model complex networks, the BLOB data type is used to capture the hierarchical data built using the Composite Pattern. That is, an instance of the Composite Pattern is converted to a sequence of bytes (serialization) that can then be stored as-is in the BLOB data type in a single cell of a database table. On retrieval, the sequence of bytes is transformed back (deserialization) into the Composite Pattern for programmatic access.
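As a rough sketch of this round trip, the following Python fragment serializes a small hierarchy with the standard `pickle` module and stores it in a BLOB column of an in-memory SQLite database. The table name `network_model`, the column names, and the nested-dictionary stand-in for a Composite Pattern instance are assumptions made for illustration; an actual implementation could use any serialization format and any relational database.

```python
import pickle
import sqlite3

# Hypothetical hierarchy; nested dicts stand in for Composite/Leaf instances.
model = {"router1": {"eth0": {"mtu": 1500}, "hostname": "core-rtr"}}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE network_model (id INTEGER PRIMARY KEY, data BLOB)")

# Serialization: object graph -> byte stream, stored in a single cell.
blob = pickle.dumps(model)
conn.execute("INSERT INTO network_model (id, data) VALUES (?, ?)", (1, blob))

# Deserialization: byte stream -> object graph.
(stored,) = conn.execute("SELECT data FROM network_model WHERE id = 1").fetchone()
restored = pickle.loads(stored)
```

The database sees only an opaque sequence of bytes; the structure of `model` is recoverable only by deserializing it.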
The combination of the Composite Pattern and the BLOB data type allows for a schema-agnostic representation of complex and hierarchical data; that is, the actual/internal schema is hidden within the BLOB, and the hierarchy is hidden by the generic nature of the Composite Pattern. As the complexity of the network model grows, there is no need to change the database schema. New data, like the older data, is represented using the Composite Pattern and stored in the relational database as a BLOB. Changing the data model within a single instance of the Composite Pattern, adding depth to the hierarchy, or pruning the hierarchy has no effect on the database schema. Furthermore, the storage and retrieval of BLOB data values can be made efficient, because there is only a single data structure that needs to be addressed for database read and write access.
Although the combination of the Composite Pattern and the BLOB data type allows for the schema-agnostic representation of complex and hierarchical data, and allows for changing the elements within the data model, or the hierarchy of the data model, without changing the database schema, this generic storage of hierarchical data has several disadvantages when used to store large and complex network models.
A particular disadvantage of this combination of the Composite Pattern and the BLOB data type for storing network models relates to accessing individual elements within the model. Because the BLOB data type masks the internal schema details, and the Composite Pattern is designed to be generic and masks the actual network hierarchy, access to the elements within the hierarchy requires a top-down deserialization of the network model until the desired element is found. Correspondingly, a top-down serialization of the entire network model is required to subsequently store the Composite Pattern corresponding to the model. That is, for each access to add, remove, or modify the contents of a BLOB, the application layer must read the entire Composite-BLOB data structure into memory, make the change to the particular item, and then write the entire Composite-BLOB data structure back to persistent storage. This can prove to be highly inefficient in both CPU time and memory for the large data volumes associated with network models.
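The cost of this read-modify-write cycle can be illustrated with a small Python sketch. It assumes, for illustration only, a `pickle`-serialized dictionary stored in a hypothetical `network_model` table in SQLite; the function `update_leaf` and the 1000-device model are invented for this example. Changing a single leaf still moves the whole BLOB in both directions.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE network_model (id INTEGER PRIMARY KEY, data BLOB)")

# Hypothetical model: 1000 devices, each with a few settings.
model = {
    f"router{i}": {"eth0": {"mtu": 1500}, "hostname": f"rtr-{i}"}
    for i in range(1000)
}
conn.execute("INSERT INTO network_model VALUES (1, ?)", (pickle.dumps(model),))


def update_leaf(conn, path, value):
    """Change one leaf; return the number of bytes read and written to do so."""
    (blob,) = conn.execute("SELECT data FROM network_model WHERE id = 1").fetchone()
    tree = pickle.loads(blob)          # deserialize the entire model
    node = tree
    for name in path[:-1]:             # walk down to the parent of the leaf
        node = node[name]
    node[path[-1]] = value             # the only change actually wanted
    new_blob = pickle.dumps(tree)      # serialize the entire model again
    conn.execute("UPDATE network_model SET data = ? WHERE id = 1", (new_blob,))
    return len(blob), len(new_blob)


bytes_read, bytes_written = update_leaf(conn, ["router7", "eth0", "mtu"], 9000)
```

Here `bytes_read` and `bytes_written` both reflect the size of the full model, even though only one integer changed.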
In addition to the inherent inefficiency of having to deserialize and serialize the entire network model to access and modify a single network element, the inefficiency of this process is further compounded by the fact that conventional BLOB deserialization processes are generally not well suited to efficiently deserialize ‘unpredictable’ data structures. In conventional systems, because the BLOB hides the internal details, the deserialization process creates the structure as new components are deserialized. A conventional serialization process will write out each lower level component as it comes across it. During deserialization, each newly encountered component is added to the composite of its parent. This addition includes, for example, adding a ‘child’ record to a list of children, wherein the child record includes a reference to the location of the deserialized component within the memory and optionally other characteristics of the deserialized component. Typically, a default block of memory is allocated for storing the child records at the parent. As new children are encountered, this block of memory may be insufficiently sized to accommodate the newest child, and an additional block of memory must be allocated. However, because the table of child references is preferably stored as a contiguous block, for efficient indexing and processing, the addition of a new block of contiguous memory often requires a shift/restructure of previously allocated memory to accommodate this larger block. Additionally, because the block size is generally large, to reduce the required number of reallocations, substantial memory space can be wasted at each composite structure, when a new block is allocated but not filled by the remaining data.
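The allocation behavior described above can be approximated with a simple counting model. The following Python sketch is an assumption-laden simplification: it treats the child table as growing by one fixed-size block each time it overflows, and the function name `simulate_child_table` and the block size of 16 are chosen only for illustration. It reports how many regrowths of the contiguous table occur and how many slots remain unused.

```python
def simulate_child_table(num_children, block_size):
    """Return (reallocations, wasted_slots) for a table grown one block at a time."""
    capacity = block_size              # default block allocated up front
    reallocations = 0
    for count in range(1, num_children + 1):
        if count > capacity:
            capacity += block_size     # contiguous table must be regrown
            reallocations += 1
    return reallocations, capacity - num_children


few = simulate_child_table(3, 16)      # small composite: mostly wasted space
edge = simulate_child_table(17, 16)    # one child past the default block
many = simulate_child_table(100, 16)   # large composite: repeated regrowth
```

In this model a composite with 3 children wastes 13 of its 16 slots, while one with 100 children is regrown 6 times; a larger block size reduces the regrowths but increases the waste, which is the trade-off the paragraph describes.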
It is an objective of this invention to facilitate searches through the network hierarchy that is conventionally masked by the use of a Composite Pattern with BLOB data types. It is a further objective of this invention to allow client applications to modify individual elements within a network model without requiring a deserialization of the entire BLOB hierarchy that is used to store the network model. It is a further objective of this invention to minimize the memory allocation overhead associated with the deserialization of hierarchical BLOBs.
These objectives, and others, are achieved by providing a Composite Pattern with BLOB data types that includes a path-like construct for locating each component within a network model. Each node in the hierarchy is identified by a name that is unique within the context of its parent node. The node is also uniquely identified within the entire hierarchy by its “path”, using, for example, a concatenation of the names of the nodes in its ancestry. Database stored procedures are used to efficiently search, modify, and retrieve individual nodes from the BLOB using the database server's memory pool, so that client applications are not required to retrieve and deserialize the entire Composite-BLOB hierarchy in order to make modifications or search for individual elements, thereby substantially reducing the transfer of data between the application layer and the database. To avoid the need for repeated dynamic memory allocation during deserialization, the size of each pre-serialized composite is stored when the composite is serialized; during deserialization, the size is retrieved and used to obtain sufficient memory for the deserialized composite via a single memory allocation.
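Two of these ideas, the path-like identifier and the stored composite size, can be sketched together in Python. Everything specific in this sketch is an assumption made for illustration: the byte layout, the use of "/" as the path separator, the interpretation of "size" as the number of children, the restriction of leaf values to integers, and the function names `write_node`, `read_node`, and `find`. The sketch also runs in the client process, whereas the text above places the search in database stored procedures.

```python
import io
import struct

# Assumed wire format (not taken from the source):
#   tag byte b"C" or b"L", 2-byte name length, name bytes, then
#   for b"C": 4-byte child count followed by the serialized children;
#   for b"L": 4-byte signed integer value.


def write_node(out, name, node):
    encoded = name.encode("utf-8")
    if isinstance(node, dict):                     # composite
        out.write(b"C" + struct.pack(">H", len(encoded)) + encoded)
        out.write(struct.pack(">I", len(node)))    # size stored before children
        for child_name, child in node.items():
            write_node(out, child_name, child)
    else:                                          # leaf (integer values only)
        out.write(b"L" + struct.pack(">H", len(encoded)) + encoded)
        out.write(struct.pack(">i", node))


def read_node(src):
    tag = src.read(1)
    (name_len,) = struct.unpack(">H", src.read(2))
    name = src.read(name_len).decode("utf-8")
    if tag == b"C":
        (count,) = struct.unpack(">I", src.read(4))
        children = [None] * count                  # one allocation, exact size
        for i in range(count):
            children[i] = read_node(src)
        return (name, children)
    (value,) = struct.unpack(">i", src.read(4))
    return (name, value)


def find(node, path):
    """Resolve a '/'-separated path of node names, e.g. 'router1/eth0/mtu'."""
    name, payload = node
    head, _, rest = path.partition("/")
    if head != name:
        return None
    if not rest:
        return payload
    if isinstance(payload, list):
        next_name = rest.partition("/")[0]
        for child in payload:
            if child[0] == next_name:
                return find(child, rest)
    return None


# Hypothetical model fragment.
tree = {"eth0": {"mtu": 1500, "speed": 1000}, "vlan": 10}
buf = io.BytesIO()
write_node(buf, "router1", tree)
buf.seek(0)
root = read_node(buf)
mtu = find(root, "router1/eth0/mtu")
```

Because the child count precedes the children in the byte stream, the reader can size each child table exactly once instead of growing it as children arrive, and because names are unique within a parent, the concatenated path identifies at most one node.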
Throughout the drawings, the same reference numerals indicate similar or corresponding features or functions. The drawings are included for illustrative purposes and are not intended to limit the scope of the invention.