Many data processing applications require variable length data to be stored in memory. For example, communications messages between computers are often sent in accordance with multiple layer communications protocols. The Open Systems Interconnection (OSI) model, for instance, divides a communications process into seven layers in a protocol "stack." Certain communication tasks are assigned to certain ones of the layers, and the output of each layer conforms to a precise format. Data from an application or process running on a first host computer passes down through each OSI layer in the protocol stack on its way to the communications network. As the information descends through each layer, it undergoes a transformation that prepares it for processing by the next layer. Upon reaching the bottom layer, data is transferred over the physical medium of a communications network as an electrical signal to a receiving computer, which then processes the message in reverse order up the protocol stack through the seven OSI layers.
Accordingly, the layer protocols and interfaces therebetween specify communication between a process or program executed on one host computer's operating system and another process or program running on another computer's operating system. One such protocol is the Signaling System Number 7 (SS7) developed to meet the advanced signaling requirements of digital networks.
FIG. 1 shows an SS7 protocol stack for two host computers A and B. The SS7 model in FIG. 1 shows functional levels, sometimes referred to as TTC/ITU-T levels, alongside the traditional OSI, seven-layer protocol model. In general, OSI layers 1-3 comprise functions for the transportation of information from one location to another. The message transfer part (MTP) and the signaling connection control part (SCCP) are examples of SS7 modules which perform the OSI layer services 1-3.
OSI layers 4-7 define functions related to end-to-end communication. These layers are independent of the internal structure of the communications network. Transaction capabilities (TC) and/or user parts (UP) provide OSI layer 4-7 services. If OSI layer 7 represents the semantics of a communication, then OSI layers 1-6 are the means by which that communication is realized. End user application entities provide application layer protocols in the OSI layer number 7.
Since signaling system number 7 is used to transmit information between different "users," for example telephony users or integrated services digital network (ISDN) users, its functions are divided into a number of "user parts" (UP) as shown in FIG. 1. TCAP stands for transaction capabilities application part, ISUP for ISDN user part, and TUP for telephony user part, among many others. Each of these user parts processes signal information before and after transmission to the signaling network.
The message transfer part (MTP) reliably transports and delivers user part information across the signaling system number 7 network. The MTP also reacts to system and network failures that affect the information from the user parts and takes the necessary action to ensure that the information is safely conveyed. As shown in FIG. 1, the MTP is divided into three functional levels L1, L2, and L3. The MTP-L1 defines the physical, electrical, and functional characteristics of a signaling data link and the means to access it. The signaling link is sometimes referred to as a "bearer" and is a bidirectional transmission path for signaling messages between two signaling points. The MTP-L2 defines the functions and procedures for and relating to the transfer of signaling messages over one individual signaling link, including various ancillary functions, such as error detection, framing, bit-stuffing, etc. The MTP-L3 handles signaling network functions including message routing, discrimination, and distribution as well as signaling network management functions.
A signaling message is transferred over the signaling link in the form of signaling units. In the basic SS7 model, there are three types of signaling units differentiated by means of a length indicator. Message signal units (MSUs) carry information generated by a user part. The MSUs are passed from SS7 module to module, down through the MTP layers to the link and to the next node, where they follow the same path up through the MTP layers and are finally delivered to the opposite user part. If an error is detected in an MSU at the receiving MTP-L2, reception is not acknowledged, and the MSU is retransmitted. Link status signal units (LSSUs) and fill-in signal units (FISUs) are used by MTP-L2 to exchange control information. The LSSU is used for starting up a signaling link and for handling errors in the link. The FISU is used to keep the link running when there are no MSUs to be sent. All three types of signal units contain parameters used to acknowledge (ACK) or reject (NACK) MSUs at the transmitting MTP-L2 when a received MSU is found to be correct or in error, respectively, when examined at the receiving node's MTP-L2.
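By way of illustration only, the differentiation of signaling units by length indicator may be sketched in C as follows. The thresholds are not stated in the text above; the sketch assumes the conventional rule from ITU-T Recommendation Q.703 (a length indicator of 0 for a FISU, 1 or 2 for an LSSU, and greater than 2 for an MSU) and assumes that the buffer starts just after the opening flag, so that the length indicator occupies the low six bits of the third octet. The names su_type and su_classify are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The three signaling unit types carried on a signaling link. */
typedef enum { SU_FISU, SU_LSSU, SU_MSU } su_type;

/*
 * Classify a signaling unit from its length indicator (LI).
 * Assumption: su[] holds the unit without the opening flag, so the
 * octets are BSN/BIB, FSN/FIB, then LI in the low six bits of su[2].
 */
su_type su_classify(const uint8_t *su, size_t len)
{
    assert(su != NULL && len >= 3);   /* need at least the LI octet */

    uint8_t li = su[2] & 0x3F;        /* mask off the two spare bits */

    if (li == 0)
        return SU_FISU;               /* nothing follows the LI */
    if (li <= 2)
        return SU_LSSU;               /* one or two status octets */
    return SU_MSU;                    /* carries user part information */
}
```

Because the three types are distinguished by a single masked comparison, the receiving MTP-L2 can decide how much memory a unit needs before any user part data is examined.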
Each host computer connected to the network includes data processing hardware and software for generating and formatting messages down the protocol stack for transmission on a link and up a protocol stack for delivery to an end application. For example, the local signaling controller hardware may include a SUN (SUN is believed to be a registered trademark of Sun Microsystems, Inc.) workstation that employs multiple processors connected via appropriate buses to random access memory (RAM), read-only memory (ROM), and magnetic storage media. The RAM main memory is used for storing temporary variables, messages, or other intermediate information during execution of instructions by the processors. ROM and/or other static storage devices store static information and instructions for the processors. The magnetic storage media may also store information and instructions as well as various files.
Certain areas of the memory are occupied by processes, buffers, data streams, peripheral device driver software, and an operating system, e.g., the UNIX operating system (OS). The stored information may be generally divided into executable code and data (e.g., data packets), each of which occupies separate and well defined areas of the memory. Since much of the code and data is dynamic in the sense that its size is very often changing, a major task of the operating system is to ensure that no entity attempts to use memory being used by another entity. In essence, this is the operating system's memory management function. Because the operating system must serve an enormous range of different memory functions, the operating system's memory service functions must be very flexible and generic. Unfortunately, it is these very qualities which make the operating system memory management slow and inefficient.
As part of the generic memory management function, the operating system takes blocks of memory from its memory pool and assigns each memory block to a different task typically using some proprietary strategy. Until that task is terminated in some way, the assigned memory block may not be used by any other task. Thus, the operating system may assign a memory area for itself, another for interrupt processes, a third for input/output buffers, and another for an operating system user such as an SS7 process. Still other memory areas may be dynamically allocated at the request of the operating system or the operating system user's processes.
Dynamic allocation of memory is performed by operating system functions which process memory requests that include memory size specifications and return memory address pointers that provide the address of a first byte or octet of the assigned memory area. When the assigned memory is no longer needed, it is returned to the memory pool.
As already mentioned, this dynamic allocation and return of memory is quite cumbersome and slow. Consider the example situation in an SS7 signaling stack where the length of a message varies as it moves up and down the stack. For a message moving down the stack, the new data to be added to the message is typically a header appended to the beginning of the message. The new header is stored in a new memory area, and thereafter, the old message is copied into the memory after that new header. After the copying is completed, the old memory area is returned to the operating system. When the next process, here the next level in the stack, wants to add a new header, the same procedure is performed: the new header is stored in a new memory area, the message is copied into that new memory area after the newly stored header, a new memory area pointer is provided to the next process, and the old memory area pointer pointing to the old memory location where the message was just previously stored is returned to the operating system. A similar process is performed going up the protocol stack.
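The copy-based procedure just described may be sketched as follows, purely for illustration. The standard C library functions malloc and free stand in for the operating system's allocation and return functions, and the name prepend_by_copy and the bytes_copied counter are hypothetical, the counter being added only to make the copying cost visible.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Prepend a header to a message in the conventional way: obtain a new
 * memory area, store the header in it, copy the old message after the
 * header, and return the old area to the allocator.
 * Returns the new area, or NULL on allocation failure (in which case
 * the old message is left untouched).
 */
unsigned char *prepend_by_copy(unsigned char *msg, size_t msg_len,
                               const unsigned char *hdr, size_t hdr_len,
                               size_t *bytes_copied)
{
    assert(msg != NULL && hdr != NULL && bytes_copied != NULL);

    unsigned char *area = malloc(hdr_len + msg_len);
    if (area == NULL)
        return NULL;

    memcpy(area, hdr, hdr_len);             /* new header first */
    memcpy(area + hdr_len, msg, msg_len);   /* then the whole old message */
    *bytes_copied += msg_len;               /* cost paid at every level */

    free(msg);                              /* old area goes back */
    return area;
}
```

In this sketch, passing a four octet message through two levels that each add a two octet header copies ten octets of message data (four at the first level and six at the second), in addition to two allocations and two returns.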
Because the messages have variable length with the new messages usually longer going down the stack and shorter going up the stack, copying operations are performed at least once at each process. Consequently, there is quite a bit of copying in any protocol stack. The ISUP, for example, contains over twenty processes.
As a result of storing and copying these different length messages, various memory areas dynamically allocated by the operating system end up being separated by small "free" memory areas, each of which is too small to be used. This result is referred to as "memory fragmentation."
One manner of dealing with memory fragmentation is for the operating system to "pack" the used memory sections together in one area of the memory. But to do this, the operating system must maintain a look-up table of pointers which matches old pointers with new pointers so that when a process asks for a particular piece of data using the old pointer, that process receives the correct data using the corresponding new pointer. This "packing" of data in memory is referred to as "garbage collection" and is a sufficiently burdensome task that the operating system must often stop serving its application processes while it repacks the memory. In real time type applications such as telecommunications, this total lack of access while the data is being rearranged, and the substantial delays caused by pointer translation when data is being accessed, are very often unacceptable.
Another memory management problem is the lack of memory. One way in which the operating system may try to solve the problem of insufficient memory is to move data associated with a first task away to a disk or other memory to free up that area for use by a second task. When the first task needs its data, the data is "swapped." Such swapping adds delays to data access operations in addition to the more basic problem of not being able to access the data when it is needed.
Yet another memory allocation and management problem relates to the architecture of the central processing unit(s) and the memory access bus. For example, memory may need to be addressed in blocks of two, four, or eight octets due to the bit width of the CPU or the bit width of the access bus. Otherwise there is a bit "misalignment" between the data structure accessed from memory and the hardware that processes the accessed data.
For efficiency, it is important that the blocks of data accessed from memory be bit-aligned with the CPU and access bus hardware. In other words, each block of accessed memory should begin with an address which is an integral multiple of the bit width of the CPU and access bus. This is because the CPU fetches data from memory addresses which are integral multiples of that bus bit width. If the bus is 32 bits wide (four octets) and a four octet message is stored in memory at the beginning of a second octet, two bus cycles (rather than one) are needed to access that message. In addition, that "misaligned" message accessed in two bus cycles must be rebuilt by the CPU, which may take six or more CPU cycles. On the other hand, if this four octet message is aligned to begin at an integral multiple of the bus width of 32 bits, e.g., 32, 64, 96, etc., only one bus cycle and one CPU cycle are necessary to access the message.
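The bus cycle arithmetic in the preceding example may be checked with the following minimal sketch. It assumes a simplified model in which each bus cycle transfers exactly one naturally aligned bus word, and it expresses addresses in octets rather than bits (so that bit addresses 32, 64, and 96 correspond to octet addresses 4, 8, and 12). The names bus_cycles and align_up are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of bus cycles needed to fetch len octets starting at octet
 * address addr over a bus that is bus_octets wide, assuming that each
 * cycle fetches one bus word beginning at a multiple of bus_octets.
 */
size_t bus_cycles(size_t addr, size_t len, size_t bus_octets)
{
    assert(bus_octets > 0 && len > 0);

    size_t first_word = addr / bus_octets;
    size_t last_word = (addr + len - 1) / bus_octets;
    return last_word - first_word + 1;
}

/* Round addr up to the next integral multiple of bus_octets. */
size_t align_up(size_t addr, size_t bus_octets)
{
    assert(bus_octets > 0);
    return (addr + bus_octets - 1) / bus_octets * bus_octets;
}
```

Under this model, bus_cycles(1, 4, 4) yields 2 for the four octet message stored beginning at the second octet, while bus_cycles(4, 4, 4) yields 1 once the same message is moved to the aligned address given by align_up(1, 4).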
Different approaches have been proposed for handling variable length messages. For example, U.S. Pat. No. 5,367,643 to Chang et al. discloses a generic high bandwidth adapter that includes a data packet memory used to temporarily store variable length data packets. The packet memory is segmented into a plurality of buffers, and each data packet is stored in one or more buffers as required by the length of that packet. A generic adaptive manager organizes data packets in the buffers and into queues for further processing or data transfer. In particular, the packet memory is segmented into a set of 256-byte buffers. Packets are stored in one or more of these buffers and deleted or routed as appropriate. However, Chang's buffers all have a uniform size. U.S. Pat. No. 5,426,424 to Vanden Heuvel et al. discloses a memory manager that allocates and deallocates memory blocks. Rather than allocating a minimum block of memory much larger than an actual received message, Vanden Heuvel allocates memory blocks in either contiguous or non-contiguous fashion. Like Chang's buffers, Vanden Heuvel's data blocks are all of the same size.
What is needed is a new memory allocation and management approach that overcomes the drawbacks described above and provides faster and more efficient memory allocation and management.
It is therefore an object of the present invention to overcome the above-described problems.
It is an object of the present invention to off-load certain memory access and management functions from the operating system and perform those functions more efficiently.
It is a further object of the present invention to provide a memory structure and store data in that memory structure so that various delays and inefficiencies caused by memory fragmentation/garbage collection, lack of memory, data swapping and misalignment with data processing hardware are avoided.
These and other objects are met by the present invention which provides for optimized memory management to efficiently and dynamically allocate memory. A memory manager (separate from the operating system) requests a large area of memory from the operating system. From the viewpoint of the operating system, that large area of memory is a fixed memory area neither moved nor altered. The memory manager divides that fixed memory area into an integral number of classes or groups. Each memory class includes same-size blocks of memory linked together by pointers. The memory block sizes are different for each class, and the sizes are selected to conform to and align with CPU(s) and memory access bus architecture as well as to accommodate the various sizes of data expected to be processed for a particular application. For example, in the SS7 application described above, there are smaller messages or message segments such as alarms and message headers and longer messages or message segments such as MSUs.
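One possible way of dividing a single fixed memory area into classes of same-size blocks linked together by pointers is sketched below. This is an illustrative sketch only and not the implementation of the invention. The class sizes, block counts, and the eight octet alignment BUS_OCTETS are assumed values, malloc stands in for the single large request made to the operating system, and it is assumed that malloc returns an address aligned to at least eight octets, as is the case on common platforms. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_CLASSES 3   /* assumed number of block classes */
#define BUS_OCTETS  8   /* assumed CPU and bus alignment in octets */

/* While a block is unused, its first octets hold the link pointer. */
typedef struct block { struct block *next; } block;

typedef struct {
    size_t block_size;  /* octets per block, the same within a class */
    size_t count;       /* number of blocks created in this class */
    block *free_list;   /* linked list of unused blocks */
} mem_class;

typedef struct {
    unsigned char *area;            /* the one fixed area */
    mem_class cls[NUM_CLASSES];
} mem_pool;

/* Request one area and carve it into NUM_CLASSES classes. */
int pool_init(mem_pool *p, const size_t sizes[NUM_CLASSES],
              const size_t counts[NUM_CLASSES])
{
    size_t total = 0;
    for (int i = 0; i < NUM_CLASSES; i++) {
        /* every size keeps the following block aligned */
        assert(sizes[i] >= sizeof(block) && sizes[i] % BUS_OCTETS == 0);
        total += sizes[i] * counts[i];
    }

    p->area = malloc(total);        /* single request, never moved */
    if (p->area == NULL)
        return -1;

    unsigned char *cursor = p->area;
    for (int i = 0; i < NUM_CLASSES; i++) {
        mem_class *c = &p->cls[i];
        c->block_size = sizes[i];
        c->count = counts[i];
        c->free_list = NULL;
        for (size_t n = 0; n < counts[i]; n++) {
            block *b = (block *)cursor;
            b->next = c->free_list; /* link into the class list */
            c->free_list = b;
            cursor += sizes[i];
        }
    }
    return 0;
}

/* Count the unused blocks of a class by walking its list. */
size_t class_free_count(const mem_class *c)
{
    size_t n = 0;
    for (const block *b = c->free_list; b != NULL; b = b->next)
        n++;
    return n;
}

/* True if a block address is a multiple of BUS_OCTETS. */
int block_is_aligned(const void *b)
{
    return (uintptr_t)b % BUS_OCTETS == 0;
}
```

Because every block size in the sketch is a multiple of BUS_OCTETS and the area itself starts on an aligned address, each block begins on an aligned address without any per-block adjustment. A small class might serve alarms and headers while a larger class serves MSUs.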
Each memory block in a class or group has the same number of octets. The memory manager maintains a separate linked list of unused blocks within each class/group and ensures that each memory block is zeroed initially and zeroed again after release by a process previously assigned to it. When a block of memory is loaned to a particular process, one or more bits are set to indicate that it is in use. Ultimately, the memory manager monitors how many blocks are in each class, how many of these blocks are loaned out, and how many are available. Such monitoring is useful not only in memory allocation but also in debugging errors.
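The bookkeeping for a single class may be sketched as follows, again only as an illustration under stated assumptions. A static array stands in for the portion of the fixed area belonging to the class, the block size of 32 octets and the count of four blocks are arbitrary, and an in_use field kept beside the data plays the role of the "one or more bits" mentioned above. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_OCTETS 32   /* assumed block size for this class */
#define BLOCK_COUNT  4    /* assumed number of blocks in this class */

typedef struct blk {
    struct blk *next;                  /* link in the unused list */
    unsigned in_use;                   /* set while the block is loaned */
    unsigned char data[BLOCK_OCTETS];  /* octets handed to a process */
} blk;

typedef struct {
    blk blocks[BLOCK_COUNT];
    blk *unused;                       /* linked list of unused blocks */
    size_t total;                      /* blocks in the class */
    size_t loaned;                     /* blocks currently loaned out */
} blk_class;

/* Zero every block and link all of them into the unused list. */
void class_init(blk_class *c)
{
    memset(c, 0, sizeof *c);
    for (size_t i = 0; i < BLOCK_COUNT; i++) {
        c->blocks[i].next = c->unused;
        c->unused = &c->blocks[i];
    }
    c->total = BLOCK_COUNT;
}

/* Loan one block to a process; returns NULL if none is available. */
unsigned char *class_loan(blk_class *c)
{
    blk *b = c->unused;
    if (b == NULL)
        return NULL;
    c->unused = b->next;
    b->next = NULL;
    b->in_use = 1;
    c->loaned++;
    return b->data;
}

/* Take a block back: zero it, clear the flag, relink it. */
void class_release(blk_class *c, unsigned char *data)
{
    blk *b = (blk *)(data - offsetof(blk, data));

    /* catches foreign pointers and double releases while debugging */
    assert(b >= c->blocks && b < c->blocks + BLOCK_COUNT && b->in_use);

    memset(b->data, 0, BLOCK_OCTETS);
    b->in_use = 0;
    b->next = c->unused;
    c->unused = b;
    c->loaned--;
}

/* Blocks still available for loan. */
size_t class_available(const blk_class *c)
{
    return c->total - c->loaned;
}
```

The in_use check in class_release shows how the same bookkeeping that serves allocation can also expose errors such as a block being released twice.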
Incoming messages are analyzed and parsed based upon definitions of message structures to be received in a particular application. Messages and segments of messages are stored in the appropriate size memory blocks from memory block classes best configured for the size of the message/message segment. For example, a message header may be stored in a memory block of one class and the message data stored in a memory block of another class linked to the first block. The more closely the memory block class sizes match the lengths of the messages being processed, the more efficiently memory is allocated and accessed.
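Selecting the class best configured for a given message or message segment may be as simple as the following sketch, which picks the smallest class whose blocks can hold the segment. The three class sizes are assumed values chosen only for illustration, and the names class_for_length and wasted_octets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed class block sizes in octets, in ascending order. */
static const size_t CLASS_OCTETS[] = {16, 64, 280};
#define NUM_CLASSES (sizeof CLASS_OCTETS / sizeof CLASS_OCTETS[0])

/*
 * Return the index of the smallest class whose blocks can hold len
 * octets, or -1 if the segment is longer than the largest block.
 */
int class_for_length(size_t len)
{
    for (size_t i = 0; i < NUM_CLASSES; i++)
        if (len <= CLASS_OCTETS[i])
            return (int)i;
    return -1;
}

/* Octets left unused when len octets are stored in the chosen class. */
size_t wasted_octets(size_t len)
{
    int i = class_for_length(len);
    assert(i >= 0);
    return CLASS_OCTETS[i] - len;
}
```

With these assumed sizes, a 60 octet segment leaves four octets unused in a 64 octet block, whereas a 17 octet segment leaves 47 unused, which illustrates why class sizes are best chosen to match the message lengths expected in the application.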
Once the memory blocks are defined in each class, those blocks never move. Moreover, data is written into memory only once for each message. Instead of moving a memory block, a memory pointer to that fixed block is passed to another requesting process. By giving pointers to other processes, the stored information is effectively passed to the other processes through the pointers. Because each of the memory blocks is and remains aligned with the CPU and bus structure from its initial creation, the data stored and accessed from each memory block is aligned.
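The effect of passing pointers instead of copying may be illustrated by the following sketch, in which a header held in one block is placed in front of message data held in another block simply by linking the two. The segment structure, its 64 octet capacity, and the function names are hypothetical and are not taken from the invention.

```c
#include <assert.h>
#include <stddef.h>

/* One fixed block holding a message segment (assumed layout). */
typedef struct segment {
    struct segment *next;     /* following segment of the same message */
    size_t len;               /* octets of data[] actually used */
    unsigned char data[64];
} segment;

/*
 * Place a header segment in front of a message by linking. Only one
 * pointer is written; no message octets are copied or moved.
 */
segment *prepend_by_link(segment *hdr, segment *msg)
{
    assert(hdr != NULL);
    hdr->next = msg;
    return hdr;               /* this pointer is passed to the next process */
}

/* Total length of a message made of linked segments. */
size_t message_length(const segment *m)
{
    size_t total = 0;
    for (; m != NULL; m = m->next)
        total += m->len;
    return total;
}
```

Each level of a protocol stack that adds a header in this manner writes the message data zero additional times, and the data remains at the aligned address where it was first stored.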