This invention is related to data processing systems and their architecture. In one aspect, it relates to a network component for retransmitting data packets in accordance with ID codes embedded therein in a distributed manner.
The classification and management of data is one of the most difficult tasks faced by corporations, government entities, and other large users of information. Companies must classify their data in such a way as to make it easy for buyers to find and purchase their products. Data exchanges face a bigger challenge in that they must work with multiple companies and develop a comprehensive classification system for their buyers.
One common way to create a search/classification system for specific products is to access and use government and/or industry-specific classification systems (i.e., classification databases). However, no existing classification database is comprehensive enough to address all the issues associated with building a classification system. These issues include: uniform numbers for products that cross multiple industries, restricting products from inclusion in classification, and non-usage of slang or industry-standard language to access or classify products. The classification databases frequently do not address all the products, thus resulting in inconsistencies even when companies use the same classification system.
Additionally, many of the various classification systems conflict with each other. For example, a product might have several classification numbers if it crosses multiple industries. Still other companies might use third-party classification systems approved by a governmental entity. Such programs require companies to pay multiple fees and go through a lengthy administrative process. Even then, they may not cover all products in an industry. Companies must make a conscious decision to initiate, implement and maintain these programs. These efforts can be costly, and for this reason, compliance is generally not high.
A need therefore exists for a data processing system which automatically generates identification codes for specific products. Preferably, companies could use the automatically-generated identification codes in place of their existing identification codes. More preferably, the use of the automatically-generated identification codes can be phased in gradually as the user base expands.
Under current practices, companies create search engines by developing hierarchies and families of products. They may create a thesaurus to encompass slang words. Companies often use drop-down menus, key words and product description capabilities to enhance their systems. It is desired to classify the data in such a way as to minimize the number of responses generated by a search, and therefore more effectively guide the buyer through the system. However, under current practices, most exchanges offer barely adequate search capabilities for their buyers. Buyers must click through numerous drop-down menus and then sort through multiple entries to accomplish their objectives. In many instances the buyer will fail to find the product being sought. These existing processes could therefore be characterized as cumbersome, time consuming, frustrating and ineffective. A need therefore exists for a product classification system which can facilitate simple, rapid and effective searching by prospective buyers.
Another challenging data management task is the transmission of data between dissimilar systems. Even within the same corporate organization it is very common to find different system types, applications and/or information structures being used. Transmitting data between such systems can be a time-consuming and expensive task. Under current practices, data transfer between dissimilar systems is often facilitated by the use of customized software applications known as "adapters". Some adapters "pull" data, i.e., extract it from the source system in the data format of the host system or host application, convert the data into another data format (e.g., EDI) and then sometimes convert it again into yet another data format (e.g., XML) for transmission to the destination system. Other adapters "push" data, i.e., convert the data from the transmission data format (e.g., XML) to an intermediate data format (e.g., EDI) if necessary, then convert it to the data format of the host system or application at the destination system, and finally load the data into the destination system. All of these adapter steps are performed on the host systems using the host systems' CPUs. Thus, in adapter-based systems, CPU load considerations may affect when and how often data pulls can be scheduled. For example, data pulls may be scheduled for late nights so as not to slow down the CPU during daytime OLTP (on line transaction processing). A need therefore exists for a system architecture which can allow the transmission of data between dissimilar systems while minimizing the associated load imposed on the host system CPU.
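By way of illustration only, the chained conversions performed by a "pull" adapter and a "push" adapter can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the host record is assumed to be a flat mapping of string keys to string values, and the "intermediate" format is a simplified key=value stand-in rather than actual EDI. It is intended only to show that each transfer involves several conversion steps, all of which would run on the host system.

```python
import xml.etree.ElementTree as ET


def host_to_intermediate(record: dict) -> str:
    # Hypothetical stand-in for an intermediate flat format (NOT real EDI):
    # key=value pairs joined by '*'.
    return "*".join(f"{key}={value}" for key, value in record.items())


def intermediate_to_xml(segment: str) -> str:
    # Convert the intermediate segment into a simple XML transmission format.
    root = ET.Element("record")
    for pair in segment.split("*"):
        key, value = pair.split("=", 1)
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


def pull(record: dict) -> str:
    # "Pull" adapter: host format -> intermediate format -> XML.
    return intermediate_to_xml(host_to_intermediate(record))


def xml_to_intermediate(xml_text: str) -> str:
    # Reverse of intermediate_to_xml, run at the destination.
    root = ET.fromstring(xml_text)
    return "*".join(f"{child.tag}={child.text}" for child in root)


def intermediate_to_host(segment: str) -> dict:
    # Rebuild the destination host record from the intermediate segment.
    return dict(pair.split("=", 1) for pair in segment.split("*"))


def push(xml_text: str) -> dict:
    # "Push" adapter: XML -> intermediate format -> host format.
    return intermediate_to_host(xml_to_intermediate(xml_text))
```

In this sketch a single record passes through four conversions between source and destination, which suggests why the CPU cost of adapter-based transfers grows quickly with data volume. The sketch does not handle empty values or values containing the delimiter characters.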
Network routers are known which direct data packets on a network in accordance with ID codes embedded in the data packets. However, these routers typically direct data packets between similar nodes on a single network. It is now becoming increasingly common to transmit data across multiple networks, and even across different types of networks. A need therefore exists for a router which can direct data over networks of different types in accordance with associated ID codes. A need further exists for a router which can automatically transform a data packet having a first data format into a second data format.
It is well known that when large amounts of data are being transmitted between systems, a system error (i.e., stoppage) and/or data loss (i.e., dropout) may occur. With conventional adapter-based system architectures, debugging a system stoppage can be very challenging because of the large number of conversion processes involved, and because most systems do not have an integrated way to indicate the point at which processing stopped, relying instead upon error logs. A need therefore exists for a system architecture in which processing status information is an integral part of the data packets transmitted over the networks.
Further, with adapter-based systems, even after the processes have been debugged, it is often necessary to wait (e.g., until the time of day when host system CPU demand is low) to replace lost data in order to avoid adverse impact on the company's business. For example, if the host system is used for OLTP (on line transaction processing) during the day, pulling bulk data from the host system in order to replace data lost in a previous data transfer may be delayed until the late night hours. Of course, the delay in processing the data can have an adverse impact of its own. A need therefore exists for a system architecture which allows for the replacement of lost data while minimizing the impact on the source host system.
The present invention disclosed and claimed herein, in one aspect thereof, comprises a method for load balancing a distributed processing system having a plurality of processing nodes associated therewith. A portion of a process is first received at one of the processing nodes for processing of that portion thereat. The amount of processing that will be required at that node to complete the received portion of the process is then determined as a processing load on that one processing node. A determination is then made as to whether the amount of processing exceeds a predetermined threshold. If the amount of processing exceeds the predetermined threshold, then information is transferred to other nodes in the network indicating that the one of the processing nodes is unavailable for further processing.
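By way of illustration only, the load balancing steps summarized above can be sketched as follows. This is a minimal sketch and not the claimed implementation: the class name, the field names, the size-based cost estimate, and the choice to accumulate load across received portions are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingNode:
    name: str
    threshold: float                 # predetermined threshold (hypothetical units)
    load: float = 0.0                # processing load currently pending on this node
    peers: list = field(default_factory=list)
    unavailable_peers: set = field(default_factory=set)

    def estimate_processing(self, portion: str) -> float:
        # Assumption: the processing required is proportional to the
        # size of the received portion.
        return float(len(portion))

    def receive(self, portion: str) -> bool:
        # Determine the processing required for the received portion and
        # add it to this node's pending load (accumulation is an assumption).
        self.load += self.estimate_processing(portion)
        # Compare the load against the predetermined threshold.
        if self.load > self.threshold:
            # Inform the other nodes that this node is unavailable.
            for peer in self.peers:
                peer.unavailable_peers.add(self.name)
            return False
        return True
```

A node that returns False from receive has told its peers not to route further work to it. How a node later advertises that it is available again is outside the scope of this sketch.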