1. Field of the Invention
The present invention relates generally to an Internet Search Engine Optimizer method and system, hereinafter referred to as the Optimizer. More particularly, the present invention relates to an interface product that works independently and in parallel with a browser client and search engine supercomputer server architecture that gathers, analyzes and distills input information interactively. The Optimizer analyzes the end user's input and converts it into a search pattern. For each valid search pattern the Optimizer continuously maintains and updates a precalculated and preprocessed array or collection of best fit web page responses.
Search Engines are based on Boolean algebra eigenvector algorithms that are used to parse and filter information indices until the top page ranks are determined and displayed to the end user. Unfortunately, some specific keyword combinations may be too narrow and confound a search by hiding optimal results. Search Engines are predominantly configured to perform one request to one reply search patterns. Each search is processed from the ground up without taking into account many requests belonging to one reply. A session consists of consecutive related and unrelated search requests to reach the final destination.
The Optimizer simultaneously keeps in existence, for each search pattern, its corresponding virtual simulation environment that contains all relevant bound web pages. Each virtual simulated environment possesses a relative Master Index. The Optimizer continuously purifies and synchronizes the plurality of relative Master Indices, which permits it to match/merge and then correlate the Internet's Master Index in real time.
The Optimizer continuously scans the environment in real time to detect new content of significantly different quality and to update each search pattern's virtual environment partition, relative Master Index and associated collections of top (n) pages. The Optimizer heuristically reads the content of each web page by page, paragraph, sentence, and grouping of words. The existing Master Index has an absolute rank value for each web page.
The Optimizer rank value is dynamically adjusted by matching independent variables and related keywords belonging to the search pattern to generate a content value. The Optimizer “cherry picks” the best content value web pages as output. The output is forward chained back to the end user's terminal and displayed.
The Optimizer is a method and system for simulating Internet browser search capacities that cleans, standardizes, organizes, and transforms the massive amount of data into a lingua franca comprising valid keywords, keyword patterns for a given language, and unique geospatial patterns contained in the Internet, collectively known as patterns that exist in web pages. The comprehensive collection of search patterns with their relative Master Indices is stored and continuously updated as web crawlers detect changes in the environment.
Each Search Pattern consists of at least one independent variable, e.g. (I), (J), (K), (X), (Y) and (Z). Search Patterns with 0 independent variables use Boolean algebra techniques that find the final destination within the massive (U) or Internet environment.
2. Related Art (U.S. Patent Application Ser. No. 10/926,446)
Partial Differential Equation Vectors Model: Telecom Super Switch teaches that the traditional vector is inefficient. For example: two persons live in Miami; one calls Guatemala and the other calls Argentina at the same time. The call to Guatemala travels 2,000 km and the call to Argentina travels 6,000 km, and both speak for 1 hr. The person calling Argentina spends $1, whereas the person calling Guatemala spends $9. Distance alone does not determine cost; it is just one independent variable in this equation. The equation requires a plurality of independent variables. Thus we must use Partial Differential Equation Vectors instead of a traditional vector.
Applying Set Theory to break the call into circuits that in the telecommunications jargon are referred as Legs. A call has at least two unique and distinct Legs: Leg A, origin, and Leg B, destination. Using Set Theory, the environment U can be divided into three independent networks: Fixed (X), IP Telephony (Y) and Wireless (Z).
A Simple Call exists when the call uses a single network (X, Y or Z). A Hybrid Call exists when the call uses two independent networks such as (X, Y), (X, Z) or (Y, Z). A Complex Call exists when the call must roam outside of the environment, Leg X for the origin, and Leg Y for the destination. When Leg A belongs to a competitor's wireless network, the usage fee surcharge of the network while roaming can be described as Leg V, and when Leg B is under the same constraints the system uses Leg W.
E.g. a call uses three different networks, Fixed, IP Telephony and Wireless (I, J, K), and each independent variable solves the billing entity and resultant vector for the call. The Switch controlling the call uses its Partial A and Partial B functions to create a final resultant vector that includes all the circuits belonging to (I, J, K) for just one call. There are three independent billable calls, one per network, yet in fact there is only one call.
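The match/merge of per-network circuits into a single billing entity can be sketched as follows. This is a minimal illustration only: the `PartialVector` record, the `merge_partial_vectors` helper and the per-minute rates are hypothetical and are not taken from the related application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialVector:
    """One billable circuit (leg) of a call on a single network."""
    call_id: str
    network: str             # "fixed", "ip" or "wireless"
    minutes: float
    rate_per_minute: float   # hypothetical rate, for illustration only

    @property
    def cost(self) -> float:
        return self.minutes * self.rate_per_minute


def merge_partial_vectors(partials):
    """Match partial vectors by call_id and merge each group into one billing entity."""
    calls = {}
    for p in partials:
        entity = calls.setdefault(p.call_id, {"networks": [], "total_cost": 0.0})
        entity["networks"].append(p.network)
        entity["total_cost"] += p.cost
    return calls


# Three circuits, one per network, that together form a single call.
partials = [
    PartialVector("call-1", "fixed", 60, 0.01),
    PartialVector("call-1", "ip", 60, 0.02),
    PartialVector("call-1", "wireless", 60, 0.05),
]
billing = merge_partial_vectors(partials)
```

Each network contributes its own partial cost, and the merged entity is the one call that is finally billed.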
3. Related Art: (U.S. Patent Application Ser. No. 10/852,394)
Computer Network System: consists of a plurality of nodes, where each one is programmed with Artificial Intelligence to perform predefined ad hoc tasks that are logistically rationalized based on the current conditions of the environment. The computer network system is synonymous with Superset(U). The cluster is divided into three geospatial tiers: a) Global, b) Regional, and c) Local. Each tier has the following functions:
a. Provisioning.
b. Total Quality Management (or TQM).
c. Data Manipulation.
d. Management Information Systems (or MIS).
e. Expert Information Systems (or EIS).
f. Inventory Control.
System Nodes: All work collectively and independently from each other, and simultaneously in parallel analyze, evaluate, gather and process information from the environment in real time, from incipiency upon receiving the fuzzy logic piece of information that triggers a new task or updates pending activities. Each node is assigned to a Superset(I), Set(I, J), or Subset(I, J, K) cluster tier, and to the geospatial domains (X) or global, (Y) or regional, and (Z) or local, to create sub cluster Elements (I, J, K, X, Y, Z) that help to build the managerial hierarchy as follows:
Managerial Hierarchy: The Summit Tier coordinates Business Intelligence and Invoicing databases via the Internet that allows users to have access to their information in real time. The Middleware Tier manages UCommerce warehouses based on geographical area. The Lower Tier controls a plurality of points of presence belonging to 3rd party suppliers, wholesalers and retailers, and collectively constitutes the workhorse of the system.
Node Synchronization and Buffer Resources: Every predefined cycle each node synchronizes the latest inventory and optimizes via artificial intelligence programming its organizational management logistics. Nodes can request siblings for any excess buffer resources to complete a task using vertical and lateral synergy. Parent nodes can use their chain of command to coordinate their subordinates to complete a task. Members of different regional clusters can synergistically collaborate to process tasks. Each node is able to replace and perform the organizational task of at least one node and collectively the computer network system engulfs a global supplier.
Eliminates the Spaghetti Phenomena: nodes interaction with the environment gathers, distills, analyzes and then standardizes and converts the raw information into primed lingua franca data that is quantified, qualified, organized and transformed, so that Information Certainty is achieved and thus removes the chaos and anarchy or Spaghetti Phenomena.
Primes Vector CDR: Lingua franca messages are primed as a Vector CDR and contain the vector trajectory and all pertinent transactional segment information. Prior art sends all the transactional segments to a centralized billing data warehouse that match/merges and correlates the information into a final billing entity. In contrast, the computer network assigns a hierarchical owner, plots the vector trajectory circuit by circuit, and activates all nodes relevant to the transaction so that nodes can communicate amongst themselves via forward and rearward chaining. Nodes send all dynamic and fixed costs to the hierarchical owner so it can match/merge and correlate the rated billing entity without a centralized billing data warehouse.
Interact with the Human Resources: The human resources of the organization proactively can use business intelligence software to send parameters to the computer network system and directly control their own network, and then send command instructions with the latest conditions of the environment so the computer network system can optimally analyze, coordinate, prioritize and synchronize throughput.
Multiple Tiers of Nodes: Middleware and Summit tier nodes perform data warehouse functions, and monitor and control their chains of command and virtually simulate the organization. Lower tier nodes remove redundancy, geographically distribute activities, and update information.
Avoids Taxing the Throughput: The computer network system monitors the limited resources and capacities of the network to avoid taxing available throughput in real time. Each node can create, plot and update purchase orders as soon as new relevant messages from the environment are detected.
Uses Synergy to Maximize Throughput: Upon receiving environment command instructions each node can manage and organize the flow of information of its subordinates along predefined point A to point B routes to avoid clogs and saturation. Each node via synergy attempts to maximize throughput, and assigns, prioritizes and shares with other nodes that have substantial buffer resources, since unused resources are considered waste, which is one of the confounding variables directly related to creating the Spaghetti Phenomena.
Analyzes Network Traffic: Network traffic is analyzed by tier as the informational traffic is measured based on the latest command instructions and known routing throughput limitations of each given domain. The summit nodes of each tier perform the non obvious task of removing complexity in order to be a real time system by eliminating data redundancy, filtering, quantifying, qualifying data as good or garbage, and minimizing waste before beginning to transmit the data through the managerial hierarchy system.
Informational Certainty: Nodes are programmed to remove the Spaghetti Phenomena at incipiency one transaction at a time to reach Informational Certainty at the organizational level to be considered a real time invention.
Stabilizes the Flow of Information: Summit and Middleware nodes stabilize the flow of information concerning financial conditions, inventories, shipping costs and tariffs required for billing, and update the XLDB database with trending statistics used to optimize throughput. Each node is autonomous, and by means of the managerial hierarchical synergy works in parallel with other nodes to work as a single unit. Each node processes network information and then simulates, plots, maps, tracks and vectors each message to create a virtual instance of the organizational environment.
Real Time System: Once the ‘Spaghetti Phenomena’ is eliminated, informational certainty is achieved and thus a state of balance, harmony and proportion exists and the distributed configuration removes the need for a central mainframe. Hence, a real time solution consists of synergistically synchronizing all the computer network system functions.
Evaluates Network Resources: Each node has its own location identification and is assigned to a local, regional or global geospatial domain. Each activity and purchase order is processed in parallel, starting from the point of origin and ending at the point of destination. The computer network system rearward chains the routing vector information through the simulation network to the point of origin and analyzes and evaluates the best usage of network resources as follows:
a. Administer, coordinate, control, manage, synchronize and transform the network.
b. Use Business Intelligence to predict when a customer becomes dissatisfied.
c. Manage the flow of money in real time.
d. Send summarized information packets to their organizational subordinates.
e. Assign a cost to each activity and limit each resource.
f. Use synergy to load balance the demand on the organization's resources.
g. Always work at maximal assigned throughput, with redundancy to compensate for network faults.
h. Work in parallel with the simulated Legacy System.
i. Parent nodes create command messages with resource allocation instructions.
j. Create partial vectors that measure one independent environment.
l. Match/merge all partial vectors to create the final billing entity or purchase order.
4. Related Art (U.S. patent application Ser. No. 11/584,941/Issued U.S. Pat. No. 7,809,659)
XCommerce (2000): based on UCommerce, simulates the entire superset of valid keyword regular expression requests and converts the result set into vector based statistical data that optimizes accuracy. XCommerce (2000) is the deductive reasoning server side supercomputer that simulates, standardizes and transforms the Internet into a plurality of concurrently working environments using a Managerial Hierarchical method of indexing and searching as follows:
Managerial Hierarchical Relationship Indexing: a request is broken down into keywords and clusters, and then converted into a search pattern that optimally minimizes the quantity of valid web pages with regards to a given search.
Determining what is Relevant and Irrelevant: Keyword and Cluster: serve as the basis of each Managerial Hierarchical Relationship Index to analyze and distill what is relevant and irrelevant. Irrelevant web pages are discarded completely from analysis.
Partition the Environment into Blocks: the environment is subdivided into a plurality of blocks that are arranged based on Managerial Hierarchical levels as follows:
Each Search Pattern restricts the geometric rate of growth of the Internet environment by creating the relevant environment that is used by all the managerial relationship levels when purifying the search process.
The Internet environment is distilled by applying the three Managerial Hierarchical levels: the (1001) primary independent variable creates the (720) Simple Pyramid or Block that maps an improved environment; the (1002) secondary independent variable creates the (730) Hybrid Pyramid or Sub Block that maps an optimal environment; and the (1003) tertiary independent variable creates the (740) Complex Pyramid or Mini Block that maps an optimal solution.
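The nesting of Block, Sub Block and Mini Block can be pictured as three successive filters over the same set of pages. The sketch below assumes a toy in-memory page list and plain keyword membership as the matching rule; the `partition` function and the sample pages are hypothetical and only illustrate the narrowing effect.

```python
def partition(pages, primary, secondary, tertiary):
    """Return the nested Block, Sub Block and Mini Block for one search pattern."""
    # Primary independent variable: improved environment (Block).
    block = [p for p in pages if primary in p["keywords"]]
    # Secondary independent variable: optimal environment (Sub Block).
    sub_block = [p for p in block if secondary in p["keywords"]]
    # Tertiary independent variable: optimal solution (Mini Block).
    mini_block = [p for p in sub_block if tertiary in p["keywords"]]
    return block, sub_block, mini_block


# Hypothetical pages, for illustration only.
pages = [
    {"url": "a", "keywords": {"american", "civil", "war", "1863", "gettysburg"}},
    {"url": "b", "keywords": {"american", "civil", "war", "1863"}},
    {"url": "c", "keywords": {"american", "civil", "war"}},
    {"url": "d", "keywords": {"civil", "engineering"}},
]
block, sub_block, mini_block = partition(pages, "war", "1863", "gettysburg")
```

Each level only examines what the previous level kept, so the tertiary filter never touches pages already discarded as irrelevant.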
Identifies Static Search Patterns: the computer network system determines whether the search pattern already exists and, if so, obtains the top (n) pages from the databases and sends the output to the end user.
Calculates Dynamic Search Patterns: uses relationship indices to create optimal size partitions and compares the remaining keywords and clusters to determine if they match against the content of the top (n) pages. When a match occurs, each web page value is dynamically adjusted by each keyword or cluster relative vector value, and the top (n) pages are picked.
Finds New Search Patterns: stores into the database each new search pattern and its associated top (n) pages.
Displays Top (n) pages: Sends and displays the top (n) pages to the end user's terminal.
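The static, dynamic and new search pattern steps above can be sketched as a cached lookup. This is a simplified illustration under stated assumptions: the `top_pages` function, the dictionary cache, the substring match and the numeric weights are all hypothetical stand-ins for the databases and relative vector values named in the text.

```python
def top_pages(pattern, remaining_keywords, pages, cache, n=3):
    """Return the top (n) URLs for a search pattern, reusing stored results when present."""
    key = (pattern, tuple(sorted(remaining_keywords)))
    if key in cache:
        # Static search pattern: the result is already stored.
        return cache[key]
    scored = []
    for page in pages:
        # Dynamic search pattern: start from the page's existing rank value
        # and adjust it by the weight of every matching keyword.
        value = page["rank"]
        for keyword, weight in remaining_keywords.items():
            if keyword in page["text"]:
                value += weight
        scored.append((value, page["url"]))
    scored.sort(key=lambda item: (-item[0], item[1]))
    result = [url for _, url in scored[:n]]
    # New search pattern: store it with its top (n) pages for later requests.
    cache[key] = result
    return result


# Hypothetical pages and weights, for illustration only.
pages = [
    {"url": "p1", "rank": 0.9, "text": "civil war overview"},
    {"url": "p2", "rank": 0.5, "text": "battle of gettysburg 1863"},
    {"url": "p3", "rank": 0.7, "text": "civil engineering"},
]
cache = {}
first = top_pages("american civil war", {"gettysburg": 0.6, "1863": 0.3}, pages, cache, n=2)
second = top_pages("american civil war", {"gettysburg": 0.6, "1863": 0.3}, pages, cache, n=2)
```

The second request is answered from the cache without rescoring any page, which is the point of keeping precalculated results per search pattern.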
5. Related Art (U.S. Patent Application Ser. No. 12/146,420/Issued U.S. Pat. No. 7,908,263)
A search engine optimizer (hereinafter referred to as Cholti), which works independently with a browser and search engine supercomputer to gather, analyze and distill input information interactively. The optimizer reorganizes the request as optimal input that is sent to the search engine, and then the output is sent to the end user. Each request is converted into a pattern and stored in an advanced Glyph format, permitting the optimizer to identify any left (linguistics) or right (geospatial) side of the brain checkmate combinations required to achieve certitude.
6. Related Art (U.S. Patent Application Ser. No. 12/764,934)
Lottery Mathematics: Cholti (1000) and XCommerce (2000) show how to improve the accuracy of a request by using primary independent variables (1001, 1002, 1003) (I, J or K) to map and create managerial hierarchical partitions of the Internet environment such as, from top to bottom, Superset(I), Set(I, J) and Subset(I, J, K) datasets.
Hot and Cold analysis: uses lottery mathematics to estimate the size of the environment and assigns (1001) primary independent variable (I) as the filter with the following formula: x!/((x − 6)! × 6!). E.g. the number of combinations for a 10 number draw is 10!/(4! × 6!), where 4! = 24, 6! = 720 and 10! = 3,628,800, so (3,628,800/24)/720 = 210 combinations. Thus each grid has a 1 in 210 chance of being the one that appears in the draw. The English language estimated Master Index size of the environment in the year 2010 is Lottery_165_Basis or 25,564,880,880 web pages.
E.g. the number of combinations for a 165 number draw = 165!/((165 − 6)! × 6!) or 25,564,880,880.
The quality of the Glyph that represents (I) or primary index determines the Mass. E.g. if the keyword Civil=(I) the Mass=1, and if the cluster “American Civil War”=(I) the Mass=2.
TABLE 1
Size of environment based on Mass
a. Mass = 0 (Lottery_165_Basis = 25,564,880,880) or 165!/((165 − 6)! × 6!)
b. Mass = 1 (Lottery_100_Basis = 1,192,052,400) or 100!/((100 − 6)! × 6!)
c. Mass = 2 (Lottery_70_Basis = 131,115,985) or 70!/((70 − 6)! × 6!)
d. Mass = 3 (Lottery_50_Basis = 15,890,700) or 50!/((50 − 6)! × 6!)
e. Mass = 4 (Lottery_40_Basis = 3,838,380) or 40!/((40 − 6)! × 6!)
f. Mass = 5 (Lottery_30_Basis = 593,775) or 30!/((30 − 6)! × 6!)
g. Mass = 6 (Lottery_20_Basis = 38,760) or 20!/((20 − 6)! × 6!)
h. Mass = 7 (Lottery_15_Basis = 5,005) or 15!/((15 − 6)! × 6!)
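The environment sizes named in Table 1 can be checked with a few lines of arithmetic. The sketch below assumes the basis sizes implied by the Lottery_N_Basis labels and a six-number draw; the `BASIS_BY_MASS` mapping and `environment_size` helper are illustrative names, not part of the related application.

```python
import math

# Lottery basis (count of numbers in the draw) for each Mass, read from the
# Lottery_N_Basis labels in Table 1.
BASIS_BY_MASS = {0: 165, 1: 100, 2: 70, 3: 50, 4: 40, 5: 30, 6: 20, 7: 15}


def environment_size(mass, picks=6):
    """Number of distinct `picks`-number grids that can be drawn from the basis."""
    return math.comb(BASIS_BY_MASS[mass], picks)


sizes = {mass: environment_size(mass) for mass in BASIS_BY_MASS}
```

`math.comb(n, k)` evaluates n!/((n − k)! × k!) exactly with integers, so no factorial has to be computed in full.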
Simulates the Human Brain: Each linguistic Glyph is assigned to the [L] left side of the brain and each geospatial Glyph is assigned to the [R] right side of the brain and the Anchor is the best common denominator Glyph.
The Dominant Tendency of each request is given a [L] linguistic and [R] geospatial tendency, and then Cholti reorganizes, maps and plots the Glyphs to create a Managerial Hierarchical Relationship Index.
Human Brain Intelligence: transforms each Search Pattern and identifies independent variables based on mass partitions of the Internet in order to create Join, Simple, Hybrid, Complex and Optimal Pyramids.
Human Brain Wisdom: analyzes the top (n) pages and source references using deductive reasoning to expand each [AX], [BX] and [CX] Glyph equation with key featured association dependent variables Q(x, y, z) filters.
Cholti (1000) picks one of four Search Strategies: [LL], [LR], [RL], and [RR], each of which has a different set of business rules to analyze the Internet, and limits the maximal size of any partition to 1 billion or (2^30) web pages, thus eliminating the exponential rate of growth of the environment, which is the principal confounding variable.
E.g. the environment can grow geometrically to 40 billion or 100 billion or 1 trillion web pages, but once the Dominant Tendency picks the relevant environment that maps 1 billion web pages, the lion's share is irrelevant.
[L+R] Managerial Relationship Events: if the independent variable (I) is represented by the Historical Event “American Civil War” {1863}, where “American Civil War” is the left side of the brain variable (I) and 1863 is the right side of the brain (X), they are merged into a Single Event or Superset(I)! with Mass=3; a Double Event or Set(I, J)!! has Mass=5; and finally a Triple Event or Subset(I, J, K)!!! has Mass=7, consisting of [L] left side of the brain (I, J, K) and [R] right side of the brain (X, Y, Z) independent variables.
First Significant Event or (FSE): is a vague search that maps an improved environment. The Internet environment (a, b, c, d, e, f) becomes the improved environment (FSE, b, c, d, e, f) for Superset(I) dataset.
TABLE 2
FSE Size of environment based on Mass
a. Mass = 1 (Lottery_100_Lucky_1 or 75,287,520) or 100!/((100 − 5)! × 5!)
b. Mass = 2 (Lottery_70_Lucky_1 or 12,103,014) or 70!/((70 − 5)! × 5!)
c. Mass = 3 (Lottery_50_Lucky_1 or 2,118,760) or 50!/((50 − 5)! × 5!)
Second Significant Event or (SSE) is a concise search that maps an optimal environment. The Internet environment (a, b, c, d, e, f) becomes the optimal environment (FSE, SSE, c, d, e, f) for Set(I, J) dataset.
TABLE 3
SSE Size of environment based on Mass
a. Mass = 1 (Lottery_100_Lucky_2 or 3,921,225) or 100!/((100 − 4)! × 4!)
b. Mass = 2 (Lottery_70_Lucky_2 or 916,895) or 70!/((70 − 4)! × 4!)
c. Mass = 3 (Lottery_50_Lucky_2 or 230,300) or 50!/((50 − 4)! × 4!)
d. Mass = 4 (Lottery_40_Lucky_2 or 91,390) or 40!/((40 − 4)! × 4!)
e. Mass = 5 (Lottery_30_Lucky_2 or 27,405) or 30!/((30 − 4)! × 4!)
Third Significant Event or (TSE) is a precise search that maps an optimal solution. The Internet environment (a, b, c, d, e, f) becomes the optimal solution (FSE, SSE, TSE, d, e, f) for Subset(I, J, K) dataset.
TABLE 4
TSE Size of environment based on Mass
a. Mass = 1 (Lottery_100_Lucky_3 or 161,700) or 100!/((100 − 3)! × 3!)
b. Mass = 2 (Lottery_70_Lucky_3 or 54,740) or 70!/((70 − 3)! × 3!)
c. Mass = 3 (Lottery_50_Lucky_3 or 19,600) or 50!/((50 − 3)! × 3!)
d. Mass = 4 (Lottery_40_Lucky_3 or 9,880) or 40!/((40 − 3)! × 3!)
e. Mass = 5 (Lottery_30_Lucky_3 or 4,060) or 30!/((30 − 3)! × 3!)
f. Mass = 6 (Lottery_20_Lucky_3 or 1,140) or 20!/((20 − 3)! × 3!)
g. Mass = 7 (Lottery_15_Lucky_3 or 455) or 15!/((15 − 3)! × 3!)
Fourth Significant Event or (QSE) is an optimal search that maps the optimal answer. The Internet environment (a, b, c, d, e, f) becomes the optimal answer (FSE, SSE, TSE, QSE, e, f) and is a [LR] checkmate combination.
TABLE 5
QSE Size of environment based on Mass
a. Mass = 1 (Lottery_100_Lucky_4 or 4,950) or 100!/((100 − 2)! × 2!)
b. Mass = 2 (Lottery_70_Lucky_4 or 2,415) or 70!/((70 − 2)! × 2!)
c. Mass = 3 (Lottery_50_Lucky_4 or 1,225) or 50!/((50 − 2)! × 2!)
d. Mass = 4 (Lottery_40_Lucky_4 or 780) or 40!/((40 − 2)! × 2!)
e. Mass = 5 (Lottery_30_Lucky_4 or 435) or 30!/((30 − 2)! × 2!)
f. Mass = 6 (Lottery_20_Lucky_4 or 190) or 20!/((20 − 2)! × 2!)
g. Mass = 7 (Lottery_15_Lucky_4 or 45) or 10!/((10 − 2)! × 2!)
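Read together, the FSE, SSE, TSE and QSE tables keep the lottery basis fixed and reduce the number of remaining picks by one for each significant event. The sketch below follows that tabulated convention for the Mass = 1 rows (basis of 100); the `size_after_events` helper and the `EVENTS` labels are illustrative names only.

```python
import math

EVENTS = ["none", "FSE", "SSE", "TSE", "QSE"]


def size_after_events(basis, known_events, picks=6):
    """Environment size as tabulated: the basis is unchanged and each
    significant event removes one of the `picks` numbers still to be chosen."""
    return math.comb(basis, picks - known_events)


# Mass = 1 rows (basis of 100), from the full environment down to QSE.
shrinkage = {EVENTS[k]: size_after_events(100, k) for k in range(len(EVENTS))}
```

Under this convention each additional significant event shrinks the environment by a factor of roughly 16 to 33 for a basis of 100, which is why four events take more than a billion pages down to a few thousand.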
Gamma Functions: Cholti (1000) and XCommerce (2000) teach how to create search patterns that improve the accuracy of a request using gamma functions to help create optimal size partitions of the (500) Internet.
E.g. the end user types 1863 American Civil War, for which the [L] left side of the brain English language cluster “American Civil War” is automatically mapped with the [R] right side of the brain geospatial keyword 1863 to create “American Civil War” {1863}. The “War between the States” is also synonymous with the American Civil War. Thus “between the”, which are dependent variables since the keywords have a Mass less than 1, are used for the Dominant Tendency, and the keyword “States”, which has a Mass of 1+, is Likely. Let's assume the keywords {1861, 1862, 1864 and 1865} are Unlikely. The Likely and Unlikely Gamma function values are as follows, where a non-integer factorial x! is evaluated with the gamma function Γ(x + 1): “American Civil War” {1863} = 50!/((50 − 5)! × 5!) or 2,118,760 pages. Plus “States” Likely Analysis: 49.9!/((49.9 − 5)! × 5!) or 2,096,762 pages. Plus Unlikely Analysis: 49.86!/((49.86 − 5)! × 5!) or 2,088,014 pages.
Search Pattern Variables: the Lucky Numbers are the (1001, 1002, 1003, 1004) independent variables or control variables that create the Pyramid objects that map the size of the environment. The Likely and Unlikely Numbers are the observable variables or dependent variables, and are considered strong filters. The Regular Numbers are the measured variables or dependent variables, and are considered weak filters that are used to create the actual content score.
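The three page counts in this example can be reproduced by extending the lottery formula to a fractional basis with the gamma function. The sketch below is a minimal check: the `gamma_choose` helper is a hypothetical name, and the bases 50, 49.9 and 49.86 are taken directly from the example rather than derived from any adjustment rule.

```python
import math


def gamma_choose(x, k):
    """Binomial coefficient extended to a real-valued basis x:
    Gamma(x + 1) / (Gamma(x - k + 1) * Gamma(k + 1))."""
    return math.gamma(x + 1) / (math.gamma(x - k + 1) * math.gamma(k + 1))


exact = gamma_choose(50, 5)        # "American Civil War" {1863}
likely = gamma_choose(49.9, 5)     # after the Likely keyword "States"
unlikely = gamma_choose(49.86, 5)  # after the Unlikely keywords
```

For an integer basis the result equals the ordinary lottery count; for a fractional basis the result is not a whole number, and truncating it to whole pages gives the figures quoted in the example.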
TABLE 6
Adjustment of the Lottery Basis
a. Independent/Control Variables (Lucky Numbers): +1.00
b. Dependent/Observable Variables (Likely Numbers): +0.100
c. Dependent/Complement Variables (Regular Numbers): +0.010
d. Dependent/~Observable Variables (Unlikely Numbers): +0.001
Partial Differential Equations: When using Partial Differential Equations usually the solution is not unique due to the fluid and dynamic conditions of the search process, and ergo the End User's keyword combination usage behavior directly affects the size of the environment (or boundary of the region) where the solution is defined.
The limitations, drawbacks and/or disadvantages of technologies are as follows: Search Engines are based on Boolean algebra eigenvector algorithms that are used to parse and filter information indices until the top page ranks are determined and displayed to the end user. Unfortunately, some specific keyword combinations may be too narrow and confound a search by hiding optimal results. Search Engines are predominantly configured to perform one request to one reply search patterns. Each search is processed from the ground up without taking into account many requests belonging to one reply. A session consists of consecutive related and unrelated search requests to reach the final destination.
The Internet environment or (U) can be construed as a complex and massive volume network with billions of web pages. The search engine supercomputer analyzes the billions of unique web pages, and then uses eigenvectors to determine the highest ranked pages from the end user's match criteria. As explained in related subject matters, “As the size of the environment increases the level of redundancy and tax burden of a system exponentially increases”.
Transform Data: The supercomputer system cleans, standardizes and organizes the spaghetti of the environment by gathering, analyzing, distilling, managing, organizing and distributing the huge amount of information in a massive parallel distributed managerial hierarchical structured supercomputer that removes redundancy, latency and the organizational tax burden.
Synchronize tasks: Cholti (1000) and XCommerce (2000) are decentralized parallel clustered supercomputers consisting of a plurality of nodes, which are specifically arranged in three tiers. The summit tier coordinates and executes global tasks. The middle tier coordinates and executes regional tasks. The lower tier coordinates and executes localized tasks and processes the lion's share of non critical transactions. The summit node of each tier synchronizes tasks by sending command messages that assign the fuzzy logic state of each node belonging to its chain of command.
Lateral and Vertical Synergy: A tier consists of groups of nodes that are independent from other groups of nodes. Each tier partition performs mission critical tasks within its domain and works in parallel with other partitions of the same tier. Each node can shunt available resources using lateral and vertical synergy with parent, sibling or subordinate nodes to maximize available resources, and continuously analyzes the current conditions of its own environment and forward chains summary information until reaching the summit. Then summit nodes rearward chain command messages with instructions to regulate priorities and resource availability, and notify each subordinate of coordinated and synchronized task constraints, taking into account present network conditions to avoid saturation and clogs and to eliminate the tax burden of the environment.
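The forward chaining of summaries and the lateral shunting of excess load between siblings can be sketched with a small tree of nodes. This is an illustration only: the `Node` class, its `capacity` and `load` fields, and the rebalancing rule are hypothetical simplifications, not the method of the related applications.

```python
class Node:
    """A node in a tiered hierarchy (hypothetical structure, for illustration)."""

    def __init__(self, name, capacity, load=0, children=None):
        self.name = name
        self.capacity = capacity
        self.load = load
        self.children = children or []

    def spare(self):
        """Unused buffer resources available to siblings."""
        return max(self.capacity - self.load, 0)

    def summarize(self):
        """Forward chain: roll up load and capacity from subordinates to this node."""
        load, capacity = self.load, self.capacity
        for child in self.children:
            child_load, child_capacity = child.summarize()
            load += child_load
            capacity += child_capacity
        return load, capacity

    def rebalance(self):
        """Rearward chain: shunt each child's overload to siblings with spare buffer."""
        for child in self.children:
            excess = child.load - child.capacity
            for sibling in self.children:
                if excess <= 0:
                    break
                if sibling is child:
                    continue
                moved = min(excess, sibling.spare())
                sibling.load += moved
                child.load -= moved
                excess -= moved
        for child in self.children:
            child.rebalance()


local_a = Node("local-a", capacity=10, load=14)   # overloaded by 4
local_b = Node("local-b", capacity=10, load=3)    # 7 units of spare buffer
regional = Node("regional", capacity=5, children=[local_a, local_b])
summit = Node("summit", capacity=5, children=[regional])

before = summit.summarize()
summit.rebalance()
after = summit.summarize()
```

The total load seen at the summit is unchanged by rebalancing; only its distribution among siblings changes, so no single node stays saturated while a sibling has idle buffer.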
Remove chaos and anarchy: the XCommerce (2000) creates Vector CDR to eliminate the ‘spaghetti of the environment’ and then build command messages or Summary Information data. Command messages coordinate and synchronize each node to operate at maximal throughput capacity hence each node operates without adversely affecting the network flow of data and limits the exponential rate of growth of complexity as the size of the environment increases.
Convert Requests into Ideas: Search engines depend on Boolean algebra and use inductive reasoning popularity scores to find the top results. In contrast, the Optimizer (1000) uses deductive reasoning to interpret keyword combinations as being part of an idea being formulated by both the left and the right sides of the brain, and probabilistically supplies and inserts missing information. Related art teaches that a Vector CDR can be expressed as the summation of a plurality of valid vectors. The Optimizer then matches/merges a plurality of partial vectors and correlates them to create a resultant vector containing a collection of top (n) pages possessing informational certitude.
In a nutshell, Boolean algebra mimics inductive reasoning, Watson-like criminal investigation methods for finding the best results, whereas the Optimizer (1000) solves for the optimal answer using a Sherlock Holmes deductive reasoning approach to decipher the actual content of each web page and find the best match.