An increasing number of entities, such as businesses, associations and/or private individuals, wish to be present on the Internet in order to give users of this network, referred to hereinafter as netsurfers, access to the information that these entities offer.
For this purpose it is necessary to have an Internet site, that is, a database, generally in the form of pages coded in HyperText Markup Language (HTML), which is stored on a server connected to the Internet in such a way as to permit the transmission of these data to a netsurfer requesting them.
To permit such a request to be made, each Internet site is identified by an address of its own, called the Uniform Resource Locator (URL).
Thus, a netsurfer can connect to the server hosting a particular site by giving his browser the URL of the site he wishes to consult.
However, the address of the site that a netsurfer might wish to consult is frequently unknown to him, especially when the consultation is made for the first time with a view to finding one or more sites that can answer a request for information.
In this case it is possible to search for the URL by means of search tools, called search servers or search engines, present on the Internet.
For example, companies such as Google, Altavista, Yahoo, Lycos, MSN, Inktomi, Fast or Voila operate such search tools.
These search tools are Internet sites that make it possible to launch a URL search from one or more key words. From this key word or these key words, a search server offers a list, called the Web result, of sites and their URLs, so that the netsurfer can access one or more sites that appear of interest to him, i.e., whose content is pertinent to his search.
Given the extremely large number of sites present on the Internet, the search servers play an important role in promoting a site to netsurfers.
For example, when a new site goes online on the Internet, it is important to reference it, i.e., to ensure that this site is indexed in the databases of these search servers, so that netsurfers who have made a search, or request, by key words can be directed to this new site by selecting the URL of this site from the Web result.
In fact, the URLs in the Web result are coded in the form of hypertext links, which permit direct access to the corresponding site by means of a mouse connected to a computer.
Furthermore, it is advantageous to the party responsible for a site for this link to appear among the first responses in the list proposed by a search server in reply to a request whose key words are pertinent to the content of the site.
In fact, it is acknowledged that a netsurfer rarely consults the responses put forward by a search server beyond the 30th or even the 20th response offered.
Either this netsurfer finds a site matching his expectations among the first responses proposed, or he reformulates the key words he chose in order to resume the search.
This is why the manager of an Internet site, called hereinafter a webmaster, must check and refine the referencing of this site in order to promote the frequentation of this site.
For this purpose, the webmaster faces the problem of determining an Internet strategy which, according to the content of the site and the expectations of the netsurfers, makes it possible to set up a list of key words associated with the site. The content of the site must correspond to these key words, and the netsurfers' needs must be identifiable by them, so that the search tool will guide the netsurfer toward this site via these key words.
Furthermore, the webmaster must make any modifications to the structure of the site that can enable it to be identified optimally by the search servers because, as will be described later, a search server may be unable to reference a site because of problems inherent in the site's structure.
Moreover, since a search tool can index a plurality of sites of similar or competing content simultaneously, the webmaster faces the problem of defining a website structure enabling the site to appear among the first responses of a list of Web results provided by a search tool.
For these optimization operations to be taken into account by the search tools, it is indispensable that the webmaster submit the new site to the search tools so that the tools will list the site in their databases and provide it among the first responses to any pertinent inquiry.
Also, it is known that, to measure the referencing quality of a site, it is necessary to observe the visibility of this site, i.e., its accessibility by means of the search servers.
The visibility of a site is measured by observing, for a given key word, the rank of appearance, or classification, of this site in the list of results presented by a search tool.
Such an observation corresponds to measuring a parameter referred to hereinafter as the Index of the Rate of Penetration on the Internet (ITPN).
To perform this observation, it is known to use specific software which, from the key words supplied, a list of selected search tools and a particular URL, performs the operations necessary to obtain the classification of the site in the list of proposed results.
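By way of illustration only, the observation performed by such software can be sketched as follows. The function name `rank_of_site`, the result lists and the example URLs are hypothetical, and matching a site by simple substring is a simplifying assumption; actual software would obtain the result lists by querying each search tool.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def rank_of_site(site: str, result_urls: Sequence[str]) -> Optional[int]:
    """Return the 1-based rank of the first result URL containing `site`, or None."""
    for position, url in enumerate(result_urls, start=1):
        if site in url:  # crude substring match, assumed for the sketch
            return position
    return None


# Hypothetical result lists, keyed by (search tool, key word).
RESULTS: Dict[Tuple[str, str], List[str]] = {
    ("OR1", "make-up"): [
        "http://www.example.org/beauty",
        "http://www.cosmetic.com/",
    ],
    ("OR2", "make-up"): [
        "http://www.example.net/shop",
    ],
}

# One rank (or None when the site is absent) per search tool and key word.
ranks = {key: rank_of_site("cosmetic.com", urls) for key, urls in RESULTS.items()}
print(ranks)
```

A rank of `None` here stands for a site that does not appear in the list returned for that key word.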
This Index of the Rate of Penetration on the Internet (ITPN) measures, for each key word used by the site, the exact position which the site in question occupies in the Web results supplied by the various search tools.
In other words, these results indicate, for each search tool and each key word, a rank of appearance of the site observed in the responses proposed by the search tools to the request corresponding to one or more of the key words supplied.
Thus, each result appears in the form of a triplet of data:
(search tool)/(key word)/(rank of appearance of the site in the tools).
For example, a search concerning a site “cosmetic.com” can be performed in relation to 4 search tools named OR1, OR2, OR3 and OR4 and considering the following key words: cosmetique, maquillage, make-up.
The results of the search, provided in a data processing format specific to each software used, such as the CSV format, then take a rough or simple shape such as:

Search tool   Classing for key word
              cosmetic   maquillage   make-up
OR1           2          4            9
OR2           1          5            6
OR3           1          3            10
OR4           17         13           18

By analyzing this table it can be seen that, for example, the URL of the site “cosmetic.com” appears in a worse position with the search tool OR4 (17th, 13th and 18th position for the words ‘cosmetic’, ‘maquillage’ and ‘make-up’, respectively) than with another search tool, such as search tool OR1, where, for these same search words, the URL of the site appears in 2nd, 4th and 9th place.
In like manner, such a table makes it possible to see that a key word, such as ‘cosmetic’ or ‘maquillage’, is better referenced than another key word, such as ‘make-up’ in this example.
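The comparisons drawn from this example can be reproduced by a short sketch that groups the (search tool)/(key word)/(rank) triplets and averages the ranks. The rank values are those read from the example; the helper `mean_rank_by` and the choice of an arithmetic mean as a summary are assumptions made for illustration, not part of the prior-art software.

```python
from collections import defaultdict
from statistics import mean

# (search tool, key word, rank) triplets read from the example.
TRIPLETS = [
    ("OR1", "cosmetic", 2), ("OR1", "maquillage", 4), ("OR1", "make-up", 9),
    ("OR2", "cosmetic", 1), ("OR2", "maquillage", 5), ("OR2", "make-up", 6),
    ("OR3", "cosmetic", 1), ("OR3", "maquillage", 3), ("OR3", "make-up", 10),
    ("OR4", "cosmetic", 17), ("OR4", "maquillage", 13), ("OR4", "make-up", 18),
]


def mean_rank_by(triplets, field):
    """Average the ranks grouped by field 0 (search tool) or field 1 (key word)."""
    groups = defaultdict(list)
    for triplet in triplets:
        groups[triplet[field]].append(triplet[2])
    return {key: mean(ranks) for key, ranks in groups.items()}


by_tool = mean_rank_by(TRIPLETS, 0)
by_keyword = mean_rank_by(TRIPLETS, 1)

# A higher mean rank means a worse position in the Web results.
print("worst search tool:", max(by_tool, key=by_tool.get))
print("worst key word:", max(by_keyword, key=by_keyword.get))
```

Under these assumptions the sketch singles out OR4 as the least favorable search tool and ‘make-up’ as the least well referenced key word, in line with the reading of the example given above.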
However, these raw results present the problem of being delivered without any form of presentation that permits quick, direct comprehension through a synthesis of the site's rankings.
As a result, the consultant of a referencing business conducting a study on behalf of a client is forced to spend time sorting and organizing the information obtained by the specific software, which limits his availability for analyzing the performance of these sites, for example with regard to the key words most used by the netsurfers.
In fact, the foregoing example concerns three key words and four search tools, whereas a consultant must handle tens, even hundreds, of key words in connection with tens of search tools.
Furthermore, it should be emphasized that the consultant is also obliged to convert the results provided by a given software program in a specific format, such as CSV (Comma-Separated Values, that is to say, values separated by commas), into a more widely readable format so as to be able to share the results obtained by means of this software with his clients. This conversion presents the problem of further reducing the time that a consultant can devote to the analysis of the results.
The measure of the quality of the referencing of a site also requires observing the frequentation of this site, i.e., the number of netsurfers accessing this site.
Now, the measurement of frequentation presents the following problem: when a netsurfer has accessed a site via a search server in which this site is referenced, it is common for the netsurfer, if the site is important to him, to store the URL of the site in the memory of his computer, generally in the form of a “favorite,” thus avoiding another search every time he needs to connect to this site.
In this case, that is to say, when this netsurfer connects to the site in question via his favorite, it is not possible to determine that this connection originated from the search tool, which proves troublesome in tracking the frequentation of a site.
In this respect, it should be noted that the frequentation of a site is, according to the prior art, measured by means of specific software, called a “TAG” hereinafter.
A TAG inserted in an Internet page is thus a small script, or program, which is executed each time the page is read by a netsurfer.
It is then possible to use a counter which is incremented at each run of the script, that is, at each download of the page by a netsurfer.
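The counting principle can be sketched as follows. In practice a TAG is a script embedded in the page itself; the sketch below only models, in Python, the counter that such a script would increment, and the names `tag` and `page_hits` as well as the page paths are hypothetical.

```python
from collections import Counter

# One counter per page; each run of the TAG adds one to the page's count.
page_hits = Counter()


def tag(page_url: str) -> int:
    """Stand-in for the TAG script: record one more load of `page_url`."""
    page_hits[page_url] += 1
    return page_hits[page_url]


# Three page loads by netsurfers, two of them for the same page.
for url in ["/index.html", "/products.html", "/index.html"]:
    tag(url)

print(page_hits["/index.html"], page_hits["/products.html"])
```

Such a counter records only that a page was loaded; by itself it says nothing about how the netsurfer reached the page.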
To carry out a frequentation measurement there are service providers, such as Audientia, XITI, Estat or VocalCom SA, devoted to this operation, each presenting its results in a form specific to that provider.
However, as with the providers of visibility measurements, the operations connected with the measurement of frequentation require a great amount of human involvement in order to obtain from raw data the refined data highlighting a phenomenon, such as an increase in the frequentation of a site, which the consultant wants to place before his client.
If one considers that such a presentation of processed data is generally accompanied by the consultant's commentaries, it appears that this human intervention is time-consuming, because thousands, even millions, of pages may have to be considered, as well as many visitors and an extremely large number of key words and sorting words.
In fact, the volumes of data to be dealt with by a consultant for an analysis of visibility and frequentation are so great that they create many problems. The number of clients that a consultant can handle simultaneously is small, which limits the productivity of a consultant performing these operations.
Furthermore, this amount of information to be processed limits the time that such a consultant can allow for his analysis, and therefore the quality of his advice for placing or maintaining the site in a good position of visibility and frequentation.
Also, the storage of raw information is expensive, since it requires a large memory capacity, so that this information is generally not stored.
As a corollary, the lack of stored data relating to a site over relatively long periods of time, that is to say several months, prevents analysis of the development of a site in terms of frequentation and/or visibility, for example.
The large volume of raw data to be processed also causes access to these data to be generally restricted to the consultant of a firm who, having a certain experience in the processing of these data, is able to identify the important data.
To sum up, it appears problematic that, according to the prior art, the operations proper to referencing are analyzed manually from raw data provided by software programs, as described previously in connection with the visibility of a site.
As a result, several days of work are required to track the referencing of each site. Considering, for example, the analysis of the visibility of a site, it is necessary for the consultant of a referencing service to work on the presentation of this information so as to facilitate its comprehension and/or its interpretation, for example by means of data displays.
Furthermore, it should be noted that the consultant is generally asked to provide commentaries on these displays so as to help the client to understand the phenomenon observed, which again increases his workload.
Also, this operation raises the problem that, to produce these displays, a consultant is constrained to use displays whose parameters are limited and predetermined so as to limit the time required to produce them.
Lastly, it should be noted that a consultant generally presents analysis reports in a similar form for different clients, that is, there exists no personalization of the results sent to a client, because this would mean an extra amount of work performed to the detriment of the thorough analysis of the client's Website.
Furthermore, for a site to be referenced in a search engine, it is necessary to wait until this site has been crawled by a program of the search tool called a “spider,” which runs through the site, reading its content and indexing its pages in the database of the search tool, these pages being associated with key words.
Now, the spiders of the principal search engines run through the pages of a site only every twenty-eight days, on average.
On this account, referencing by means of a spider presents the drawback that it can take around twenty-eight days, without its being really possible to shorten the time during which the site is not referenced.
These spiders also present the problem that they are able to read the content and index the pages only of a static Internet site, that is to say, one which is “frozen,” whereas they cannot reference all of the pages of a dynamic Internet site whose content varies.
For example, if one considers a site permitting access to the content of a dynamic database, i.e., one whose data may vary, this site then appears only as a single page whose fields are fed with the content of the database.
Now, this site potentially presents as many pages as entries in the database, while the spider can read only the single physical page, which presents the problem that its analysis will not take into account the real content of the site, i.e., the content of the database.
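The gap between the single physical page and the pages actually presented to netsurfers can be illustrated by a minimal sketch. The database content, the template and the function `render` are hypothetical and serve only to show the principle.

```python
# Hypothetical product database behind a dynamic site.
DATABASE = {
    1: "Lipstick",
    2: "Mascara",
    3: "Foundation",
}

# The single physical page: a template whose field is fed from the database.
TEMPLATE = "<html><body><h1>{name}</h1></body></html>"


def render(entry_id: int) -> str:
    """Build the page presented to a netsurfer for one database entry."""
    return TEMPLATE.format(name=DATABASE[entry_id])


physical_pages = [TEMPLATE]                                   # what is stored on the site
virtual_pages = [render(entry_id) for entry_id in DATABASE]   # what netsurfers can see

print(len(physical_pages), "physical page,", len(virtual_pages), "potential pages")
```

In this sketch none of the database entries appears in the stored template, so a reading limited to the physical page misses the real content of the site.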
Furthermore, it is known to put key words, referred to hereinafter as “metas,” into the code of the Internet pages in order to characterize these pages.
However, the number of key words that can be associated with an Internet page is limited, generally to twenty key words, which does not make it possible to precisely characterize the content of each page, particularly in the case of a dynamic site connected to a database of several hundreds, even thousands, of entries.
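As an illustration, the key words carried by a page can be read and counted with the sketch below. It assumes that the metas are written as a comma-separated `content` attribute of a `<meta name="keywords">` element, which is a common convention but is not specified in the present text; the class name `MetaKeywordParser` and the sample page are hypothetical, and the limit of twenty is the figure quoted above.

```python
from html.parser import HTMLParser

MAX_KEYWORDS = 20  # limit quoted in the text ("generally ... twenty key words")


class MetaKeywordParser(HTMLParser):
    """Collect the key words listed in <meta name="keywords" content="...">."""

    def __init__(self):
        super().__init__()
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            # Split the comma-separated list and drop empty items.
            self.keywords += [w.strip() for w in content.split(",") if w.strip()]


page = (
    '<html><head>'
    '<meta name="keywords" content="cosmetic, maquillage, make-up">'
    '</head><body></body></html>'
)
parser = MetaKeywordParser()
parser.feed(page)

print(parser.keywords)
print("within limit:", len(parser.keywords) <= MAX_KEYWORDS)
```

A page fed from a database of several hundred entries would exceed such a limit long before each entry could receive a key word of its own.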