In this document, the concept of an “electronic” document should be understood broadly to cover the representation of a document in the form of a digital data structure that can be stored in the memory of a computer and that can be sent from one computer to another.
In the field of indexing electronic documents, it is known to consider each electronic document as an ordered set of objects (words, fields, etc.) and to prepare an inverted index of those objects, as explained below with reference to FIG. 1.
That figure shows a document base BDOC having three electronic documents DOC1, DOC2, and DOC3 together with an inverted index IDX1 of the document base.
In this example, each document may be considered as an ordered set of words; for example, the document DOC1 is the ordered set of words “the”, “car”, “is”, and “red”.
In the example of FIG. 1, the document base BDOC thus has six distinct words OBJ1 to OBJ6, i.e. the words “the”, “car”, “is”, “red”, “blue”, and “truck”.
For each word (e.g. OBJ1, “the”) the inverted index IDX1 puts into association an information block (BLOC1) comprising:
    the documents (DOC1, DOC2) in the document base (BDOC) that include said word; and
    the position of the word in said document.
For example, the word “the” occupies the first position in each of documents DOC1 and DOC2.
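By way of illustration only, a word-level inverted index of the kind shown in FIG. 1 might be sketched as follows. The function name `build_inverted_index` and the contents of DOC2 and DOC3 are assumptions made for this sketch; only the content of DOC1 and the block associated with the word “the” are given above.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each word to {document id: [1-based positions of the word]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, word in enumerate(text.split(), start=1):
            index[word].setdefault(doc_id, []).append(position)
    return dict(index)


documents = {
    "DOC1": "the car is red",     # content stated in the description
    "DOC2": "the truck is blue",  # hypothetical content
    "DOC3": "red truck",          # hypothetical content
}

index = build_inverted_index(documents)
block_the = index["the"]  # plays the role of BLOC1 for OBJ1 ("the")
```

With these assumed contents, the block obtained for “the” lists DOC1 and DOC2, each with position 1, which matches the example above.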
It is also known to consider a document as an ordered set of fields, where a field is an identifiable portion of the document comprising at least a name and a content.
For example, in the example of FIG. 2, each of the documents DOC1 to DOC3 in the document base BDOC comprises two ordered fields having the names “Header” and “Title”, with the content CONT1 of the field “Title” of document DOC1 being “the car is red”.
For each field (e.g. OBJ12, “Title”), the inverted index IDX2 puts into association a block of information (BLOC12) comprising the documents (DOC1, DOC2, DOC3) of the document base that include this field, and the position of the field in the document (first title “T1” of first header “E1”).
In the example of FIG. 2, the block BLOC12 also includes the content (CONT1, CONT2, CONT3) of the field in each of the documents, with an empty content being represented by the symbol “Ø”.
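A field-level inverted index of the kind shown in FIG. 2 might be sketched in the same way. This is a simplified sketch under stated assumptions: the position of a field is reduced to its rank in the document, an empty content (“Ø”) is represented by `None`, and all contents other than the “Title” of DOC1 are hypothetical.

```python
def build_field_index(documents):
    """Map each field name to {document id: [{"position", "content"}, ...]}."""
    index = {}
    for doc_id, fields in documents.items():
        for position, (name, content) in enumerate(fields, start=1):
            entry = {"position": position, "content": content}
            index.setdefault(name, {}).setdefault(doc_id, []).append(entry)
    return index


# Each document is an ordered list of (field name, field content) pairs.
documents = {
    "DOC1": [("Header", "h1"), ("Title", "the car is red")],     # "h1" is hypothetical
    "DOC2": [("Header", "h2"), ("Title", "the truck is blue")],  # hypothetical
    "DOC3": [("Header", "h3"), ("Title", None)],                 # hypothetical, empty title
}

index = build_field_index(documents)
block_title = index["Title"]  # plays the role of BLOC12 for OBJ12 ("Title")
```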
Generalizing from FIGS. 1 and 2, each document of the document base BDOC may be considered as an ordered set of objects OBJi (words, fields), with the inverted index IDX1, IDX2 associating each of the objects OBJi with a block of information BLOCi that includes at least the documents that include said object and the position of the object in said documents.
Sometimes, an inverted index need not contain the positions of the objects in the documents.
The pair constituted by an object OBJi and its associated block of information BLOCi is commonly referred to as an “object descriptor” and is written DESCi.
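One possible in-memory representation of such a descriptor is sketched below. The class and attribute names (`InfoBlock`, `ObjectDescriptor`, `postings`) are hypothetical and are not taken from the description; the position lists may be left empty when the index does not store positions.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InfoBlock:
    # Document id -> positions of the object in that document.
    postings: Dict[str, List[int]] = field(default_factory=dict)


@dataclass
class ObjectDescriptor:
    obj: str          # the object OBJi (a word or a field name)
    block: InfoBlock  # its associated block of information BLOCi


# DESC1 for the word "the" of FIG. 1: present in DOC1 and DOC2, first position.
desc1 = ObjectDescriptor("the", InfoBlock({"DOC1": [1], "DOC2": [1]}))
```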
Inverted indexes as described above are very practical since they make it easier to identify the documents in a document base that include a particular object (field or word), and when the inverted index also stores position, it enables said object to be located in the document.
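The two uses mentioned above, identifying the documents that include an object and locating the object within a document, might be sketched as follows. The helper names are hypothetical, and the index shown is a partial one written by hand: the entry for “the” follows BLOC1, while the entries for “car” and “red” cover DOC1 only.

```python
def documents_containing(index, obj):
    """Return the set of document ids whose block lists the object."""
    return set(index.get(obj, {}))


def positions_of(index, obj, doc_id):
    """Return the positions of the object in one document (empty if absent)."""
    return index.get(obj, {}).get(doc_id, [])


# Partial, hand-written index: object -> {document id: [positions]}.
index = {
    "the": {"DOC1": [1], "DOC2": [1]},
    "car": {"DOC1": [2]},
    "red": {"DOC1": [4]},
}

docs_with_the = documents_containing(index, "the")
red_in_doc1 = positions_of(index, "red", "DOC1")
```

Neither lookup scans the documents themselves; both read only the block associated with the object.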
Unfortunately, with the methods known in the state of the art, creating and updating inverted indexes for document bases that include a very large number of documents, e.g. several million documents, are operations that are excessively expensive in terms of computation time.