A typical application of object state encoding is the static analysis of variables in compilers. In this step, the compiler validates the code flow by detecting impossible variable states or transitions in each branch of the code. Another application is inventory management of physical objects, such as containers fitted with RFID tags able to record each container's id and states.
Object states are well represented by a collection of Boolean properties: a given local variable, for instance, can be deemed null or not, initialized or not, etc., at a given point in time. Similarly, a container may have different states depending on whether it is in a warehouse, on board a truck, on board a train, on board a ship, etc. During the life of objects, events drive predictable changes in object states, called state transitions, that programs must track. Transitions on those states are coded using Boolean arithmetic with the usual operators (and, or, not) to implement truth tables. For a given operation and the entry states of one or more objects, a truth table typically gives the expected states of the objects after the transition has occurred. Coming back to the example of a variable in a program, when a local variable that is not initialized is assigned a value, it becomes initialized: in other words, its ‘initialized’ Boolean property changes from false to true. Similarly, in the container example, when a container is unloaded from a truck into a warehouse, its ‘on board truck’ property becomes false and its ‘in warehouse’ property becomes true. The use of Boolean variables is efficient for the storage of states and the computation of state transitions because the underlying technology (memory, buses and processors) uses bit sets, usually grouped into words of 32, 64 or 128 bits, at the very heart of computing and storage systems. Moreover, it implements efficient Boolean operations on those sets.
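As a minimal sketch of a truth table coded with Boolean operators, the two transitions described above could be written in Python as follows. The function names `assign` and `unload_from_truck` and the `event` flag are illustrative assumptions, not part of any existing system.

```python
def assign(initialized: bool, event: bool) -> bool:
    """Value of 'initialized' after an optional assignment (event=True)."""
    return initialized or event


def unload_from_truck(on_board_truck: bool, in_warehouse: bool, event: bool):
    """Return (on_board_truck, in_warehouse) after an optional unloading event."""
    # Only a container that is actually on a truck can be unloaded from it.
    moved = on_board_truck and event
    return (on_board_truck and not moved,  # leaves the truck if moved
            in_warehouse or moved)         # enters the warehouse if moved


# Enumerating every entry state yields the truth table of the transition.
for truck in (False, True):
    for warehouse in (False, True):
        print(truck, warehouse, "->", unload_from_truck(truck, warehouse, True))
```

Each output property is a plain Boolean expression of the entry states, which is what makes the same transition applicable to whole machine words at once.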
When the number of objects whose states must be tracked and computed becomes very large, one encoding technique that can be efficient in both space and time is to use a set of large bitfields (each bitfield typically consisting of an array of words), each bitfield coding one of the Boolean properties, and each object being associated with a specific rank within the bitfields. By using the bitwise operators of the programming language to perform Boolean transformations one word at a time, state transformations can be carried out on sets of objects instead of requiring a per-object computation. For example, if we take 8-bit-long bitfields and associate the lowest-order bit with object 1, the following bitfield encodes a given Boolean property as being true for object 1, false for object 2, true for object 3, and false for the others: “0000 0101”
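The bitfield encoding described above can be sketched in Python as follows. This is an illustration under stated assumptions: 64-bit words are emulated with masked integers, ranks are zero-based (object 1 has rank 0), and the helper names (`make_bitfield`, `set_bit`, `get_bit`, `unload_all`) are hypothetical.

```python
WORD_BITS = 64                      # assumed word size
WORD_MASK = (1 << WORD_BITS) - 1    # keeps emulated words within 64 bits


def make_bitfield(n_objects):
    """One Boolean property for n_objects objects, stored as a list of words."""
    return [0] * ((n_objects + WORD_BITS - 1) // WORD_BITS)


def set_bit(field, rank):
    field[rank // WORD_BITS] |= 1 << (rank % WORD_BITS)


def get_bit(field, rank):
    return ((field[rank // WORD_BITS] >> (rank % WORD_BITS)) & 1) == 1


def unload_all(on_board_truck, in_warehouse, selected):
    """Unload every selected container, one word (64 objects) at a time."""
    for i in range(len(on_board_truck)):
        moved = on_board_truck[i] & selected[i]   # selected AND on a truck
        on_board_truck[i] &= ~moved & WORD_MASK   # property becomes false
        in_warehouse[i] |= moved                  # property becomes true


on_board_truck = make_bitfield(8)
in_warehouse = make_bitfield(8)
selected = make_bitfield(8)
set_bit(on_board_truck, 0)          # object 1
set_bit(on_board_truck, 2)          # object 3
initial = format(on_board_truck[0], "08b")   # the "0000 0101" pattern
set_bit(selected, 1)                # object 2: selected, but not on a truck
set_bit(selected, 2)                # object 3: selected and on a truck
unload_all(on_board_truck, in_warehouse, selected)
```

The loop in `unload_all` performs three bitwise operations per word regardless of how many of the 64 objects in that word change state, which is the source of the time saving over a per-object computation.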
There is a potentially significant issue with this approach, though: it is very weak at detecting impossible states, that is, states that the objects cannot reach. To take an obvious example, a variable cannot simultaneously be initialized and not initialized, and a container cannot be on board a train and on board a ship at the same time (whereas it could be on board a truck and on board a ship). If the programmer initially chooses to code the ‘initialized’ and ‘uninitialized’ properties as two separate Boolean variables, the fact that those properties are always linked by a negation may go unnoticed for a while. If it remains unnoticed, the encoding consumes one more property per object than needed, resulting in wasted space and superfluous calculations. This example is obvious, but less obvious examples abound as the number of state variables grows. For example, in the container case, ‘on board train’ and ‘on board ship’ are negatively correlated, whereas ‘on board truck’ and ‘on board ship’ are not.
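To make the redundancy concrete, the following Python sketch takes an explicit list of reachable states and reports pairs of properties that are always each other's negation, together with the minimum number of bits needed to distinguish the reachable states. The set of reachable states and the third property (‘null’) are assumptions chosen for illustration; in practice the reachable set would have to be derived from the transitions.

```python
from itertools import combinations

# Hypothetical property names and reachable states for a local variable.
NAMES = ("initialized", "uninitialized", "null")
REACHABLE = {
    (False, True, False),   # not yet assigned
    (True, False, False),   # assigned a non-null value
    (True, False, True),    # assigned null
}


def always_negated_pairs(states, names):
    """Pairs of properties whose values differ in every reachable state."""
    pairs = []
    for i, j in combinations(range(len(names)), 2):
        if all(state[i] != state[j] for state in states):
            pairs.append((names[i], names[j]))
    return pairs


def minimum_bits(states):
    """Smallest number of bits able to distinguish all reachable states."""
    return (len(states) - 1).bit_length()


print(always_negated_pairs(REACHABLE, NAMES))
print(len(NAMES), "properties declared,", minimum_bits(REACHABLE), "bits needed")
```

Under these assumptions only three of the eight combinations are reachable, so two bits per object would suffice where the naive encoding spends three.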
There is also a considerable body of prior art related to the optimization of multivariate Boolean transformations. For an example, see ‘Two-level logic optimization’, Coudert et al., 2001 (in KLUWER INTERNATIONAL SERIES IN ENGINEERING AND COMPUTER SCIENCE) ([Coudert 2001]). The main focus of this work is on providing ways to produce near-optimal Boolean logic for use in dedicated hardware circuitry. An elaboration of these techniques is very useful in any multivariate Boolean transformation. By themselves, however, they do not fulfill the needs exposed above, since they only tackle the reduction of the time needed to implement given truth tables. They do not address the reduction of the number of variables considered. When applied to the initial problem of optimizing object states and transitions, this prior art would deliver a gain in computation time but not in space.
Hence, there is a need for a solution that enables programmers to define states and transitions according to the desired semantics while still encoding them in a space-efficient way.
A first step in space optimization for encoding object states and transitions has been observed at least in Eclipse, an open source project hosted at http://www.eclipse.org, and more particularly in the class org.eclipse.jdt.internal.compiler.flow.UnconditionalFlowInfo in version 3.1 of the product. This implementation uses natural Boolean sets to encode the states of numerous objects in a relatively efficient manner by using bitfield encoding. This implementation adds new functions to the compiler and drives the number of Boolean properties up, raising concerns about a degradation of performance in time and space. An ad hoc approach enabled the development team to identify some of the unneeded combinations and to re-encode the states by coordinating some Boolean pairs (that is, for example, the meaning of the first bit depends on the value of the second bit).
This approach proved error-prone, gave no guarantee regarding the optimality of the resulting encoding, and is very inflexible (the addition of a new state variable breaks the encoding). As a consequence, in order to save space and keep complexity under control, developers cut back on functionality. In conclusion, bitfield encoding is part of the solution, but it is not sufficient for saving space when the number of objects is large.
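The idea of a coordinated Boolean pair, in which the meaning of the first bit depends on the value of the second, can be illustrated by the following Python sketch. The encoding shown is purely hypothetical and is not the one used by the Eclipse class cited above; the state names and the `decode` function are assumptions made for the example.

```python
def decode(first_bit: bool, second_bit: bool) -> dict:
    """Decode a hypothetical coordinated pair of bits into variable properties.

    second_bit clear: first_bit means 'initialized'; null status is unknown.
    second_bit set:   the variable is necessarily initialized and
                      first_bit means 'null'.
    """
    if not second_bit:
        return {"initialized": first_bit, "null": None}
    return {"initialized": True, "null": first_bit}


# Two bits cover four states that three independent Booleans
# ('initialized', 'known null', 'known non-null') would otherwise encode.
for second in (False, True):
    for first in (False, True):
        print(int(first), int(second), "->", decode(first, second))
```

The sketch also hints at the inflexibility noted above: all four bit combinations are already in use, so introducing a fifth state would require redesigning the pair and every transition that reads it.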