Computer-aided techniques are known to include Computer-Aided Design or CAD, which relates to software solutions for authoring product design. Similarly, CAE is an acronym for Computer-Aided Engineering, i.e. it relates to software solutions for simulating the physical behavior of a future product. CAM stands for Computer-Aided Manufacturing and typically includes software solutions for defining manufacturing processes and operations.
In computer-aided techniques, the graphical user interface (GUI) plays an important role in the efficiency of the technique. Most of the operations required for manipulating and/or navigating the modeled objects may be performed by the user (e.g. the designer) on the GUI. In particular, the user may create, modify, and delete the modeled objects forming the product, and also explore the product so as to comprehend how modeled objects are interrelated, e.g. via a product structure. Traditionally, these operations are carried out through dedicated menus and icons located on the sides of the GUI. More recently, CAD systems such as CATIA allow these functions to be called near the representation of the product. The designer no longer needs to move the mouse towards menus and icons; operations are thus available within reach of the mouse. In addition, the operations behave semantically: for a given operation selected by the designer, the CAD system may suggest to the designer, still near the mouse, a set of new operations that the designer is likely to select next, based on the previously selected operation.
Also known are Product Lifecycle Management (PLM) solutions, which refer to a business strategy that helps companies to share product data, apply common processes, and leverage corporate knowledge for the development of products from conception to the end of their life, across the concept of extended enterprise. By including the actors (company departments, business partners, suppliers, Original Equipment Manufacturers (OEM), and customers), PLM may allow this network to operate as a single entity to conceptualize, design, build, and support products and processes.
Some PLM solutions make it possible, for instance, to design and develop products by creating digital mockups (a 3D graphical model of a product). The digital product may be first defined and simulated using an appropriate application. Then, the lean digital manufacturing processes may be defined and modeled.
The PLM solutions provided by Dassault Systemes (under the trademarks CATIA, ENOVIA and DELMIA) provide an Engineering Hub, which organizes product engineering knowledge, a Manufacturing Hub, which manages manufacturing engineering knowledge, and an Enterprise Hub, which enables enterprise integrations and connections into both the Engineering and Manufacturing Hubs. Altogether, the system delivers an open object model linking products, processes, and resources to enable dynamic, knowledge-based product creation and decision support that drives optimized product definition, manufacturing preparation, production and service.
Such PLM solutions comprise a relational database of products. The database comprises a set of textual data and relations between the data. Data typically include technical data related to the products; said data are ordered in a hierarchy and indexed so as to be searchable. The data are representative of the modeled objects, which are often modeled products and processes.
Product lifecycle information, including product configuration, process knowledge and resources information, is typically intended to be edited in a collaborative way.
A number of systems and programs are thus offered on the market for the design of objects (or parts) or assemblies of objects, forming a product, such as the one provided by Dassault Systemes under the trademark CATIA.
These CAD systems allow a user to construct and manipulate complex three-dimensional (3D) models, or two-dimensional (2D) models, of objects or assemblies of objects. CAD systems thus provide a representation of modeled objects using edges or lines, in certain cases with faces. Lines or edges may be represented in various manners, e.g. non-uniform rational B-splines (NURBS). These CAD systems manage parts or assemblies of parts as modeled objects, which are mostly specifications of geometry. Specifically, CAD files contain specifications, from which geometry is generated, which in turn allows for a representation to be generated. Geometry and representation may be stored in a single CAD file or multiple ones. CAD systems include graphic tools for representing the modeled objects to the designers; these tools are dedicated to the display of complex objects. The typical size of a file representing an object in a CAD system is in the range of one Megabyte per part, and an assembly may comprise thousands of parts. A CAD system manages models of objects, which are stored in electronic files.
2D or 3D models created by a user with CAD software thus contain geometric objects such as points, vectors, curves, surfaces and meshes. These objects are usually represented with floating-point values as well as other data types.
A floating point value is a value of a data type used to represent a number which belongs to the real numbers (in the mathematical sense). One of the most widely used standard formats for floating point values is the double-precision floating point defined in the IEEE 754 format standard, more particularly the IEEE 754-1985. In this format, a floating point value a representing real number ã is defined by a sign, an exponent and a mantissa on 64 bits. If a is a 64-bit floating point value in the IEEE 754 standard, we can write a=(s, e, m) with the following components: the sign s (integer coded on 1 bit), the exponent e (integer coded on 11 bits), and the mantissa m (integer coded on 52 bits). Then, by definition of the standard, if $0 < e < 2^{11}-1$, a is said to be normalized and represents the real number
$$\tilde{a} = (-1)^{s} \cdot 2^{\,e-\mathrm{bias}} \cdot \left(1 + \frac{m}{2^{52}}\right) \quad \text{with} \quad \mathrm{bias} = 2^{11-1} - 1 = 1023.$$
If e=0 and m=0, then a is said to be a zero and represents the real number ã=0. If e=0 and m is different from 0, then a is said to be denormalized and represents the real number
$$\tilde{a} = (-1)^{s} \cdot 2^{\,1-\mathrm{bias}} \cdot \frac{m}{2^{52}} \quad \text{with} \quad \mathrm{bias} = 2^{11-1} - 1 = 1023.$$
If $e = 2^{11}-1$, then a is said to be invalid and does not represent any number.
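As an illustration of the definitions above, the following minimal Python sketch splits a 64-bit floating point value into its components (s, e, m) and rebuilds the represented real number from the normalized and denormalized formulas. The function names `split_double` and `real_value` are hypothetical and chosen only for this example.

```python
import struct

BIAS = 2 ** (11 - 1) - 1  # 1023, as in the formulas above


def split_double(x):
    """Return (s, e, m): 1-bit sign, 11-bit exponent, 52-bit mantissa of x."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))
    s = bits >> 63
    e = (bits >> 52) & 0x7FF
    m = bits & ((1 << 52) - 1)
    return s, e, m


def real_value(s, e, m):
    """Rebuild the represented real number from (s, e, m)."""
    if e == 2 ** 11 - 1:
        # "Invalid" in the terminology of the text: no number is represented.
        raise ValueError("e = 2^11 - 1 does not represent a number")
    sign = -1.0 if s else 1.0
    if e == 0:
        # Zero (m == 0) or denormalized value (m != 0).
        return sign * 2.0 ** (1 - BIAS) * (m / 2 ** 52)
    # Normalized value.
    return sign * 2.0 ** (e - BIAS) * (1 + m / 2 ** 52)


if __name__ == "__main__":
    for x in (1.5, -2.5, 0.0, 5e-324):
        s, e, m = split_double(x)
        print(x, (s, e, m), real_value(s, e, m) == x)
```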
A basic functionality provided by CAD software is the ability to store on a persistent support (e.g. in a file on the local disk, or on a server) the models created or modified by the user during a first session, and to allow these models to be reopened later for further use. The models can for instance be opened later in a second session of the same software, possibly with a different version of this software or on another platform. The platforms can differ in terms of hardware (different CPU) or in terms of software (different language compiler or interpreter). The model in the second session after opening should be exactly the same as the model in the first session before storing. Therefore, the storing must be lossless (i.e. involve no loss of information) and stable across different platforms (i.e. such that the opening of the model provides the same result on different platforms which support the data types used to define the model).
Stability issues appear when the storing and the reopening of the model involve transforming the data stored to define the model (e.g. by compressing and decompressing the data), particularly if the transformation of the model involves arithmetic operations. Indeed, different platforms provide different results for the same operations depending on the data type. For example, if a, b and c are floating point values, then some platforms will compute the operation a+b+c as (a+b)+c while some other platforms will compute the same operation as a+(b+c), which will not necessarily lead to the same result. Moreover, floating point arithmetic involves intermediate values to perform the computations. These intermediate values do not have the same bit length on different platforms, which leads to different results. Thus, the same floating point operations performed on different platforms can lead to different results although the operations are performed on the same data. The document “What Every Computer Scientist Should Know About Floating-Point Arithmetic”, ACM Computing Surveys, Vol. 23, No. 1, March 1991, by David Goldberg, presents issues related to operations on floating point values. In the following, stability will be said to be ensured for an operation (or a series of operations) if the operation or operations lead to the same result on any regular platform.
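The dependence on evaluation order can be observed on a single machine. The short Python sketch below evaluates both groupings of a+b+c in double precision and compares the resulting bit patterns. The values 0.1, 0.2 and 0.3 and the helper `bits_of` are illustrative choices, not taken from the cited document. The sketch shows only that the grouping matters; it does not reproduce an actual cross-platform discrepancy.

```python
import struct


def bits_of(x):
    """Return the 64-bit pattern of x as 16 hexadecimal digits."""
    return struct.pack(">d", x).hex()


a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c   # grouping used by "some platforms"
right = a + (b + c)  # grouping used by "some other platforms"

print("equal results:", left == right)
print("(a+b)+c bits:", bits_of(left))
print("a+(b+c) bits:", bits_of(right))
```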
Models can be stored on the persistent support in a straightforward implementation, i.e. without compression. In this implementation, the floating point values and other data defining a given geometric object are stored as such. This straightforward method (i.e. without compression) is notably used in CATIA and in other CAD software. With such a method, the storing is lossless. Indeed, the data defining the model is not modified before storing, and there can therefore not be any loss of data. The storing is also stable. Indeed, the data defining the model is not to be transformed when the model is reopened, because the data is not compressed. However, such a method fails to optimize the storage size of a CAD model.
In the field of data compression in general, delta-encoding is a way of compressing data that represents a target object by storing the difference between the target object and a known reference object, instead of the target object itself. This is advantageous if the difference can be stored in less space than the target object.
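A minimal sketch of delta-encoding on a list of integers follows. It assumes, for illustration only, that the reference object for each value is simply the preceding value in the list; the names `delta_encode` and `delta_decode` are hypothetical.

```python
def delta_encode(values):
    """Keep the first value, then store each value minus its predecessor."""
    if not values:
        return []
    encoded = [values[0]]
    for previous, current in zip(values, values[1:]):
        encoded.append(current - previous)
    return encoded


def delta_decode(encoded):
    """Rebuild the original values by accumulating the stored differences."""
    decoded = []
    for index, delta in enumerate(encoded):
        decoded.append(delta if index == 0 else decoded[-1] + delta)
    return decoded


if __name__ == "__main__":
    data = [1000, 1003, 1007, 1012]
    print(delta_encode(data))
    print(delta_decode(delta_encode(data)) == data)
```

The differences are smaller in magnitude than the original values, so they can be stored in fewer bits by a subsequent entropy or variable-length coder (not shown here).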
Predictive encoding is a variant of delta-encoding where the reference is not an actual object taken among the data but is computed from one or several actual objects using a predictor function. Thus, the predictor function predicts a reference object from actual objects. Instead of storing the target object as such (i.e. without compression), the difference between the predicted reference object and the target object is stored. The closer the prediction is to the current object, the smaller the difference is, and therefore the less storage space it takes to store the difference. The efficiency of the compression thus depends on the accuracy of the prediction.
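The idea can be sketched as follows on integers, assuming a simple linear predictor that extrapolates from the two preceding values. The predictor and the function names are illustrative assumptions; the text above does not prescribe a particular predictor function.

```python
def predict(prev2, prev1):
    """Hypothetical linear predictor: extrapolate from the last two values."""
    return 2 * prev1 - prev2


def predictive_encode(values):
    """Store the first two values as such, then only the prediction errors."""
    encoded = list(values[:2])
    for i in range(2, len(values)):
        encoded.append(values[i] - predict(values[i - 2], values[i - 1]))
    return encoded


def predictive_decode(encoded):
    """Re-run the same predictor and add back the stored errors."""
    decoded = list(encoded[:2])
    for residual in encoded[2:]:
        decoded.append(predict(decoded[-2], decoded[-1]) + residual)
    return decoded


if __name__ == "__main__":
    data = [10, 20, 30, 41, 52]
    print(predictive_encode(data))
    print(predictive_decode(predictive_encode(data)) == data)
```

Because the decoder must recompute exactly the same predictions as the encoder, any platform-dependent behavior in `predict` would break the round trip; with integer arithmetic, as here, this does not arise.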
Quantization is another compression technique. Quantization is used for the compression of data comprising floating point values, possibly in combination with delta-encoding or predictive encoding. Quantization is the process of mapping floating point values to integers. Quantization produces a loss of data as it involves truncating the least significant bits of some floating point values.
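The loss of data can be made concrete with a small Python sketch. It assumes uniform quantization with a fixed step and rounding to the nearest integer, which is one common form of quantization and not necessarily the one used in the documents cited below; `quantize` and `dequantize` are hypothetical names.

```python
def quantize(x, step):
    """Map a floating point value to the nearest integer multiple of step."""
    return round(x / step)


def dequantize(q, step):
    """Map the stored integer back to an approximate floating point value."""
    return q * step


if __name__ == "__main__":
    x = 0.123456789
    step = 1e-3
    q = quantize(x, step)
    restored = dequantize(q, step)
    print(q, restored, restored == x)
```

The restored value differs from the original by up to half a step, so the original floating point value cannot be recovered exactly.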
The article “Higher Bandwidth X” (1994) and the Ph.D. Thesis “Compressing the X Graphics Protocol” (1995) by John Danskin describe a way of compressing geometry using “relative coordinates”, which is a form of delta-encoding. However, the geometry is defined with integer coordinates. The method is therefore inappropriate for CAD models, whose geometry is defined with a higher level of precision.
The article “Geometry Compression” (1995) by Michael Deering describes the compression of triangular meshes using quantization of floating-point numbers and delta-encoding between neighbors. The article “Triangle Mesh Compression” (1998) by Costa Touma and Craig Gotsman also describes the compression of triangular meshes, but using quantization and predictive encoding. The articles “Geometric Compression Through Topological Surgery” (1998) by Gabriel Taubin and Jarek Rossignac and “Compressing Polygon Mesh Geometry with Parallelogram Prediction” (2002) by Martin Isenburg and Pierre Alliez describe a similar approach with a different prediction scheme. The prediction is computed by a linear combination of other points in the mesh. All these methods notably present the shortcoming of producing a lossy compression because of quantization.
The article “Out-of-core Compression and Decompression of Large n-dimensional Scalar Fields” (2003) by L. Ibarria et al. describes a prediction encoding method for floating point data. The prediction function involves floating point arithmetic computation and thus has the shortfall that stability is not guaranteed across different platforms. Indeed, as mentioned above, floating point arithmetic computations do not produce the same result on different platforms.
The article “Lossless Compression of Floating-Point Geometry” (2004) by Isenburg et al. describes a prediction encoding method for floating point data. As above, stability is not guaranteed across different platforms.
U.S. Pat. No. 5,793,371, U.S. Pat. No. 5,825,369, U.S. Pat. No. 5,842,004, U.S. Pat. No. 5,867,167, U.S. Pat. No. 5,870,094, U.S. Pat. No. 5,905,502, U.S. Pat. No. 5,905,507, U.S. Pat. No. 5,933,153, U.S. Pat. No. 6,047,088, U.S. Pat. No. 6,167,159, U.S. Pat. No. 6,215,500, U.S. Pat. No. 6,239,805, U.S. Pat. No. 6,522,327, U.S. Pat. No. 6,525,722, and U.S. Pat. No. 6,532,012 describe similar methods and none of these documents addresses the issue of stability.
The article “Lossless Compression of High-volume Numerical Data from Simulations” (2000) by Engelson et al. describes compression of floating point values. The values do not represent geometric objects, but rather form a sequence of values that changes smoothly and is parameterized by a given variable. The article provides an example of a sequence of three values (a1, a2, a3) which is a linearly growing sequence (i.e. a3−a2≈a2−a1). A simple prediction encoding scheme for this sequence would be to take ap3=a2+(a2−a1) as the prediction for a3. The difference between the actual value and the predicted value is Δ2a3=a3−ap3=(a3−a2)−(a2−a1). As the sequence is linear, the prediction is good and Δ2a3 is small: (a1, a2, Δ2a3) can be stored, which takes less storage size than the original sequence. A problem, noted by the article, is that the computation of the prediction involves floating point arithmetic operations, which prevents stability on different platforms. The article thus introduces the notion of the integer representation of a floating-point number. If p is a floating-point 64-bit number, its integer representation Int(p) is defined as the integer that is represented by the same 64-bit string as p. The integer representations are defined as b1=Int(a1), b2=Int(a2), b3=Int(a3). The compression then applies the prediction-encoding described above on the sequence (b1, b2, b3) and stores (b1, b2, Δ2b3) with Δ2b3=(b3−b2)−(b2−b1). With the aim of guaranteeing stability, the document thus suggests performing the following steps: converting m consecutive floating point values into their integer representations and computing a sequence of classical integer subtractions on these integer representations.
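The scheme described above can be sketched in Python for a sequence of three values. This is an illustrative reading of the description, not the article's implementation: the names `int_repr`, `float_from_int`, `encode3` and `decode3` are hypothetical, and the 64-bit string is interpreted here as an unsigned integer, which is an assumption.

```python
import struct


def int_repr(p):
    """Int(p): the integer with the same 64-bit string as p (unsigned here)."""
    return struct.unpack(">Q", struct.pack(">d", p))[0]


def float_from_int(b):
    """Inverse of int_repr: reinterpret a 64-bit integer as a double."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]


def encode3(a1, a2, a3):
    """Return (b1, b2, second-order integer difference for b3)."""
    b1, b2, b3 = int_repr(a1), int_repr(a2), int_repr(a3)
    return b1, b2, (b3 - b2) - (b2 - b1)


def decode3(b1, b2, delta2_b3):
    """Recover the three floating point values using integer arithmetic only."""
    b3 = delta2_b3 + 2 * b2 - b1
    return float_from_int(b1), float_from_int(b2), float_from_int(b3)


if __name__ == "__main__":
    encoded = encode3(1.0, 1.25, 1.5)
    print(encoded[2])
    print(decode3(*encoded))
```

Only integer subtractions and additions are involved, so encoding and decoding give the same bits on any platform; the three example values share the same exponent, which is the favorable case for this scheme.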
However, with some numerical values, the method of Engelson et al. is highly inefficient. For example, using the notations of the document, the sequence of floating point values (a1=1.5, a2=2.0, a3=2.5) is considered (the floating point values are here referred to by the real number that they represent). This sequence is linear and the prediction-encoding scheme described above should theoretically be applied very efficiently on these floating point numbers, using floating point arithmetic, because the difference Δ2a3 is exactly 0. If one however applies the method of the article, then Δ2b3=(b3−b2)−(b2−b1) has 51 significant bits. This is due to the fact that the integer difference between the integer representations of two floating point values does not only depend on the floating point difference between the two floating point values but also on the two floating point values themselves. In the above example, b3−b2 is different from b2−b1 because the floating point representations of 2.5 and 2.0 have the same exponent but the floating point representation of 1.5 does not have the same exponent. The efficiency of the compression is therefore not satisfactory. Thus, a first shortfall of this method is that it does not work well enough (i.e. the compression rate is not high enough) on some types of sequences. The article further states: “The fixed step difference algorithm works well if the sequence Int(a) can be approximated by polynomials”. In real applications, however, it would be better to assume that a can be approximated by polynomials. Thus in real applications the method of performing integer difference on the sequence Int(a) would lead to bad predictions.
Another shortfall is that the computation of Δ can only use subtraction, not other operations. For example, multiplication and division cannot be applied to the integer representations. In other words, the difference between the integer representations of two floating point numbers may be representative of the difference between the two floating point numbers in some cases (with at least the exception described earlier), but the multiplication (or division, or addition) of the integer representations is not representative of the multiplication (or division, or addition) of the two floating point numbers. For the addition, for example, this is notably because the exponents of the two floating point values transformed into their integer representations would be added. This severely limits the prediction schemes that can be used. Thus, the prediction accomplished is not as accurate as possible and the compression rate is impacted.
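The example (1.5, 2.0, 2.5) can be checked with a short Python sketch. The helper `int_repr` is a hypothetical name for Int(p), with the 64-bit string interpreted as an unsigned integer (an assumption made for this illustration).

```python
import struct


def int_repr(p):
    """Int(p): the integer with the same 64-bit string as p (unsigned here)."""
    return struct.unpack(">Q", struct.pack(">d", p))[0]


a1, a2, a3 = 1.5, 2.0, 2.5
b1, b2, b3 = int_repr(a1), int_repr(a2), int_repr(a3)

# Residual of the linear prediction computed with floating point arithmetic.
float_residual = (a3 - a2) - (a2 - a1)

# Residual of the same prediction computed on the integer representations.
int_residual = (b3 - b2) - (b2 - b1)

print("floating point residual:", float_residual)
print("b3 - b2 =", hex(b3 - b2), " b2 - b1 =", hex(b2 - b1))
print("integer residual bit length:", abs(int_residual).bit_length())
```

The step from 1.5 to 2.0 crosses an exponent boundary, so the two integer differences are unequal even though the floating point differences are both 0.5.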
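The point about addition can be illustrated with a minimal Python sketch, using hypothetical helpers `int_repr` and `float_from_int` and an unsigned interpretation of the 64-bit string (both assumptions of this example). Adding the integer representations of 1.0 and 1.0 adds their exponent fields, so the result, read back as a floating point value, is unrelated to 1.0+1.0.

```python
import struct


def int_repr(p):
    """Int(p): the integer with the same 64-bit string as p (unsigned here)."""
    return struct.unpack(">Q", struct.pack(">d", p))[0]


def float_from_int(b):
    """Reinterpret a 64-bit unsigned integer as a double."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]


x = y = 1.0
float_sum = x + y
int_sum_as_float = float_from_int(int_repr(x) + int_repr(y))

print("floating point sum:", float_sum)
print("sum of integer representations, read as a double:", int_sum_as_float)
```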
The article “Fast Lossless Compression of Scientific Floating-Point Data” (2006) by Ratanaworabhan et al. and the article “Fast and Efficient Compression of Floating-Point Data” (2006) by Peter Lindstrom and Martin Isenburg describe similar techniques with the same shortfalls.
It is an aim of the invention to provide a method which is suitable to efficiently reduce the storage size of a CAD file. Such a solution would reduce the cost of the storage infrastructure and increase the speed of sending or receiving CAD models over a network.