This invention relates to expression languages used in database technologies.
Many software products require the user to define rules, queries or conditions using an expression language. A few examples include: reporting tools that may allow their users to define database queries specifying the contents of a report; ETL (Extract Transform Load) tools, such as the DataStage tool, provided by International Business Machines (IBM) Corporation of Armonk, N.Y., that allow users to enter expressions to filter or transform a data set; and data profiling tools, such as the AuditStage, also provided by IBM, that allow users to define business rules describing constraints that the data should verify in order to fulfill certain data quality requirements.
A typical problem is that the expression languages used in products of this type often have a rich and complex syntax. The reason for the rich and complex syntax is that the expression languages must be flexible enough to cover all features that a user may need. Examples of expression languages include SQL (Structured Query Language), MDX (Multi Dimensional Expression), and various types of proprietary languages.
Often, the users who must define the expressions are not always skilled in programming languages. Users of reporting or data profiling tools are typically business users and not programmers. Even for programmers, the task of defining expressions can be tedious and error prone, since the product cannot always rely on a standardized language and a proprietary syntax must be introduced, which the user must learn.
Finally, sometimes the definition of a rule or a query may be defined in different steps by different users. For example, a business user may first define a high-level definition of the expression in plain English, and then, in a second step, a programmer may implement the technical details of the expression.
In all of these cases it may be challenging to find a comfortable method allowing users to define expressions using the full possibilities of the expression language, without requesting the users to learn a complex syntax or acquire programming skills.
Different solutions have been developed over the last few years in attempts to solve the above problems and make the software easier to use for business users. One common solution is to use a free-form text field, with some content assistance. That is, the user must still have some basic knowledge of the syntax of the expression language, but while he types the expression, an editor makes suggestions about the content. Features like code completion, syntax coloring, pop-up hints, error highlighting, and so on, are common in these programming tools.
Code templates can be also used to generate a “skeleton” of the expression and provide a starting point for the user. While this method is of great use for a programmer and does not set any limits for the complexity of the expression, the method is not very suitable for a pure business user with little programming language experience. The content assistance only provides hints, but does not require that the user respect the language constraints or prevent the user from making errors. Furthermore, once an error is contained in an expression, it may be difficult for the user to find the error. The user also still has to face the syntax of the language.
Another common solution is to use a table or tree form editor instead of a free form text editor. With such an editor, the user is no longer free to type any kind of expression, like in the above free form editor. Instead, the editor constraints the edits. A table editor can make an assumption about the format of the expression and force the user to enter each part of the expression in a different cell of a table, by choosing a possible value from a list of possibilities. For example an expression like: “IF age<18 THEN status=‘child’” may be edited in a table containing 6 columns where the user enters successively “age”, “<”, “18”, “status”, “‘child’”. A tree editor may allow the expression to be edited by adding nodes to a tree. Both techniques prevent the user from syntax errors, but significantly reduce the usability or flexibility of the editor.
A table based editor makes the assumption that an expression always contains the same number of elements in the same order. It is not easy to define an editor that would allow entering of a simple expression, such as “IF age<18 THEN status=‘child’”, but also a more complex expression, such as “IF ((col1+col2)/2)>50 and abs(col3)>10 then rtrim(col4)=‘abc’”. The tree editor may allow editing rules with different degrees of complexity, but is not intuitive for the user, as the tokenization of an expression in a tree of elements and sub-elements is not a natural way for the user to enter or read an expression.
Another approach is to use a specialized graphical editor. For instance some SQL query builders are based on a graphical canvas, where the user can drag and drop tables, check/uncheck a list of columns and connect the objects graphically to define join. Generally, these editors are specialized for a specific type of expressions (for example, such an editor may allow editing a SQL query, but not an update statement). Furthermore, graphical editors often have to rely on free form editors or table editors to define elements of their query (for example, a join condition). Another drawback is that their usage must be learned and is completely different from a text based solution. Skilled users who are familiar with the syntax of the expression languages often have difficulties using such tools.