1. Field of the Invention
The invention relates to the field of multitasking applications targeted at embedded processors.
2. Description of the Prior Art
The functional complexity of embedded software or software which is dedicated to a special purpose continues to rise due to a number of factors such as consumer demand for more functionality, sophisticated user interfaces, seamless operation across multiple communication and computation protocols, need for encryption and security, and so on. Consequently, the development of embedded software poses a major design challenge. At the same time, the elevated level of abstraction provided by a high-level programming paradigm immensely facilitates a short design cycle, fewer design errors, design portability, and intellectual property reuse.
In particular, the concurrent programming paradigm is an ideal model of computation for design of embedded systems, which often encompass inherent concurrency. An embedded system is a special-purpose computer system, which is completely encapsulated by the device it controls. Concurrency is concerned with the sharing of common resources between computations that execute overlapped in time, including running in parallel. This often entails finding reliable techniques for coordinating their execution, exchanging data, allocating memory, and scheduling processing time in such a way as to minimize response time and maximize throughput. Concurrent systems such as operating systems are designed to operate indefinitely and not terminate unexpectedly.
Furthermore, embedded systems often have stringent performance requirements (e.g., timing, energy, etc.) and, consequently, require a carefully selected and performance tuned embedded processor to meet specified design constraints. In recent years, a plethora of highly customized embedded processors have become available. For example, Tensilica provides a large family of highly customized application-specific embedded processors (a.k.a., the Xtensa). Likewise, ARM and MIPS provide several derivatives of their respective core processors, in an effort to provide to their customers an application-specific solution. These embedded processors ship with cross-compilers and the associated tool chain for application development. A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the cross compiler is run. Such a tool is needed when code must be compiled for a platform to which there is no access, or on which it is inconvenient or impossible to compile, as is the case with embedded systems.
However, to support a multitasking application development environment, there is a need for an operating system (OS) layer that can support task creation, task synchronization, and task communication. Such OS support is seldom available for each and every variant of the base embedded processor. In part, this is due to the lack of system memory and/or sufficient processor performance (e.g., in the case of microcontrollers such as the Microchip PIC and the Philips 8051) coupled with the high performance penalty of having a full-fledged OS.
Additionally, manually porting and verifying an OS to every embedded processor available is costly in terms of time and money, and there is no guarantee of correctness. Thus, there exists a gap in technology in relation to creating a multitasking application targeted at a particular embedded processor.
The problem of multitasking support is typically solved using an operating system (OS) layer. The OS will maintain information about each task that is running, and will share the processor among the running tasks. Such OS support imposes performance and memory overheads on the application, usually slowing down the execution. Moreover, the OS infrastructure is generic, designed to perform reasonably well across multiple applications, and must be manually ported to run on different processors. The porting process is long, costly, and could introduce further bugs in the software.
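By way of illustration only, the bookkeeping described above can be sketched in a few lines of Python. The sketch assumes cooperative tasks modeled as generators and a round-robin sharing policy; the names counter and run_round_robin are hypothetical and are not taken from any particular OS.

```python
from collections import deque


def counter(name, steps, log):
    """A task: each yield hands the processor back to the scheduler."""
    for i in range(steps):
        log.append((name, i))
        yield


def run_round_robin(tasks):
    """Share one processor among the tasks in round-robin order."""
    ready = deque(tasks)  # per-task state kept by the OS layer
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task until its next yield
            ready.append(task)  # still alive: return it to the ready queue
        except StopIteration:
            pass                # task finished: drop it


log = []
run_round_robin([counter("A", 2, log), counter("B", 3, log)])
print(log)
```

Even in this reduced form, each task switch costs a queue operation and a dispatch that the application itself did not ask for, which is the kind of overhead referred to above.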
As for automation, there are two approaches that propose solutions for automatically handling the execution of multitasking code. One of them is called a “template-based approach”, where an OS infrastructure is derived from a generic OS with only the constructs needed by the application. It generates a trimmed down OS based on the results of the analysis of the application code. This is a generic approach, which is clearly not the best for embedded systems design.
The second approach is static scheduling. With static scheduling, it is possible to solve the class of problems with a static, a priori known set of tasks. It is an automated solution that generates efficient code. However, the input is restricted, as not all generally used constructs are allowed. Moreover, the set of tasks has to be known beforehand; therefore, dynamic tasks are not supported. A task as used here is an execution path through address space. In other words, a set of program instructions is loaded in memory. The address registers have been loaded with the initial address of the program. At the next clock cycle, the CPU will start execution in accord with the program. The sense is that some part of a plan is being accomplished. As long as the program remains in this part of the address space, the task can continue, in principle, indefinitely, unless the program instructions contain a halt, exit, or return. In the computer field, ‘task’ has the sense of a real-time application, as distinguished from a process, which takes up space (memory) and execution time.
Finally, the serialization process, i.e., the conversion of an object instance to a data stream of byte values in order to prepare it for transmission, might generate more than one task in the generated code, requiring extra infrastructure to manage the generated tasks. This extra infrastructure is not automatically generated, and it is up to the designer to manually select and port whichever one is judged most appropriate.
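As a minimal sketch of serialization in the sense defined above, the following Python fragment converts a small record to a stream of byte values and back. The record layout (a 2-byte identifier followed by a 4-byte float, little-endian) and the function names are assumptions chosen for illustration only.

```python
import struct

# Hypothetical record layout: unsigned 16-bit id, 32-bit float, little-endian.
SENSOR_FORMAT = "<Hf"


def serialize(sensor_id, value):
    """Convert the record to a byte stream ready for transmission."""
    return struct.pack(SENSOR_FORMAT, sensor_id, value)


def deserialize(data):
    """Recover the (sensor_id, value) pair from the byte stream."""
    return struct.unpack(SENSOR_FORMAT, data)


payload = serialize(7, 1.5)
print(payload.hex(), deserialize(payload))
```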
More specifically, there are three categories of prior art approaches that partially address the multitasking problem for embedded processors stated above, namely, a class of virtual machine (VM) based techniques, a class of template based OS generation techniques, and a class of static scheduling techniques. An understanding of each of these will assist in understanding the differences provided by the invention as described in the detailed description of the preferred embodiments below.
Consider first the VM based techniques. In the VM based techniques, an OS providing a multitasking execution environment is implemented to run on a virtual processor. A compiler for the VM is used to map the application program onto the VM. The virtual processor is in turn executed on the target processor. Portability here is achieved by porting the VM to the desired target embedded processor. Porting is the adaptation of a piece of software so that it will function in a computing environment different from that for which it was originally written. Porting is usually required because of differences in the central processing unit, operating system interfaces, different hardware, or because of subtle incompatibilities in, or even complete absence of, the programming language used on the target environment.
The advantage of this class of techniques is that the application and OS code do not require recompilation when moving to a different embedded processor. The disadvantage of this class of techniques is the significant performance penalty (i.e., speed, energy, and memory footprint) incurred by the VM layer, and specifically the VM instruction set interpreter. Moreover, the porting of the VM to the target embedded processor may require more than recompilation efforts. Examples of such VM based techniques are Java and C#. Research in this area tries to address the above-mentioned disadvantages by proposing customized VMs for embedded applications or just-in-time (JIT) compilation techniques.
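The interpretation overhead noted above can be made concrete with a minimal sketch of a VM instruction set interpreter. The four-opcode stack machine below is hypothetical and does not correspond to the Java or C# virtual machines; it only shows that every VM instruction costs a fetch, decode, and dispatch on the target processor in addition to the useful work.

```python
def run_vm(program):
    """Interpret (opcode, operand) pairs on a tiny, hypothetical stack machine."""
    stack = []
    pc = 0
    dispatched = 0  # number of fetch/decode/dispatch rounds performed
    while pc < len(program):
        op, arg = program[pc]
        dispatched += 1
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "MUL":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode {op!r}")
        pc += 1
    return stack, dispatched


# (2 + 3) * 4 expressed in the hypothetical instruction set.
program = [
    ("PUSH", 2), ("PUSH", 3), ("ADD", None),
    ("PUSH", 4), ("MUL", None), ("HALT", None),
]
result, dispatched = run_vm(program)
print(result, dispatched)
```

Only the interpreter loop needs to be ported to a new target; the program list is unchanged, which reflects the portability advantage described above.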
Consider now template based techniques. In the template-based OS generation techniques, a reference OS is used as a template in generating customized derivatives of the OS for particular embedded processors. This class of techniques mainly relies on inclusion or exclusion of OS features depending on application requirements and embedded processor resource availabilities. The disadvantage of this class of techniques is that no single generic OS template can be used in all of the embedded processors available. Instead, for optimal performance, a rather customized OS template must be made available for each different line or family of embedded processor. In addition, for each specific embedded processor within a family, an architecture model must be provided to the generator engine.
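The inclusion or exclusion of OS features can be sketched as follows, under stated assumptions. The feature table, its dependency lists, and its memory figures are invented for illustration; the function derive_os is hypothetical and stands in for a generator engine that is given both the application requirements and a resource limit of the target processor.

```python
# Hypothetical reference-OS template: feature -> (dependencies, RAM bytes).
TEMPLATE = {
    "scheduler": ((), 256),
    "semaphore": (("scheduler",), 64),
    "mailbox": (("semaphore",), 128),
    "timer": (("scheduler",), 96),
    "filesystem": (("mailbox", "timer"), 4096),
}


def derive_os(required, ram_budget):
    """Select the required features plus everything they depend on."""
    selected = set()
    pending = list(required)
    while pending:
        feature = pending.pop()
        if feature in selected:
            continue
        selected.add(feature)
        pending.extend(TEMPLATE[feature][0])  # pull in dependencies
    footprint = sum(TEMPLATE[name][1] for name in selected)
    if footprint > ram_budget:
        raise MemoryError(f"{footprint} bytes exceeds budget of {ram_budget}")
    return sorted(selected), footprint


features, footprint = derive_os({"mailbox"}, ram_budget=512)
print(features, footprint)
```

The sketch also makes the stated disadvantage visible: the table itself encodes one particular reference OS, so a different processor family would need a different table.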
In one example, the prior art approach used the SpecC language, a system-level language, as an input to a refinement tool. The refinement tool partitions the SpecC input into application code and OS partitions. The OS partition is subsequently refined to a final implementation. The mechanism used in this refinement is based on matching needed OS functionality against a library of OS functions. In a similar approach, it has been proposed to use a method based on an API providing OS primitives to the application programmer. This OS template is used to realize the subset of the API that is actually used in the application program. An API is an application program interface, a set of routines, protocols, and tools for building software applications. A good API makes it easier to develop a program by providing all the building blocks. A programmer puts the blocks together. Most operating environments, such as MS-Windows, provide an API so that programmers can write applications consistent with the operating environment. Although APIs are designed for programmers, they are ultimately good for users because they guarantee that all programs using a common API will have similar interfaces. This makes it easier for users to learn new programs.
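A minimal sketch of determining the subset of an API that is actually used in an application program is given below. It assumes, purely for illustration, that the application is written in Python and that the OS primitives are plain function calls; the primitive names in OS_API and the sample application are hypothetical and are not drawn from the cited approaches.

```python
import ast

# Hypothetical set of OS primitives exposed to the application programmer.
OS_API = {"task_create", "sem_wait", "sem_post", "msg_send", "msg_recv"}

# Hypothetical application program, analyzed but never executed.
APPLICATION_SOURCE = '''
def producer():
    msg_send("q", 1)
    sem_post("ready")

def main():
    task_create(producer)
    sem_wait("ready")
'''


def used_primitives(source, api):
    """Return the API primitives that the application source calls by name."""
    tree = ast.parse(source)
    called = {
        node.func.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
    }
    return sorted(called & api)


subset = used_primitives(APPLICATION_SOURCE, OS_API)
print(subset)
```

Here msg_recv is never called, so a generator working from this analysis would not need to realize it in the derived OS.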
Finally, it has also been proposed to provide an environment for OS generation similar to the previous approaches. Here, a library of parameterized OS components is used to synthesize the target OS given a system level description of the application program.
Turn now to the category of static scheduling techniques. In the static scheduling based techniques, it is assumed that the application program consists of a static and a priori known set of tasks. Given this assumption, it is possible to compute a static execution schedule (in other words, an interleaved execution order) and generate an equivalent monolithic program. The advantage of this class of approaches is that the generated program is application-specific and thus highly efficient. The disadvantage of this class of techniques is that dynamic multitasking is not possible.
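The idea of computing an interleaved order ahead of time and emitting a single sequential program can be sketched as follows. The task set, the step functions, and the simple round-robin interleaving are assumptions made for illustration; practical static schedulers derive the order from data dependencies rather than from position alone.

```python
def read_sensor(state):
    state["raw"] = 21


def scale(state):
    state["value"] = state["raw"] * 2


def log_value(state):
    state["log"].append(state["value"])


# Hypothetical, a priori known task set: task name -> ordered step names.
TASKS = {"sample": ["read_sensor"], "process": ["scale", "log_value"]}


def build_static_schedule(tasks):
    """Interleave the steps of a fixed task set before run time."""
    schedule = []
    longest = max(len(steps) for steps in tasks.values())
    for i in range(longest):
        for name, steps in tasks.items():
            if i < len(steps):
                schedule.append((name, steps[i]))
    return schedule


def emit_monolithic(schedule):
    """Generate source for one sequential function that follows the schedule."""
    lines = ["def monolithic(state):"]
    for name, step in schedule:
        lines.append(f"    {step}(state)  # from task {name}")
    return "\n".join(lines) + "\n"


schedule = build_static_schedule(TASKS)
source = emit_monolithic(schedule)
namespace = {"read_sensor": read_sensor, "scale": scale, "log_value": log_value}
exec(source, namespace)  # stands in for compiling the generated program
state = {"log": []}
namespace["monolithic"](state)
print(source)
print(state["log"])
```

The generated function contains no scheduler at all, which is the source of the efficiency noted above; it also has no way to admit a task that was not in TASKS when the schedule was built.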
In a more specific example, it has been proposed to use a technique that takes as input an extended C code that includes primitives for inter-task communication based on channels, or the routes followed by the information, as well as primitives for specifying tasks, and generates ANSI C code. The mechanism here is to model the static set of tasks using a Petri net and generate code simulating a correct execution order of the Petri net. A Petri net, also known as a place/transition net or P/T net, is one of several mathematical representations of discrete distributed systems. One important aspect to note in both prior art approaches is that the generated code could still be multitasking, thus requiring the existence of an OS layer that can schedule and manage the generated tasks.
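A place/transition net and the derivation of an execution order from it can be sketched briefly. The two-transition net below, modeling a writer and a reader that share a one-slot channel, is a hypothetical example and is not the model used by the cited technique; the rule of always firing the first enabled transition is likewise an assumption.

```python
# Hypothetical net: each transition lists the tokens it takes and gives.
TRANSITIONS = {
    "write": {"inputs": {"free": 1}, "outputs": {"channel": 1}},
    "read": {"inputs": {"channel": 1}, "outputs": {"free": 1}},
}


def enabled(marking, transition):
    """A transition is enabled when every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["inputs"].items())


def fire(marking, transition):
    """Return the marking reached by firing an enabled transition."""
    new = dict(marking)
    for place, n in transition["inputs"].items():
        new[place] -= n
    for place, n in transition["outputs"].items():
        new[place] = new.get(place, 0) + n
    return new


def simulate(marking, transitions, max_steps):
    """Fire the first enabled transition repeatedly and record the order."""
    order = []
    for _ in range(max_steps):
        ready = [name for name, t in transitions.items() if enabled(marking, t)]
        if not ready:
            break  # no transition can fire: the net is dead in this marking
        marking = fire(marking, transitions[ready[0]])
        order.append(ready[0])
    return order, marking


order, final = simulate({"free": 1, "channel": 0}, TRANSITIONS, 4)
print(order, final)
```

The recorded order is a valid firing sequence of the net, which is the kind of execution order a code generator would then turn into sequential code.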
Embedded software is characterized by a set of concurrent, deadline-driven, synchronized, and communicating tasks. Hence, embedded software is best captured using the real-time concurrent programming model. Therefore, there exists a gap between the desired programming abstractions (i.e., real-time concurrent programming model) and the default embedded platform programming abstractions (i.e., sequential programming model supported by an optimizing compiler from an embedded processor core vendor). The support for real time concurrent programming is usually provided by a real time operating system (RTOS). The RTOS is a software layer that runs between the user-level tasks and the embedded processor, controlling task execution, timing constraints, and access to devices, in addition to providing synchronization and communication facilities. Some commercially available RTOSs include eCos, VxWorks, and microC/OS.
In general, an RTOS is built as a generic framework which can be used across a large number of processors and applications. An RTOS provides coarse grained timing support, and is loosely coupled to the running tasks. As a result, an RTOS, in terms of resource usage efficiency and performance, is seldom optimized for any particular application. Additionally, the heavy-weight nature of an RTOS prohibits its use in applications where the underlying hardware platform is based on low-end microcontrollers.
Instead of relying on a “one-size-fits-all” template, what is needed is a solution that is able to optimize execution and resource usage.