At the simplest level, an operating system not only manages the hardware and software resources of a computer system but also provides a stable, consistent way for applications to deal with the hardware without having to know all of its details. The first task, managing the hardware and software resources, is important because, as processor frequencies, bus frequencies, and bus traffic increase, many programs and input methods compete for the attention of the central processing unit (CPU) and demand memory, storage and input/output (I/O) bandwidth for their own purposes. In this capacity, the operating system manages the processor so each application gets its necessary resources, ensures all applications are compatible, and, at the same time, allocates the limited capacity of the system to the greatest good of all the users and applications.
The second task, providing a consistent application interface, is especially important if there is more than one type of computer using the operating system or if the hardware of the computer ever changes. A consistent application program interface (API) allows a software developer to write an application on one computer and be confident that it will run on another computer of the same type, even if the quantity of memory is different on the two machines. An operating system can ensure that applications continue to run when hardware upgrades and updates occur because the operating system, not the application, manages the hardware and the distribution of its resources.
Operating systems may be categorized into four types based on the computers they control and the applications they support. The broad categories are: (1) real-time operating systems (RTOS); (2) single-user, single-task operating systems; (3) single-user, multitasking operating systems; and (4) multi-user operating systems. Real-time operating systems are specifically designed to manage the computer so that a particular operation executes in real time in precisely the same way every time. RTOSs are grouped according to the acceptable response time, whether seconds, milliseconds, or microseconds, and according to whether or not failure of the system can cause death. RTOSs are used to control medical systems, machinery, scientific instruments and industrial systems. An RTOS typically has little or no user-interface capability and no end-user utilities; an RTOS will be a sealed box when delivered for use. Embedded systems are combinations of processors and special software inside another device, such as the electronic ignition system on cars.
As the name implies, the next category, the single-user, single-task operating system, is much smaller and less capable so that it can fit into the limited memory of a handheld device and manage resources so that one user can effectively do one thing at a time. The Palm OS for Palm handheld computers is a good example of a modern single-user, single-task operating system, also called a handheld operating system.
A single-user, multitasking operating system is what most people use on their desktop and laptop computers today. OS/2, Windows, and the MacOS are examples of operating systems that allow a single user to have several programs in operation at the same time. Workstations are more powerful versions of personal computers, and while only one person typically uses a particular workstation, workstations often run a more powerful version of a desktop operating system and often have software associated with larger computer systems because of the more powerful hardware. Oftentimes, software programmers will install an integrated development environment on a workstation to develop new applications.
A multi-user operating system allows many different users to simultaneously take advantage of one computer's resources. Servers are computers or groups of computers used for internet serving, intranet serving, print serving, file serving, and/or application serving. Servers are also sometimes used as mainframe replacements. The multi-user operating system for these servers and other mainframe computers must balance the requirements of the various users to ensure that each program and each computer accessing the server, called a client, has sufficient and separate resources so that a problem with one user doesn't affect the entire community of users.
One way an operating system knows about the data in the computer is through the use of metadata. Metadata is data about data and is distinct from the data itself. Within the computer industry, the most common domain of metadata is the file system. Files contain data which has associated metadata. Metadata may be immutable or independent, and metadata may further be essential or nonessential. Immutable metadata changes only when the data itself changes. Independent metadata may change regardless of whether the data in the file also changes; for instance, changing the number of permissible users or changing a file's location does not necessarily change the actual data within the file. Creation date and possibly the last access date, assuming read-only access, also may change without changing the data. Independent metadata is the most common type of metadata, though not necessarily the most important type of metadata. Essential metadata is required to access a file; for example, a file's name, location, and size are essential metadata because the file cannot be used without them. Nonessential metadata is metadata that is not necessary to access the file, i.e., a file can exist in a traditional hierarchical file system in a useful manner without these pieces of metadata. Examples of nonessential metadata are the file's dates and the permissions or permissible users.
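The distinction between independent metadata and the data itself can be illustrated with a minimal Python sketch. It assumes a POSIX-style file system in which permission bits can be changed with os.chmod; the helper names snapshot and demo and the file name example.txt are hypothetical and chosen only for illustration. The sketch changes a file's permissions and then confirms that the file's contents and size are unchanged.

```python
import hashlib
import os
import stat
import tempfile


def snapshot(path):
    """Record the size, permission bits, and a content digest of a file."""
    st = os.stat(path)
    with open(path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    return {
        "size": st.st_size,                # tied to the data
        "mode": stat.S_IMODE(st.st_mode),  # independent metadata (permissions)
        "digest": digest,                  # fingerprint of the data itself
    }


def demo():
    """Change permissions only; return snapshots taken before and after."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.txt")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"hello")
        os.chmod(path, 0o644)
        before = snapshot(path)
        os.chmod(path, 0o600)  # metadata changes, data does not
        after = snapshot(path)
    return before, after


if __name__ == "__main__":
    before, after = demo()
    print("same data:", before["digest"] == after["digest"])
    print("mode before/after:", oct(before["mode"]), oct(after["mode"]))
```

On a file system that does not honor POSIX permission bits, the two modes may be reported as identical, so the sketch is only meaningful under the stated assumption.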
The operating system may manipulate a file's contents if it can find and access the file's name and the file's location. This essential metadata may be a combination of the host, disk, and directory structure where the file is located. A file may be uniquely selected by combining the file's name and the file's location into a single identifier, often called the path to the file. In a hierarchical file system, the combination of the file's name and location is the file's identifier. File names and locations vary only in length and possibly encoding, whether in American Standard Code for Information Interchange (ASCII), Unicode, MacRoman, Extended Binary-coded Decimal Interchange Code (EBCDIC), etc.
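As a brief illustration of combining a name and a location into a single identifier, the following Python sketch uses the standard pathlib module with POSIX-style paths. The function names make_identifier and split_identifier and the example location are hypothetical; only PurePosixPath and its parent and name attributes come from the standard library.

```python
from pathlib import PurePosixPath
from typing import Tuple


def make_identifier(location: str, name: str) -> PurePosixPath:
    """Combine a directory location and a file name into one path."""
    return PurePosixPath(location) / name


def split_identifier(path: PurePosixPath) -> Tuple[str, str]:
    """Recover the (location, name) pair from a path identifier."""
    return str(path.parent), path.name


# Hypothetical location and name, used only for illustration.
ident = make_identifier("/home/alice/docs", "report.txt")

if __name__ == "__main__":
    print(ident)
    print(split_identifier(ident))
```

PurePosixPath performs no disk access, so the sketch only demonstrates how the identifier is composed and decomposed, not whether such a file exists.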
The file has a size, even if it is zero, and the size of the file is essential metadata. File size is stored in units of memory, i.e., blocks, bytes, or bits. The extent of the file is often stored in the basic file system structures in the form of the starting and ending points of the file plus the path from one to the other, usually in the form of pointers between blocks of memory. Implementations vary, however, and at times the file size metadata is stored in a distinct location. Other implementations may store the end points.
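One way to picture an extent stored as a starting point, an ending point, and pointers between blocks is the following Python sketch. The block table layout (a mapping from block number to the bytes used in that block and the next block number) is an assumption made for illustration and does not describe any particular file system; the name file_size is likewise hypothetical.

```python
from typing import Dict, Optional, Tuple

# Assumed layout: block number -> (bytes used in block, next block or None)
BlockTable = Dict[int, Tuple[int, Optional[int]]]


def file_size(blocks: BlockTable, start: int, end: int) -> int:
    """Derive a file's size by walking block pointers from start to end."""
    total = 0
    current: Optional[int] = start
    while current is not None:
        used, nxt = blocks[current]
        total += used
        if current == end:
            return total
        current = nxt
    raise ValueError("end block not reachable from start block")


if __name__ == "__main__":
    # Three chained blocks: 7 -> 3 -> 9, the last one only partly filled.
    table: BlockTable = {7: (512, 3), 3: (512, 9), 9: (100, None)}
    print(file_size(table, start=7, end=9))
```

In this sketch the size is never stored as a separate value; it is recovered from the extent and the traversal path, which is why losing either one loses the size.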
File dates are nonessential metadata; there are many dates to associate with a file: creation date, date of last data modification, date of last metadata modification, date of last data access, etc. Modification date is immutable metadata.
The nature of the file's data is immutable metadata and refers to the file's type or content type, such as whether the file is executable, image, audio, video, text, etc. and/or specific file formats such as JPEG, AIFF, MPEG2, Microsoft Word, etc., or even very specific versions of particular formats. A file's type, by definition, cannot change unless the data itself also changes. One case where immutable metadata may change without requiring a change to the data itself is a change, either an increase or a decrease, in the accuracy of the metadata. For example, a file may have associated file type metadata that identifies it as a GIF image. At some point in the future, it may become known that the file is actually an interlaced GIF89a. Similarly, a file's modification date may be increased to millisecond accuracy. File type metadata is nonessential if the file's data can be retrieved and stored without knowledge of the file type metadata.
Early operating systems displayed file type metadata exactly as it was stored, such as with a handful of characters like TXT or COM. Remembering that TXT means text file was easy. Displaying file type metadata as stored, moreover, saved memory and effort from the CPU and the programmer. This file type metadata storage remained distinct but the information was displayed in its raw form. Subsequent operating systems incorporated file type metadata within the file identifier. In order to specify a file completely, it was necessary to provide the file's location, its name, and its type, which meant that several files could share the same name and location, provided they had different types. Thus, file name extensions were born. Although the application itself may only need the file's data, choosing which application to use depends on the file's type, format, or content type. For instance, the file may be an image file or an audio file and the user may be required to select the appropriate application, e.g., by opening the file from within an application. Broad file types like image or audio are useful for organizational purposes, but when it comes down to an application reading a file's data and correctly interpreting it, more specific file type metadata such as JPEG or WAV becomes necessary. Thus, not only the file type but the particular application must be available to the user if she/he is going to open and/or edit the file. The process of choosing which application to use to manipulate a particular file, called application binding, can be handled by the operating system. The user simply indicates a desire to open a file by double-clicking the file and the operating system looks at the file's type and chooses an appropriate application.
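Application binding based on file name extensions can be sketched in Python as a simple lookup. The binding table and the application names in it are hypothetical placeholders, and a real operating system may consult other file type metadata as well; the sketch only shows the extension-based case.

```python
import os
from typing import Dict, Optional

# Hypothetical binding table: extension -> application name.
BINDINGS: Dict[str, str] = {
    ".txt": "text-editor",
    ".jpg": "image-viewer",
    ".wav": "audio-player",
}


def bind_application(filename: str,
                     bindings: Dict[str, str] = BINDINGS) -> Optional[str]:
    """Pick an application from the file's extension, or None if unbound."""
    _, ext = os.path.splitext(filename)
    return bindings.get(ext.lower())


if __name__ == "__main__":
    for name in ("notes.TXT", "photo.jpg", "archive.xyz"):
        print(name, "->", bind_application(name))
```

Returning None for an unknown extension corresponds to the case in which the user must select the appropriate application manually.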
Metadata also includes file permissions which inform the operating system and the user who can read this file, who can write to the file, who can execute the file, etc. Permissions and ownership metadata are nonessential and are determined by the security model of the operating system: user/group id numbers, permission bit masks, access control lists, etc. Permissions are usually stored on file systems that are meant to be used with networked and/or multi-user operating systems. Because file date storage is so common, there is almost always a logical home for permissions to be stored with the file dates in the dedicated metadata structures of the file system. File ownership usually accompanies file permissions. Unix, for example, traditionally regulates file access by assigning rights to the file's owner, the file's group, and everyone else. In such an implementation, the permissions metadata is useless without the owner and group metadata.
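The traditional Unix owner, group, and everyone-else check can be sketched as follows. This is a simplified model under stated assumptions: it ignores the superuser and access control lists, and the function name may_access and the numeric user and group IDs are hypothetical. The permission bit constants and stat.filemode come from Python's standard stat module.

```python
import stat
from typing import Iterable

# Permission bits for (owner, group, other) for each kind of access.
_BITS = {
    "r": (stat.S_IRUSR, stat.S_IRGRP, stat.S_IROTH),
    "w": (stat.S_IWUSR, stat.S_IWGRP, stat.S_IWOTH),
    "x": (stat.S_IXUSR, stat.S_IXGRP, stat.S_IXOTH),
}


def may_access(mode: int, owner: int, group: int,
               uid: int, gids: Iterable[int], want: str) -> bool:
    """Simplified check: the first matching class (owner, group, other) decides."""
    usr, grp, oth = _BITS[want]
    if uid == owner:
        return bool(mode & usr)
    if group in gids:
        return bool(mode & grp)
    return bool(mode & oth)


if __name__ == "__main__":
    mode = 0o640  # owner: read/write, group: read, other: none
    print(stat.filemode(stat.S_IFREG | mode))
    print(may_access(mode, owner=1000, group=50, uid=1001, gids=[50], want="r"))
```

The sketch also shows why the permission bits are useless on their own: without the owner and group values there is no way to decide which of the three classes applies to a given user.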
Some immutable metadata, e.g., size, are woven into the fabric of the file system whereas other metadata, e.g., modification date, may be stored in the dedicated metadata structures of the file system. Independent, nonessential metadata such as file permissions, creation date, etc. have also been stored in the dedicated metadata area. In the earliest implementations of file systems that stored file type metadata, that metadata was stored, like all other metadata, in a distinct, but usually very small, file system structure.
Like all forms of information, metadata is easy to remove or ignore, but it is often difficult or impossible to add once it is lost. If a user no longer knows when a file was last modified, she/he cannot recover that piece of information despite the fact that the modification date is immutable metadata completely tied to the data itself. The data itself remains, but the information about the data is lost. To truly lose file size metadata, the file's extent must be lost. Thus, the extent combined with the traversal path is the actual storage mechanism for the size metadata.
The first step in any implementation of metadata is to decide how the metadata will be stored. A file's location may be stored in a distributed hierarchical manner, with each directory storing a list of all the items it contains. In order to access the file in a hierarchical file system, a user must already know the location of the file. From that point, a user may drill down the directory tree or drill up to the directory path that leads from the file to the file system root. In most common file system implementations, you must already know a file's name in order to read that piece of metadata.
Unlike hierarchical file systems, which use two pieces of information, name and location, as an identifier, nonhierarchical file and database designs use a single value by which a row in a table can be uniquely selected. This concept of a single, unique identifier is common in the world of relational databases and nonhierarchical file systems.
The core operating system functions, i.e., the management of the computer system, lie in the kernel of the operating system. The display manager is separate, though it may be inextricably tied to the kernel. The ties between the operating-system kernel and the user interface, utilities and other software define many of the differences in operating systems. Application program interfaces (APIs) let application programmers use functions of the computer and operating system without having to directly keep track of all the details in the CPU's operation. For example, if a user is permitted to specify the name of a newly created file, the operating system might provide an API function named MakeFile for creating files. When writing the program, the programmer would insert the instruction MakeFile [1, %Name, 2]. This instruction tells the operating system to create a file that allows random access to its data (1), has the name entered by the user (%Name), and has a size that varies with the data stored in the file (2). The operating system sends a query to the disk drive to get the location of the first available free storage location and creates an entry in the file system for the file's metadata, i.e., the beginning and ending locations of the file, the name of the file, the file type, whether the file has been archived, which users have permission to look at or modify the file, and the date and time of the file's creation, etc. The operating system writes the file identifier at the beginning of the file, sets up the permissions, and includes other information that ties the file to the application. In all of this information, the queries to the disk drive and the addresses of the beginning and ending points of the file are in formats heavily dependent on the manufacturer and model of the disk drive, but because of the API for disk storage, the programmer need not know the instruction codes, data types, and response codes for every possible hard disk and tape drive.
The operating system, connected to drivers for the various hardware subsystems, manages the changing details of the hardware; the programmer simply writes code for the API and trusts the operating system to do the rest.
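The idea of creating a file through an API rather than through device-specific commands can be sketched in Python. The function make_file below is a hypothetical stand-in for the MakeFile call described above, not an actual operating system interface; it relies only on the standard open and os.stat calls, which in turn delegate the hardware details to the operating system and its drivers.

```python
import os
import tempfile


def make_file(directory: str, name: str) -> dict:
    """Create an empty file and return some of the metadata recorded for it.

    The caller supplies only a location and a name; where the blocks land
    on the device is left entirely to the operating system.
    """
    path = os.path.join(directory, name)
    with open(path, "xb"):  # "x" fails if the file already exists
        pass
    st = os.stat(path)
    return {"path": path, "size": st.st_size, "modified": st.st_mtime}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        info = make_file(d, "new.dat")  # hypothetical file name
        print(info["path"], info["size"])
```

The same call works unchanged whether the directory resides on a hard disk, a solid-state drive, or a network share, which is the portability benefit the API is meant to provide.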
Just as the API provides a consistent way for applications to use the resources of the computer system, a user interface (UI) brings structure to the interaction between a user and the computer. In the last decade, almost all development in user interfaces has been in the area of the graphical user interface (GUI). There are other user interfaces, some graphical and some not, for other operating systems. Unix, for example, has user interfaces called shells that are more flexible and powerful than the standard operating system text-based interface. Programs such as the Korn Shell and the C Shell are text-based interfaces that add important utilities but their main purpose is to make it easier for the user to manipulate the functions of the operating system. There are also graphical user interfaces, such as X-Windows and Gnome, that make Unix and Linux more like Windows and Macintosh computers from the user's point of view. It's important to remember that in all of these examples, the user interface is a program or set of programs that sits as a layer above the operating system itself.
While some definitions have been presented in context herein, a tutorial in additional definitions may be helpful. An application is a software program used by an end user; examples of applications include a scheduling client program or application wherein a person may schedule employees' work days; a word processing application; a presentation application to prepare slides for a talk; a database application in which to manipulate data; a spreadsheet application, etc. A tool is a software application that enables a software developer to write additional applications. Examples of tools include: a remote-accessing tool; a database tool to access and manipulate remote relational database tables, columns and rows; a message queue tool to access and manipulate remote message queues; an import tool to select files on a remote system for importing into an ongoing software development project; a performance tool to access and configure remote performance; a tracing tool to trace execution of remote performance; a file tool to access folders and files in the file system of a remote system, etc. A component is software code that can be reused across multiple applications; in other words, a component is standard software that can be pulled off a server and incorporated into new applications using a tool by software developers. For example, a calendar component may be used in several applications such as a scheduling application, a presentation application, a database application to calculate employees' vacation and pay, etc. Thus, a software developer uses tools to pull components from a local or remote server to create applications.
Software developers found it was first convenient and then necessary to have all code generation tools under one umbrella, called an integrated development environment (IDE). Integrated development environments, as the name suggests, give the software engineer an environment wherein the appropriate tools needed for source code editing, compiling, linking, testing, debugging, and profiling are seamlessly integrated. The advantage of using an integrated development environment is that the software developer need not be concerned about the tool interfaces when moving from one phase of code development to the other. Typically the integrated development environment tracks the phase of code generation and invokes the necessary tool. Currently, Eclipse, one integrated development environment, provides edit support for local files that exist in the user's workspace. For programmers, however, who develop programs for remote servers, there is a need to be able to access files that may not exist locally on their machine. In a client/server environment, software developers need to edit source code in real-time wherein that code very often resides on remote machines. In other words, software developers want to open, edit, and save remote files as if those files existed on their local machine, without having to manually transfer files between their workstation and the server. For computer software programmers using an IDE such as Eclipse and for persons writing IDE tools for application development on different operating systems, there is a need to remotely access, query, and/or manipulate resources on nonhierarchical operating systems using an IDE on a hierarchical operating system for development tasks.
Thus, there is a need within the software development industry to access and transfer resources on remote servers and other computers across a network. The remote servers and other computers, moreover, may have different operating systems and file structures. For instance, a client upon which an IDE, such as Eclipse, is installed has a hierarchical file system using ASCII to represent and store characters, whereas the server or large mainframe for which applications are being written may have a nonhierarchical file system using, for instance, EBCDIC to represent and store characters. The differences between the two file systems and binary code representations force users and tool writers to maintain source code on the server or the large mainframe system.
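The difference between the two character representations can be seen in a short Python sketch. It assumes that Python's built-in cp037 codec, one of several EBCDIC code pages, stands in for the server's encoding; an actual mainframe may use a different code page. The helper name transcode is hypothetical.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from one character encoding to another."""
    return data.decode(src).encode(dst)


# The same letter has different byte values in the two encodings.
ascii_bytes = "A".encode("ascii")                       # 0x41 in ASCII
ebcdic_bytes = transcode(ascii_bytes, "ascii", "cp037")  # 0xC1 in cp037

if __name__ == "__main__":
    print(ascii_bytes.hex(), ebcdic_bytes.hex())
    source = b"HELLO"
    round_trip = transcode(transcode(source, "ascii", "cp037"),
                           "cp037", "ascii")
    print(round_trip == source)
```

Because the byte values differ, a source file copied between the client and the server without transcoding would be unreadable on the receiving side.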
Software configuration management encompasses the techniques of initiating, evaluating, and controlling change to software products, during and after the development process. Thus, software configuration management is an integral part of the software development process across all phases of the software's life cycle. A partial list of software configuration management chores includes identification, change reporting and evaluation, change execution, tool evaluation and use, version control, and management principles relating to configuration control. Software configuration management (SCM) repositories can be used to store, version and manage the resources and projects. Currently, each software configuration management product requires a specific adaptor.
There is thus a need in the industry to allow any hierarchical based tool to accurately accommodate nonhierarchical operating system source files. There is a further need in the industry to have a single software configuration manager which manages code for both hierarchical and nonhierarchical operating systems. Thus, given an IDE, the software configuration management should provide only one repository that runs on the server operating system to manage the development of software for that server.