The present invention provides the ability to track, audit, verify and manage, or control, application level data produced by various types of software applications in common use today. While the primary focus of this application (and the most common usage) is centered on verification and tracking of data directly related to spreadsheet applications, it should be apparent to one of ordinary skill in the art that this same approach may be utilized or extended to any application data input/output process. This technique is generally described to be useful where the users, management or administrators of the software application or business process desire to provide more control over the data creation and output utilization process.
Spreadsheets are a type of software application (or application process, engine or virtual machine) that uses a grid-like metaphor (columns and rows) to organize, arrange and process data and algorithms (for example, formulae).
Spreadsheets make it easy to manipulate and display information by allowing people to create formulas which depend on and operate on the data to generate new or resulting data values.
For example, a graphical spreadsheet program may have a visual toolbar area on the screen which contains a particular user interface icon that symbolizes inserting a formula into a cell (the intersection of a column and row) which will sum up numbers that are contained within a given or specified range of cells within a specified spreadsheet.
A modern spreadsheet file consists of multiple worksheets that make up one workbook, with each file being one workbook. A cell on one sheet is capable of referencing cells on other, different sheets, whether within the same workbook or even, in some cases, from within different workbooks.
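The range-summing formula and the cross-sheet reference described above can be illustrated with a minimal sketch. The dictionary-based workbook, the sheet names and the helper functions are assumptions made for illustration only and do not reflect the internals of any particular spreadsheet application.

```python
# Hypothetical model: a workbook maps sheet names to {cell address: value}.
workbook = {
    "Sales": {"A1": 10, "A2": 20, "A3": 30},
    "Summary": {},
}

def cells_in_column_range(column, first_row, last_row):
    """Return addresses such as A1, A2, A3 for a single-column range."""
    return [f"{column}{row}" for row in range(first_row, last_row + 1)]

def sum_range(book, sheet, column, first_row, last_row):
    """Sum the values of a single-column range on one sheet (empty cells count as 0)."""
    addresses = cells_in_column_range(column, first_row, last_row)
    return sum(book[sheet].get(address, 0) for address in addresses)

# A cell on "Summary" receives the total of a range that lives on "Sales",
# which mirrors a formula on one sheet referencing cells on another sheet.
workbook["Summary"]["B1"] = sum_range(workbook, "Sales", "A", 1, 3)
print(workbook["Summary"]["B1"])  # 60
```

In this sketch the total is computed once and stored; an actual spreadsheet engine would keep the formula and re-calculate it whenever a referenced cell changes.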
A spreadsheet application is typically one of four or five main component applications of an office productivity suite (such as OpenOffice or Microsoft Office). Such suites group a spreadsheet application (such as OpenOffice Calc or Microsoft Office Excel) with a word processor, a presentation program, and a database management system (and, optionally, various other related applications) into a solution stack that aids the productivity of most office work; from administrative tasks, to sales, manufacturing, warehousing, engineering, R&D, accounting and general management among other business work or job roles.
However, usage of spreadsheets is not limited to business environments, as general users or ‘home’ users also make extensive use of spreadsheets to manage and track personal data such as budgets, lists of names and addresses as well as a variety of other models for financial, personal or general productivity tasks that they wish to manage and automate by using spreadsheets.
In short, given their ubiquitous and universally accepted utility in both the business and the personal (home) domains, spreadsheets are generally recognized as being the most powerful, flexible and capable type of software application available to all software users. Historically, spreadsheets developed as computerized simulations of paper accounting worksheets. They immediately boosted productivity because of their ability to accurately and reliably re-calculate an entire worksheet automatically after a change to a single cell is made. (This was a manual and error-prone process in the days of paper ledgers.) Spreadsheets have now replaced paper-based systems throughout the business world and have thus created a huge competitive disadvantage for any organization or user unable to work with spreadsheet applications. Moreover, although spreadsheets were first developed for accounting or bookkeeping tasks, they are now used extensively in any context where tabular lists are built, sorted, manipulated and shared, and in any type of business such as, but not limited to, the financial services industry, manufacturing, retail and healthcare.
The extensive user adoption of the spreadsheet and spreadsheet application has been found in academia, science, research & development, within marketing and sales organizations as well as most other general business tasks including those undertaken by small and medium business users, home users, groups and individuals who wish to track, organize and manage their own data.
The first electronic spreadsheet on a PC or microcomputer was called ‘VisiCalc’. By 1984 Lotus 1-2-3, which was based on and built for the IBM PC standard, was the leading spreadsheet application (when DOS was the dominant operating system). By the early 1990s, however, Microsoft's Excel had overtaken Lotus 1-2-3 and earned the largest market share on the Windows and Macintosh platforms, a market dominance which continues to the present day.
One of the unique ways in which Microsoft overtook and surpassed Lotus as the dominant spreadsheet product was its focus on data file migration and macro compatibility when moving users from Lotus to Excel. Allowing Microsoft Excel to read and write (or save) Lotus 1-2-3 files allowed Lotus users to seamlessly move to or from Excel. Further, because of Excel's support of the Lotus 1-2-3 spreadsheet files, the Microsoft Excel application was able to ensure interoperability within and between groups of users who shared common applications or financial and other business models built within these spreadsheets. Additionally, by supporting the ability to run or interpret Lotus macros within Excel, Microsoft was able to support the complex financial and business models built within Lotus 1-2-3 spreadsheets and eliminate most user objections to moving from Lotus 1-2-3 to a different spreadsheet application such as Microsoft Excel.
These same types of problems around interoperability and ease of use are now being faced by web based application vendors as they attempt to gain market share from the established PC based office applications, and specifically in respect to spreadsheet applications. This is most notable with the transition of traditional PC based Microsoft Excel application users to the use of newer cloud computing based application models.
Since the advent of web services and web applications, office productivity suites now exist in web-application form, with Google Docs and Microsoft Office 365 being the biggest competitors in this space. Thus, Google spreadsheets (part of the Google Docs productivity suite) now share the spreadsheet market with Excel. However, as ‘cloud’ based computing gradually replaces desktop centric computing, the unique benefits of ‘on PC’ or local (PC) based processing of application data and functions remain very much in demand. The ability to rapidly process and manipulate large volumes of data and functions is both needed and heavily utilized by spreadsheet applications. As such, desktop (PC) based spreadsheet applications provide a reliable and trusted solution to the processing needs of all spreadsheet users, a solution that cannot easily be replicated and/or exploited via cloud based solutions. Given the demand for both web based cloud office productivity applications and traditional PC based office productivity applications, spreadsheet application users who want the best of both the cloud and PC (desktop) world are now waiting for a new type of solution.
This new type of solution must provide seamless interoperability between cloud based spreadsheet applications and PC based spreadsheet applications as well as the ability to support the complex modeling and processing needs of spreadsheet users. This new type of solution must therefore address the problem of user and data file migration between the desktop and cloud based spreadsheets and thus the ability to support all the data and/or content stored in those files as well as all the complexity found in the models built within the files.
In addition, given the emergent nature of the cloud computing web based application model, there now exists a need to better understand how to effectively meet the needs of the traditional, desktop based, spreadsheet user in this new cloud based environment. The design and delivery of both cloud and desktop based spreadsheet applications would therefore benefit from a deeper understanding or ‘instrumentation’ of desktop spreadsheet user behavior and how that user behavior would change when users ‘interoperate’ with a web cloud based spreadsheet application.
Beyond their mathematical prowess, spreadsheets allow information (data values, data records or rows) to be sorted, organized, filtered and reported on in a manner similar to database software. While spreadsheets share many principles and traits of databases, they are not the same thing. A spreadsheet is essentially just one table, whereas a database is a collection of many tables with machine-readable semantic relationships between them. Spreadsheets are often imported into databases to become tables within them. While it is true that a workbook that contains two or more worksheets is indeed a file containing multiple tables that can interact with each other, it lacks the relational structure of a database and may also lack the enormous storage capacity of an enterprise database application. Thus spreadsheets may store data but are more adept at utilizing or leveraging that data by storing it into cells along with associated algorithms such as, but not limited to, formulas and linked lookup tables.
PC based spreadsheets share many application features or functional commonalities with other PC software based products such as word processing, databases and presentation applications. Commonly, these applications, including spreadsheets, store their internal application data or data structures in computer file system structures on disk drives, whether on the local PC drive(s) or the local area network (LAN) or the Internet/web-services/cloud services type infrastructure platforms. Regardless of where the file is stored, the basic concept is that the software application has a structured or well-defined proprietary file format that it uses to store on disk the internal data structures, values and elements that it needs to create the software product application ‘file’ in memory as the user works with the application. When the application starts, it reads the file structure to load relevant and necessary data into working memory and then allows the user to interact with the functionality it provides (document editing, creation, calculation, etc.) and, as requested or configured, will store the temporary/intermediate results of this user interaction and/or the final/permanent results to the file on a file or disk structure. Thus, spreadsheets can have a proprietary file format that stores the grid data and/or content. Moreover, this static ‘document centric’ or file and folder model of working with spreadsheet files has been the dominant way users interacted with their spreadsheets.
Additionally, most PC apps can read and write alternative file formats, including formats defined by their competitors as well as industry standard or common data interchange formats such as CSV or XML, among others. These alternative formats may or may not lose application ‘fidelity’ (that is, the unique features, formatting or metadata knowledge that the creating application has about the file) when the file is expressed or written to disk in a competitor's format or in an industry standard format. Any loss of ‘fidelity’ is typically acceptable, as users are only seeking to move the ‘data’ values and not the formatting from one app to another. The concept of data exchange and interchange is well known in the PC software industry, and there are many standards bodies and industry working groups, among many other initiatives, that frequently seek to define a standard file format and to encourage data exchange or interoperability between software applications.
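The loss of ‘fidelity’ described above can be illustrated with a short sketch using Python's standard csv module. The in-memory sheet layout, in which each cell carries an optional formula and a computed value, is a hypothetical structure chosen for illustration and is not the native format of any spreadsheet product.

```python
import csv
import io

# Hypothetical in-memory sheet: each cell keeps a formula (or None) and a value.
sheet = [
    [{"formula": None, "value": 2}, {"formula": None, "value": 3}],
    [{"formula": "=A1+B1", "value": 5}, {"formula": None, "value": None}],
]

def export_values_to_csv(rows):
    """Write only the computed values; formulas have no place in plain CSV."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow(["" if cell["value"] is None else cell["value"] for cell in row])
    return buffer.getvalue()

csv_text = export_values_to_csv(sheet)

# Reading the file back yields text values only: the formula behind the
# value 5 is gone, and the numbers have become strings.
round_trip = list(csv.reader(io.StringIO(csv_text)))
print(round_trip)  # [['2', '3'], ['5', '']]
```

The data values survive the exchange, which is usually what the user wants, but the receiving application can no longer tell that one cell was derived from the others.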
A unique source of well recognized challenges in the management and control of the use of spreadsheets is created by the powerful native or built in functionality available to spreadsheet users providing them with the ability to literally build applications within their spreadsheets. These applications are able to perform discrete and potentially complex business or personal processes that regularly link data values and formulas across sheets or files, achieve complex data retrieval and sorting capabilities and often include macro level programs built from functionality available in modern PC based spreadsheet applications. This ability for users and businesses of all types to build and rely upon these applications (sometimes also referred to as ‘models’) constructed using the functionality available within a spreadsheet application has generated a dependency and a heavy usage of spreadsheets in the daily activities and operations of spreadsheet users and businesses. Moreover the ‘linking’ of cells and content between discrete spreadsheet files often found, for example, when a user is attempting to consolidate data held in one or more related spreadsheet files has introduced additional problems for users to address. Examples of such problems include subtle changes to data constants or calculation errors which may be introduced by users who are working with multiple “copies” of the same spreadsheet file; different “versions” or flavors of the same spreadsheet, including older versions of the spreadsheet being mistaken for the “current” or latest version; as well as the inability of administrators or professional auditors to track and consistently know who made the latest changes to a spreadsheet and how those changes have rippled throughout any linked spreadsheets or groups or sets of spreadsheets. 
In this respect the difficulty in accurately determining who made the last change and what was changed within any version or snapshot of a spreadsheet is a critical problem to many spreadsheet users, in particular for finance and accounting users as well as business and financial process auditors, all of whom depend on knowing what has been changed within a spreadsheet and all of its related or linked spreadsheets before they can safely make business decisions and/or “certify” these specific or related spreadsheets. Without being able to know who did what, when, where and possibly why, these professionals are taking on great risk in providing letters or reports stating the accuracy of their auditing and tracking work, specifically for any financial reports that are generated by linking two or more spreadsheets together. This lack of transparency or data change visibility is one of the various problems solved by the current invention.
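The "who did what, when and where" question can be made concrete with a minimal sketch of a cell-level change record. This is an illustrative assumption about what such a record might contain, not a description of the claimed mechanism; the CellChange class and the diff_snapshots function are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class CellChange:
    user: str            # who made the change
    timestamp: datetime  # when it was recorded
    sheet: str           # where: the sheet
    cell: str            # where: the cell address
    old: object          # content before the change (None if the cell was empty)
    new: object          # content after the change (None if the cell was cleared)

def diff_snapshots(before, after, user, sheet, when=None):
    """Compare two {cell: content} snapshots and list every cell that differs."""
    when = when or datetime.now(timezone.utc)
    changes = []
    for cell in sorted(set(before) | set(after)):
        old, new = before.get(cell), after.get(cell)
        if old != new:
            changes.append(CellChange(user, when, sheet, cell, old, new))
    return changes

before = {"A1": 100, "B1": "=A1*2"}
after = {"A1": 100, "B1": "=A1*3", "C1": 7}
for change in diff_snapshots(before, after, user="alice", sheet="Budget"):
    print(change.cell, change.old, "->", change.new)
```

A log of such records, kept for every saved snapshot, would let an auditor see that the formula in B1 was edited and that C1 was newly added, rather than only that the file as a whole was modified.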
Another challenge currently facing spreadsheet application vendors is the growth trend of users taking advantage of the development tool-like capabilities within spreadsheets to build complex mathematical, financial or business models, including programming applications hosted within the spreadsheet. This usage creates the conditions for limitless complexity when systems need to track the spreadsheet files, the changes made to each spreadsheet and the impact of those changes. The impact of these changes may range from minor formatting errors to changed, deleted or missing data or formulas, including subtle errors, whether minor or major, introduced over time into the models embedded within the sheets, and thus the decisions made by the users based on those models may be impacted greatly. This complexity has been driven by both the extensive linking of sheets together via cell references and the data value exchange included in the programmatic development of these models within the spreadsheet. Further contributing to the growth of complexity is the decentralized and ubiquitous usage and distribution of spreadsheet files that are being shared via email, via USB or ‘thumb-drives’ and even cloud based collaboration services in various forms. As users find more ways to share their spreadsheets, more copies of the file are made outside of any controlled system or process, and thus the individual PC based copies of Excel, for example, cannot track who is changing what within spreadsheet cells, nor can they synchronize those changes across all users and all versions (or copies) of the spreadsheet file.
This lack of transparency is potentially compounded yet again by the inability to track changes and the impact of those changes when dealing with “N” tiers or levels of ‘nested’ spreadsheets (meaning two or more discrete levels that make up a typical parent/child hierarchy model) which may be linked together across any level by, for example, cell level formulas, cell references to other data sets and/or sources or even shared data values. Collaboration by various members of a team who continuously create slightly and subtly modified versions of these nested spreadsheets, utilizing the exact same file name and thus overwriting prior changes made by other members of the same team, further complicates the challenge of spreadsheet management and control.
Thus auditing the sophisticated and complex business applications which have been developed within spreadsheets using modern spreadsheet applications creates a new set of challenges for those who want to know who changed what, when and where within a spreadsheet. Moreover, the recent introduction of web based collaboration and file or data sharing technologies via cloud services has not offered any new solutions to the problem of managing ‘N’ tiers of related spreadsheets across multiple users, but rather has introduced additional opportunities for error, version mismanagement and loss of data integrity.
The familiar, powerful and easily scalable spreadsheet usage supported by established business processes and models as well as web based collaboration systems, combined with the weak capabilities of file based spreadsheet applications to record and track changes to versions across copies stored outside of the control of any centralized process, makes the job of auditing, reporting and certifying any spreadsheet based business models complex and error prone given the current state of technology in use. Further, the tendency of spreadsheet applications to leverage the local PC processing resources rather than those offered by web based spreadsheet applications means that desktop or PC based spreadsheets may continue to be the predominant or preferred type of spreadsheet application product versus cloud-based spreadsheets. Similarly, web based spreadsheets may be viewed as being generally limited to use by casual, single table or limited sheet user scenarios. Thus the large and/or the complex and/or the highly linked spreadsheets will generally still depend on the highly accessible and comparatively powerful processing capabilities and functionality provided by local PC software products like Microsoft Excel.
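The way a change ripples through ‘N’ tiers of linked spreadsheets can be sketched as a traversal of a dependency graph. The file names and the link map below are hypothetical, and a breadth-first walk is only one reasonable way to enumerate the affected files.

```python
from collections import deque

# Hypothetical link map: each file lists the files whose cells reference it.
dependents = {
    "regions.xlsx": ["country.xlsx"],
    "country.xlsx": ["global.xlsx"],
    "fx_rates.xlsx": ["country.xlsx", "global.xlsx"],
    "global.xlsx": [],
}

def affected_by(changed, links):
    """Return every file reachable from the changed file, nearest tier first."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        current = queue.popleft()
        for child in links.get(current, []):
            if child not in seen:  # visit each file once, even if linked twice
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

print(affected_by("regions.xlsx", dependents))  # ['country.xlsx', 'global.xlsx']
```

A change to the lowest tier therefore reaches the top-level file even though the two are never linked directly, which is the kind of indirect impact that is hard to see when each file is inspected in isolation.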
The use of application data files is an established method of persisting application functionality between user sessions, between users and between software applications or systems. Various software products utilize the ability to read/write or store the data contained or created within the software application as a ‘file’ structure on an operating system file system or disk drive (or virtual drive). Many products utilize proprietary or internal only formats as their native file structure, so that the file format that one software application uses, such as a spreadsheet, is different from what a word processing application would utilize to store its relevant application and/or user created data.
Beyond the native application's ability to store/read/write application data, common PC desktop software applications may be connected to, or integrated with, another category of software applications to enhance or extend the ability to manage the creation, processing, searching, printing, reporting, management, etc. of end user application data files via a third party centralized database application or system. These are typically classed as ‘Enterprise type software systems’ and are generally tied into or ‘wrapped’ around both the PC software application on the desktop and the existing internal file storage system (local disks or LAN/internal storage, SANs or network attached storage, including cloud or big data storage models) supported by the desktop software or cloud application. These enterprise applications are generally known as ‘Enterprise Document Management’ (EDM) applications, and they seek to provide enhanced value such as file and content searching, storing multiple copies of files, reporting on file usage and content, and control or centralized management over both users and their data files, among other features such as enhancements to the basic software application functionality including enhanced access security.
One good example in this category is the use of ‘document management’ software by professional or office users (such as lawyers) to manage word processing documents such as letters, contracts and memos or reports. By utilizing the EDM application to centralize the storage and control of the word processing files, the business or organization can more easily manage the cost and operation of creation, editing, printing and reporting as well as security and the control of the dissemination of the content of the word processing documents.
The idea of using an application to centrally manage ‘files’, whether word processing or spreadsheets or presentation graphics or the like, is well known in the industry, as is the use of newer web-centric hosted portals or internet web service applications which have been developed to extend this model of internal LAN management of documents to the newer internet models. One of the most common examples in this new internet web-centric EDM model is Microsoft's SharePoint products, which utilize a webserver (either on the local area network or a web service model across the Internet) to manage and control access to Microsoft Office documents or static files including word processing, spreadsheets or presentations, among other file types.
The central theme in these types of system architectures is that the application is storing the ‘file’ as it exits from the creating application such as Excel and then SharePoint, or another similar application, will add some value added services around the existing file storage model, such as file access security, reporting and some level of control over a simple file based versioning system (i.e. there may be a version 1, version 2, version 3 up to version ‘X’ of the same file or similar but slightly modified version of a file stored as ‘X’ separate files in the file system). In this respect, the EDM application is typically managing access to the underlying file system via metadata stored in its internal database as users are required to open and close files using the EDM's file management user interface (UI) and not the native application UI for saving or opening files. Additionally, when EDM applications are integrated into the native application's file management UI, they typically redirect the standard or normal file open or save application programming interface (API) to read or write the proprietary format to the file storage system that is managed and controlled by the EDM.
It is important to also note that when reading or writing the proprietary format to the file storage system that is managed and controlled by the EDM, the EDM application will typically do this without altering the content of the native office application file structure. The EDM application will instead capture metadata about the file (which user is editing/changing the file, the date/time of the change, the file name, etc., among other properties, attributes or aspects tracked or managed by EDM systems) and store that metadata into the EDM application database. The EDM typically leaves the office document file untouched and stores it outside of the EDM metadata database system. Thus the EDM app only knows what the creating application tells it about the changes made to the file or what it can see or find by inspecting the file format directly using its own internal search or storage methods.
This approach to managing documents or files by Enterprise Document Management systems and, as a function of that approach, to understanding what changes have been made to the stored document or file has, however, only a limited knowledge of, for example, what changes may have occurred within any single cell of a spreadsheet. This limitation could include, but is not limited to, the lack of a complete understanding or knowledge of how the interactions between and among unrelated or related cells have changed. Additionally, EDM applications may have almost no capability to see, track or inspect (and therefore almost no ability to record the relevant knowledge of) how related or linked sets of separate files have been modified so that they interact differently, whether by intentional editing, unintentional side-effects or unintended editing, among other visibility limitations or weaknesses.
It is the weakness of the single file inspection method, as well as the lack of knowledge available by searching the metadata, or the lack of knowledge available to the application by using metadata semantics to understand the relationship between cells, sheets and workbooks, that limits the EDM type application from accurately tracking changes across, between and among multiple files, particularly spreadsheets.
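The limits of file-level metadata can be illustrated with a brief sketch. The version_record function and its fields are hypothetical and do not reflect the schema of any actual EDM product; the content hash from Python's standard hashlib module simply stands in for whatever file-level change detection such a system might use.

```python
import hashlib

def version_record(file_name, user, content):
    """Capture file-level metadata only: name, user and a hash of the raw bytes."""
    return {
        "file": file_name,
        "user": user,
        "sha256": hashlib.sha256(content).hexdigest(),
    }

# Two saved versions of the same (hypothetical) file; one value differs.
v1 = version_record("budget.csv", "alice", b"rent,1000\nfood,300\n")
v2 = version_record("budget.csv", "bob", b"rent,1000\nfood,350\n")

# The metadata shows that the file changed and who saved each version ...
file_changed = v1["sha256"] != v2["sha256"]
print(file_changed)  # True
# ... but nothing in either record identifies which cell changed, or how.
```

Answering the cell-level question requires opening both versions and comparing their contents, which is exactly the knowledge that is absent from the metadata alone.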
This lack of sophistication, depth of tracking and overall lack of support for ‘N’ tier hierarchical linking between spreadsheet type documents, for example, introduces the errors, intentional or not, which create the critical problem of management uncertainty and doubt that is one of the focus areas of the present invention. In addition to this lack of detailed tracking of changes within standalone as well as hierarchically linked spreadsheets, the EDM architecture and model does not address the management of complex applications that can be constructed within the spreadsheet, as the EDM application may not have knowledge of the interaction between language statements or functions within the spreadsheet built application, or of the runtime behavior or interaction between spreadsheet and non-spreadsheet files, including, but not limited to, the dependencies between files.
While spreadsheets (and the business applications performed by the spreadsheet and/or groups of related (linked) spreadsheets) may contain one or more sophisticated algorithms built using formulas, data models, macros and even embedded programming languages, they have traditionally not been considered as full blown modern programming ‘languages’ or programming development environments. Yet, given how spreadsheets can be used to build complex and sophisticated solutions to the many different types of critical business processes they are regularly used to automate, they should be viewed in that manner. Moreover, spreadsheet applications should be viewed as possessing the same general capabilities as a programming language and/or a programming development environment when the spreadsheet application's native capabilities are combined with the spreadsheet user's ability to build and successfully execute business process automation. This is particularly apparent when the building and running of spreadsheet applications is regularly reliant upon such fundamental programming models/concepts as the ability to leverage native functionality to ‘call’ and return data or values from one or more sets of spreadsheet cells located within one or more separate spreadsheets.
Further, other complex document and/or business process automation models built using other office productivity applications should be viewed in the same manner.
While spreadsheets may lack many of the more ‘object-oriented’ (OO) functions of modern programming languages such as C++, Java, PHP, Ruby or Microsoft's .NET based languages, they do have incredible power to treat each cell as an entire function in and of itself or as a sub-program. Thus entire worksheets, rows of cells or ranges of cells may be viewed to be complete applications rivaling traditional executable applications. Additionally, while OO capabilities are typically expressed in source code language features, the spreadsheet may expose native functions, macros or even languages such as Visual Basic for Applications (VBA) as embedded code or values within a cell. These functional expressions within one or more cells may be linked together at runtime to create full blown programs running within the context of the spreadsheet application object. Finally, the spreadsheet or other business applications may expose parts of their functionality via OO constructs such as classes or functions that are callable within a single spreadsheet cell. So while OO languages are stored as ASCII text files, when they are ‘compiled’ or executed they act as standalone objects, much like each cell within a spreadsheet may contain language statements and/or expose their functionality as an object callable by other cells or applications. Thus, while C++ may be used to create Microsoft's Excel, a single spreadsheet may be developed which allows Excel to expose the embedded spreadsheet code as a pseudo type function, object like class or even a standalone application, and as a consequence one spreadsheet application may call on or link to or embed objects from multiple spreadsheet files or other software applications outside of the spreadsheet engine.
Thus, to the business user, the spreadsheet may be viewed as being “the application” and the spreadsheet's UI may be the only user interface with which they interact with the functionality of the business application that they use to perform a job or task.
Thus, in some ways, the Excel ‘runtime’ may be viewed to perform a similar function to that of the Java runtime machine in hosting and running a Java source file or Java application. Therefore, from the point of view of the spreadsheet application builder, users and administrators among others, applying existing software development tools and techniques would seem appropriate to improve their usage and management of spreadsheets. Yet to date, these traditional software tools, methods or systems have not been applied to spreadsheet files or the management of their embedded content or applications.
What distinguishes OO languages from traditional ‘linear’ or modular languages (i.e. non-OO languages) is the complexity of interaction and inter-relation, linking and nesting of object levels between and among the source code language, its native constructs and operators as well as the operational characteristics of the compiler and the operational or ‘runtime’ processes which host or run the software built with OO paradigms.
Object-oriented programming (OOP) is a programming paradigm using “objects”—data structures consisting of data fields and methods together with their interactions—to design applications and computer programs. Programming techniques may include features such as data abstraction, encapsulation, messaging, modularity, polymorphism, and inheritance. Many modern programming languages now support OOP. Applications may be built using various macros or programming languages such as the Visual Basic language embedded within Microsoft's Excel spreadsheet application, in order to give their users access to the powerful capabilities provided by these development tool options.
While simple, non-OOP programs may be perceived to be one “long” list of statements (or commands), OO applications are built around smaller chunks of code which encapsulate both data and the code to operate on the data, much like modern spreadsheets may do within their cell and matrix format. Traditional non-OO languages group more complex programs into smaller sections of language statements or functions or subroutines, each of which might perform a particular task. With linear designs of this sort, it is common for some of the program's data to be ‘global’, i.e. accessible from any part of the program. As programs grow in size and complexity, allowing any function to modify any piece of data means that errors or bugs can have wide-reaching effects. In contrast, the object-oriented approach encourages the programmer to place data where it is not directly accessible by the rest of the program. Instead, the data is accessed by calling specially written functions, commonly called methods, which are either bundled in with the data or inherited from “class objects.” These objects act as the intermediaries for retrieving or modifying the data they control. The programming construct that combines data with a set of methods for accessing and managing those data is called an object. The practice of using subroutines to examine or modify certain kinds of data, however, was also quite commonly used in non-OOP modular programming, well before the widespread use of object-oriented programming.
Regardless of language properties, constructs or architecture, programs written in both OO and non-OO languages are created and generally stored as simple text files that follow the syntax and rules of the particular programming language.
In addition to the features and attributes of the programming language used to build a software application, a traditional source code language file may include other types of content. One example of this content is the set of commands or directives used to include or link one language file to another, or to some other type of file required to compile or build the software application. Another type of content that is critical in managing software complexity is the use of naming conventions for variables, functions and constants, among other values. These naming conventions allow other programmers to understand the original author's intent, what behavior occurs in the program, or what attributes a variable or value should contain. Finally, a critical content type that exists in source code language files is the programmer comment, which documents the overall design or intent of the file. Comments are ignored by the compiler or runtime engine as non-functional statements and thus do not impact runtime behavior. Although comments are ignored, subtle errors may be introduced when a comment delimiter is removed by accident. In the C language, for example, a comment opens with /* and closes with */; if the closing delimiter is deleted, the compiler sees the comment as extending all the way down to the next closing delimiter, silently disabling any code in between. Various software tools, such as Computer Aided Design (CAD) applications, source code profilers and editors, pre-processors and the like, have been developed over the years to eliminate errors and to improve accuracy and coordination among the programmers who work on a shared set of source code files to collaboratively create a software application.
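The comment delimiter problem described above may be illustrated with a minimal sketch in the Python language. The function strip_c_comments is a hypothetical, deliberately simplified stand-in for the comment handling of a C compiler; it assumes the source contains no string literals or // comments.

```python
def strip_c_comments(source):
    """Remove /* ... */ comments the way a C compiler would skip them.

    Simplified sketch: string literals and // comments are not handled.
    """
    kept = []
    position = 0
    while position < len(source):
        start = source.find("/*", position)
        if start == -1:
            kept.append(source[position:])  # no more comments
            break
        kept.append(source[position:start])  # code before the comment
        end = source.find("*/", start + 2)
        if end == -1:
            break  # unterminated comment swallows the rest of the file
        position = end + 2
    return "".join(kept)


# The same two statements, with and without the first closing delimiter.
intact = "/* set a */\na = 1;\n/* set b */\nb = 2;\n"
damaged = "/* set a \na = 1;\n/* set b */\nb = 2;\n"
```

Applied to the text named intact, the sketch keeps both assignments. Applied to the text named damaged, the first comment runs on to the closing delimiter of the second comment, so the statement a = 1; is discarded without any error being reported.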
One example of a traditional software development tool used in the management of programming language source code text files is the programmer's editor or Integrated Development Environment (IDE), a tool that is used to host or edit the source code language file. In this respect it may be viewed as similar to a spreadsheet application, which acts as the editor for the code held within its storage matrix.
Another common application programming tool is one used to control the repository of source code files as well as the content inside each source code file. This software category is known as Source Code Management (SCM) systems. Given the complex nature of software programming and the extensive use of individual source code text files to create complex applications, SCM systems were developed to track which version of each source code file was used to create or compile a given version of the application. Versioning, change management and history tracking systems are well known in the software development industry as useful tools for managing “source code” text files. While these files are simple expressions of a particular computer language, the interactions among files, through the specific functions, constants or features contained within other source code files, make it necessary to know which version of a file was used to generate a specific software executable or piece of binary code.
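One possible way to record which version of each file went into a particular build, offered only as a sketch and not as a description of any particular SCM product, is to store a content digest for every source file at build time. The function names build_manifest and files_changed_since are hypothetical.

```python
import hashlib


def build_manifest(sources):
    """Map each file name to the SHA-256 digest of its content.

    `sources` is assumed to be a dict of file name -> file content as bytes.
    """
    return {
        name: hashlib.sha256(content).hexdigest()
        for name, content in sorted(sources.items())
    }


def files_changed_since(manifest, sources):
    """List the files whose content differs from the recorded manifest."""
    current = build_manifest(sources)
    all_names = current.keys() | manifest.keys()
    return sorted(
        name for name in all_names if current.get(name) != manifest.get(name)
    )
```

A manifest stored alongside an executable would, under this assumption, allow a later comparison to show exactly which files have changed since that executable was generated, including files that were added or removed.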
Typically, while programmers use “editors” to manage the source text files, these source files are read and processed by a “compiler” to generate the application binary code which is run or executed on a computer to perform the work of the software application.
A single software application is typically created from many source code files (tens, hundreds or even tens of thousands of files). Managing all of the changes to these native source code language text files, together with the complex relationships, linking and interaction across and among them, is yet another example of a document or file management problem. While SCM systems may be viewed as similar to EDM applications, they differ in many ways. Specifically, “source code control” or “source code management” systems manage more than just the files that are created and edited by the programmer. SCM systems read, inspect and track the internal changes made to the text inside a file on a line-by-line, character-by-character, or even bit or byte level, because a small or minor change can oftentimes have an enormous impact on the compiling and operation of the software generated from the files these systems manage. Thus, what SCM systems provide to the software development market is a solution to the more complex and detailed problem of what specifically changed within a single file and what other files depend on that file as part of the “software build” process of linking and compiling software. They may therefore predict or warn of the impact that a change may have across the set of dependent files.
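The two capabilities described above, identifying what changed within a file and identifying which other files depend on it, may be sketched as follows using the difflib module of the Python standard library. The function names changed_lines and affected_files, and the representation of dependencies as a dictionary, are assumptions made for this example only.

```python
import difflib


def changed_lines(old_text, new_text):
    """Return removed lines (prefixed '-') and added lines (prefixed '+')."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged run of lines
        changes += ["-" + line for line in old_lines[i1:i2]]
        changes += ["+" + line for line in new_lines[j1:j2]]
    return changes


def affected_files(changed_file, depends_on):
    """Return every file that depends, directly or indirectly, on changed_file.

    `depends_on` is assumed to map each file name to the set of files it uses.
    """
    affected = set()
    frontier = [changed_file]
    while frontier:
        current = frontier.pop()
        for name, dependencies in depends_on.items():
            if current in dependencies and name not in affected:
                affected.add(name)
                frontier.append(name)
    return affected
```

For example, a one-character edit to a constant in a shared header file would be reported by changed_lines as one removed line and one added line, while affected_files would list every file that must be rebuilt or reviewed as a consequence.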
In addition, another common need of the programming community is to track, audit and know which version of a file was used on which day to compile a specific piece or version of a software product. Thus the SCM system must know, at a low and detailed level, the complete history of changes for any file it manages, regardless of how small or minor those changes may be, or where or how they were made. With these features, the SCM may allow a programmer to “roll back” or “roll forward” from any one version of a source code file (or of multiple sets of files) to another version, or to mix and match older versions of one file with newer versions of related files to generate a targeted or desired version of a specific software executable file.
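A minimal sketch of such a version history, assuming for simplicity that every version of every file is kept in full in memory, is given below. The class FileHistory and its methods are hypothetical names used only for illustration; practical SCM systems typically store changes far more compactly.

```python
class FileHistory:
    """Keeps every committed version of every file (hypothetical sketch)."""

    def __init__(self):
        self._versions = {}  # file name -> list of contents, oldest first

    def commit(self, name, content):
        """Store a new version and return its 1-based version number."""
        self._versions.setdefault(name, []).append(content)
        return len(self._versions[name])

    def get(self, name, version=None):
        """Return the requested version, or the latest when none is given."""
        history = self._versions[name]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise ValueError("no such version")
        return history[version - 1]

    def snapshot(self, pinned=None):
        """Assemble a set of files for a build.

        Files named in `pinned` use the given version (rolling back or
        forward); all other files use their latest version.
        """
        pinned = pinned or {}
        return {name: self.get(name, pinned.get(name)) for name in self._versions}
```

Calling snapshot with a pinned version for one file and no entry for another mixes an older version of the first file with the newest version of the second, which corresponds to the mix and match capability described above.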
The SCM's ability to track and manage all of the complexity of the native source code text is critical to the successful process of developing the correct or targeted version of a piece of software. SCM systems may also provide required centralized security, management and reporting of source code files (or pictures or other related documents including binary documents which are embedded within a software application).
Thus, SCM systems may be viewed as specialized EDM systems that lack some of the sophisticated or native office application features, such as file open/save integration or user interface (UI) features. Yet SCM systems provide far richer and more robust file versioning and tracking facilities than those offered by existing EDM applications, and they solve a different and more complex category of problem.
To date, the SCM model has not been applied to PC or cloud application data, or to the documents and files typically managed by EDM systems.