Issues concerning software due diligence and legal review have become increasingly important in view of recent software development trends, including prevalent usage of open source software and encryption technology in many software products. Although open source software can generally be used and shared for free, open source is not considered public domain because various licenses typically impose restrictions on the use, modification, or redistribution of the open source code (e.g., the GNU General Public License, the Berkeley Software Distribution License, etc.). Furthermore, because different open source licenses tend to impose restrictions that can vary significantly in scope, organizations that produce or otherwise develop software should take care to review and understand the various terms and conditions that may be associated with using open source software. As such, although open source software can provide various advantages (e.g., reducing the cost to develop reusable components), the use of open source should be carefully managed and documented to preserve intellectual property (IP) rights, avoid unpredictable royalty obligations, and otherwise prevent latent security vulnerabilities.
For instance, software development organizations often employ a common code base that can include hundreds or thousands of packages used in the development of various software products, with many of the packages potentially containing open source or being subject to various open source licenses. Furthermore, the packages in the code base can often further contain hundreds, thousands, or even millions of source files that may be subject to different open source licenses. Ensuring license compliance and compatibility for all of the software in a given product can be very difficult, as open source software typically originates from one or more upstream repositories or other sources that are beyond the control of the development organization. For example, the upstream repositories will usually declare various project licenses for a given open source package, whereby individual files within the package are further claimed under the declared licenses. In many cases, code developers and contributors would then be bound to the terms and conditions of the declared licenses. However, in various other cases, certain licenses may permit the addition of source code under other licenses deemed to be compatible with the declared licenses, or the licenses may permit the extraction of certain clauses or other portions under relaxed terms and conditions with regard to the declared licenses. As such, the compatibility of different open source licenses may not always be clearly discernible, as a given package may often include several different software components (e.g., libraries, main application, test suites, etc.), yet different open source licenses may vary in whether they permit or prohibit the use of different or incompatible licenses for the various components.
Thus, one important concern associated with software due diligence review includes the need to ensure compliance and compatibility within individual components, along dependency chains, and among various components of a larger software product. In particular, a large number of known open source licenses exist, on the order of several hundred, potentially creating many different variations and combinations of licenses that may be permitted and/or prohibited for a given software component. While some software development organizations use databases or package management systems to document the use of open source (e.g., Red Hat Package Manager (RPM)), the space available in metadata fields or RPM headers for describing software licenses is generally limited to a few words (i.e., typically one line of text). As such, existing systems fall short in providing a mechanism for representing and tracking known licenses, license versions, license compatibility, and other compliance issues according to a well-defined and condensed syntax.
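To make the idea of a condensed, one-line license syntax concrete, the following minimal sketch parses a short license string into structured records. The grammar is an assumption made only for illustration (tokens of the form `Name` or `Name-Version`, joined by the lowercase keywords `and` and `or`), and the names `License`, `parse_license`, and `parse_expression` are hypothetical; none of this is taken from RPM or from any existing license-tracking system.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class License:
    """One license token: a short name plus an optional version string."""
    name: str
    version: Optional[str]


# Hypothetical token grammar: letters (and inner hyphens) for the name,
# optionally followed by "-<dotted version>" and a trailing "+".
_TOKEN = re.compile(
    r"^(?P<name>[A-Za-z][A-Za-z-]*?)(?:-(?P<version>\d+(?:\.\d+)*\+?))?$"
)


def parse_license(token: str) -> License:
    """Parse a single token such as 'GPL-2.0+' or 'MIT'."""
    match = _TOKEN.match(token.strip())
    if match is None:
        raise ValueError(f"unrecognized license token: {token!r}")
    return License(match.group("name"), match.group("version"))


def parse_expression(text: str) -> List[List[License]]:
    """Parse a one-line expression into alternatives ('or') of
    license sets that apply together ('and')."""
    return [
        [parse_license(token) for token in alternative.split(" and ")]
        for alternative in text.split(" or ")
    ]


if __name__ == "__main__":
    for group in parse_expression("GPL-2.0+ and MIT or Apache-2.0"):
        print(group)
```

Under these assumptions, a single line of text records each license, its version, and whether the licenses apply jointly or as alternatives, which is the kind of information a free-form metadata field cannot represent reliably.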
Other difficulties may also be encountered when performing software due diligence review, including the inspection of binary objects and the scalability of solutions used to manage large build systems and code bases. In particular, the build system and code base used within a given software development organization often include binary files that have been compiled or otherwise constructed from source code. However, binaries tend to be more challenging to inspect for license compliance than the underlying source code because the source code typically includes text that can be inspected one line at a time, whereas binaries tend to be object files that often cannot be read using simple unpack and inspect processes. Regarding scalability issues, moreover, the utility of sequential pattern matching techniques tends to decrease substantially as the number of software components in a code base increases. Although parallel pattern matching techniques may address the issue of scalability to a degree, an important consideration in the use of such techniques is the need to ensure that no false negatives occur, while also ensuring that false positives do not unduly burden the review process. That is, false negatives may be considered unacceptable because they can lead to latent compliance defects, while excessive false positives may introduce unnecessary and costly delays in the review process.
Yet another concern relating to software due diligence arises in relation to software that uses undocumented and/or improperly documented open source. For example, open source that has been used in a given software component often lacks proper documentation for various reasons, such as developers not fully considering or understanding the legal issues that are involved with open source. In addition, open source components often carry prominent copyright or license information and liberally point to portions of borrowed code, but closed source components do not. As such, issues may arise when components appear to be closed source but actually contain open source components that lack proper documentation (e.g., a developer may use open source in a project but overlook the need to include proper license documentation, or insubstantial changes may be made to a few lines of the code in an effort to avoid license restrictions). Thus, another important aspect of software due diligence includes ensuring that open source is properly identified, documented, tracked, and reviewed for licensing compatibility and compliance.