Anatomy of a Software Code Audit Process

Software has become a major component of the products made by most technology companies, and it is rarely written from scratch. Resourceful software development organizations and developers combine previously created code, commercial software, open-source software, and their own creative content to produce the desired software product or functionality. Any time a product containing software changes hands, there is a need to understand its composition, its pedigree, its ownership, and any third-party licenses or obligations (including open-source licenses) that govern its use by its new owners.

Avoiding Uncertainties in a Technology Transaction

Technology transactions that involve software may include the launch of a product into the market, mergers & acquisitions (M&A) of companies with software development operations, or technology transfer between organizations, whether they are commercial, academic or otherwise public.

Any uncertainty around either ownership of software or compliance with the licenses associated with software can:

  • deter downstream users,
  • reduce the ability to create partnerships,
  • create litigation risk to the company and downstream users,
  • increase risk and threaten closures in funding deals,
  • negatively impact M&A activities,
  • increase product time to market, and
  • affect company valuation.

So how can all of this be avoided?

A software code audit is a good way to determine what is in your software product. A software code audit should not be confused with the more commonplace software audit process: the latter generally has to do with making sure you have paid for the software applications (e.g. Microsoft Office) you are using in your organization. Software code audits identify building blocks (files or software modules or packages, or even five lines of external code) that are used in a product or exist in the code inventory of an organization.

The audit process establishes code ownership and the licensing or copyright obligations attached to any third-party content in the code portfolio; it also identifies code authorship, package versions, and export restrictions. Software code audits can also show whether the code complies with an organization's policies on the use or delivery of software, and they can pinpoint code reuse between different portfolios within or across organizations.

A common mistake is to start the code audit only in the last stage of a transaction. Starting the audit in anticipation of a transaction leaves time to correct any shortcomings it detects. You certainly do not want to delay a transaction because of uncertainties uncovered during the audit.

What Is a Software Code Audit Process?

Except for the simplest, smallest code portfolios, a manual audit of a company’s code portfolio is slow, error-prone, and expensive. Automated code scanning solutions can sift through large portfolios quickly and efficiently, detecting outside code and retrieving licensing and other attributes of external components. Although automated scanning itself is very fast, a thorough audit project still has a human element. In our experience, most of the time in an audit is taken up by the front-end process of audit preparation and the back-end work of interpreting results and generating reports.
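As a rough illustration of how automated detection of outside code can work, the sketch below fingerprints each file with a SHA-256 hash and looks it up in an index of known third-party files. The index contents, the `KNOWN_COMPONENTS` name, the `examplelib` package, and the `identify` helper are all assumptions made for illustration; commercial scanners draw on far larger databases and also match partial files.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest used as the file's fingerprint."""
    return hashlib.sha256(content).hexdigest()


# Hypothetical index of known third-party files: fingerprint -> (package, license).
_sample = b"/* example third-party file */\nint add(int a, int b) { return a + b; }\n"
KNOWN_COMPONENTS = {fingerprint(_sample): ("examplelib 1.0", "MIT")}


def identify(files: dict[str, bytes]) -> dict[str, tuple[str, str]]:
    """Map each path whose content matches the index to its (package, license)."""
    matches = {}
    for path, content in files.items():
        hit = KNOWN_COMPONENTS.get(fingerprint(content))
        if hit is not None:
            matches[path] = hit
    return matches


# A two-file portfolio: one file copied verbatim from the index, one original.
portfolio = {
    "vendor/add.c": _sample,
    "src/main.c": b"int main(void) { return 0; }\n",
}
print(identify(portfolio))  # {'vendor/add.c': ('examplelib 1.0', 'MIT')}
```

Exact hashing only catches files copied without modification, which is one reason the human review described above remains necessary.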

The software code audit process usually involves:

  • Establishing a legal framework (NDA) between the parties involved and the auditor.
  • Questions and answers between the parties involved to establish:
    • the objectives of the audit, to understand the company or product that is audited;
    • the specific business of the target companies; 
    • their third-party software practices;
    • the software environment that is used in the target company; and
    • their open source adoption policy (if any).

In some cases, not all of the code to be audited is in one place, or it must be “assembled” before the audit can be carried out. Depending on the size of the project, the front-end process can take 1-5 days.

Software Code Scanning and Detection

Once the legal framework is in place, the code is available, and the environment discovery process is complete, an automated scanning application is set up. The complete job is broken into logically meaningful segments (for example, identifiable subprojects and modules), and then the actual automated code scanning is carried out. Ownership warnings generated by the automated application (such as proprietary code without appropriate headers or copyright information, or conflicting license information) are brought to the attention of the audit staff, and are either resolved or marked for further action.
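The segmentation and warning steps described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: segments are taken to be top-level directories, and licenses are assumed to be declared with SPDX-style `SPDX-License-Identifier` tags, which much real-world code does not use. The function names `segment` and `ownership_warnings` are hypothetical and not taken from any particular scanning product.

```python
import re
from collections import defaultdict

# Matches "Copyright (c)", "Copyright ©" or "Copyright 2012", case-insensitively.
COPYRIGHT_RE = re.compile(r"copyright\s+(\(c\)|©|\d{4})", re.IGNORECASE)
# Assumption: license declarations use SPDX-style tags.
LICENSE_RE = re.compile(r"SPDX-License-Identifier:\s*([\w.+-]+)")


def segment(paths):
    """Group file paths by top-level directory, one segment per subproject."""
    segments = defaultdict(list)
    for path in paths:
        top = path.split("/", 1)[0] if "/" in path else "(root)"
        segments[top].append(path)
    return dict(segments)


def ownership_warnings(text):
    """Return the ownership warnings raised by a single file's text."""
    warnings = []
    if not COPYRIGHT_RE.search(text):
        warnings.append("missing copyright notice")
    licenses = set(LICENSE_RE.findall(text))
    if len(licenses) > 1:
        warnings.append(
            "conflicting license information: " + ", ".join(sorted(licenses))
        )
    return warnings


files = {
    "core/a.py": "# Copyright 2012 Example Corp\n# SPDX-License-Identifier: MIT\n",
    "core/b.py": "print('hello')\n",
    "ui/c.py": (
        "# Copyright (c) Example Corp\n"
        "# SPDX-License-Identifier: MIT\n"
        "# SPDX-License-Identifier: GPL-2.0-only\n"
    ),
}

for name, paths in segment(files).items():
    for path in paths:
        print(name, path, ownership_warnings(files[path]))
```

Here `core/b.py` is flagged for a missing copyright notice and `ui/c.py` for carrying two different license tags; both would go to the audit staff for resolution.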

The reports created by the automated solution are reviewed by expert audit staff, and a final executive report is assembled. Depending on the size of the audit project, this step can take as little as a couple of days (for a small project containing thousands of files) or as long as two weeks (for a very large portfolio of hundreds of thousands of software files).

End Results of a Software Code Audit

The end result of a software code audit is a combination of two reports.

The first is a high-level executive overview report that is custom-created by the audit staff. This report defines the software code audit environment, the process used, and the major findings, in simple graphical and tabular format. Attention is drawn to specific packages, files, or licenses. The report should provide information on commercial or open-source software components, a description of what each piece of software does, who created it, and related references on public-domain project websites.

Important information, such as copyright owners, licenses associated with the discovered software packages, and, optionally, encryption or export obligations, is tabulated. The text of all licenses that are discovered is included with this report. The report lists all external content, including complete third-party software files, modules, or projects, as well as snippets of code that have a code structure similar to known open source projects. The findings of a software code audit must be verifiable; therefore, references or hyperlinks to all information that is discovered should be provided.
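Snippet-level matching can be illustrated with a simple line-window comparison. The sketch below normalizes whitespace, hashes every run of three consecutive non-blank lines, and counts the windows that two sources share. The window size, the normalization rule, and the function names are assumptions chosen for illustration; production scanners use more robust similarity techniques that tolerate renamed identifiers and reordered code.

```python
import hashlib


def normalize(line: str) -> str:
    """Collapse whitespace so formatting changes do not hide a match."""
    return " ".join(line.split())


def window_hashes(source: str, size: int = 3) -> set[str]:
    """Hash every run of `size` consecutive non-blank, normalized lines."""
    lines = [normalize(line) for line in source.splitlines() if line.strip()]
    return {
        hashlib.sha256("\n".join(lines[i:i + size]).encode()).hexdigest()
        for i in range(len(lines) - size + 1)
    }


def shared_windows(candidate: str, reference: str, size: int = 3) -> int:
    """Count the line windows that appear in both sources."""
    return len(window_hashes(candidate, size) & window_hashes(reference, size))


# Reference: a function from a (hypothetical) known open source project.
reference = (
    "int gcd(int a, int b) {\n"
    "    while (b) {\n"
    "        int t = a % b;\n"
    "        a = b;\n"
    "        b = t;\n"
    "    }\n"
    "    return a;\n"
    "}\n"
)
# Candidate: the same function pasted into another file and re-indented.
candidate = (
    "void init(void) {}\n"
    "\n"
    "int gcd(int a, int b) {\n"
    "  while (b) {\n"
    "    int t = a % b;\n"
    "    a = b;\n"
    "    b = t;\n"
    "  }\n"
    "  return a;\n"
    "}\n"
)
print(shared_windows(candidate, reference))  # 6
```

A non-zero count only marks a candidate for review; the auditor still has to confirm the origin of the snippet and attach the reference that makes the finding verifiable.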

The second report is a detailed machine-generated report that lists:

  • all packages, files, licenses, copyrights, etc. associated with all software files in the target portfolio, and,
  • optionally, a license obligation report, summarizing the obligations associated with all licenses found in the portfolio.

The detailed report can be very large and is normally provided as a backup to the high-level executive report. It is normally consulted if a specific project or file requires further scrutiny.
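To make the shape of the machine-generated report concrete, the following sketch aggregates per-file scan records into a license summary and a license obligation report. The record layout, the `libfoo` and `libbar` package names, and the one-line obligation notes are hypothetical and heavily simplified; a real obligation report would be derived from the full license texts.

```python
from collections import Counter

# Hypothetical scan records; a real scanner would emit one per file in the portfolio.
records = [
    {"path": "src/main.c", "package": None, "license": "Proprietary"},
    {"path": "vendor/libfoo/foo.c", "package": "libfoo", "license": "MIT"},
    {"path": "vendor/libbar/bar.c", "package": "libbar", "license": "GPL-2.0-only"},
    {"path": "vendor/libbar/baz.c", "package": "libbar", "license": "GPL-2.0-only"},
]

# Simplified, illustrative obligation notes keyed by license identifier.
OBLIGATIONS = {
    "MIT": "include the copyright and permission notice in all copies",
    "GPL-2.0-only": "make corresponding source code available when distributing binaries",
}


def license_summary(records):
    """Count the number of files found under each license."""
    return Counter(r["license"] for r in records)


def obligation_report(records):
    """Return the obligation note for each license found that has one."""
    found = sorted({r["license"] for r in records})
    return {lic: OBLIGATIONS[lic] for lic in found if lic in OBLIGATIONS}


for lic, count in sorted(license_summary(records).items()):
    print(f"{lic}: {count} file(s)")
for lic, note in obligation_report(records).items():
    print(f"{lic}: {note}")
```

Keeping the per-file records alongside the summaries is what lets a reader drill down from the executive report to a specific file when it needs further scrutiny.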

How Much Does a Software Code Audit Cost?

Generally, the cost of an audit is proportional to the complexity of the project, which in turn can be roughly defined by the number of files in the target portfolio, the nature of the packages (commercial or public domain) used in the portfolio, and the information that is available about those packages. Most audits (from thousands up to hundreds of thousands of software files) fall within a $5K-$40K range.

If you’re planning a specific transaction involving software assets, whether it’s an M&A, equity investment, product introduction, demand for IP indemnity, commercialization of research, or other event, conduct a software code audit as early as possible in the transaction. Knowing what’s in the code can speed up transaction times and reduce costs associated with fixing problems at the last minute.

Kamal Hassin, Director, R&D and Product Management at Protecode, is a thought leader in the area of open source licensing and is the author or co-author of a number of papers on Software Intellectual Property management. Kamal has a Bachelor of Engineering degree and a Master's degree in Technology Innovation Management from Carleton University. He can be reached at khassin@protecode.com.
