Source Code Review: Mitigating Risks and Reducing Costs

Source code review is the most powerful tool in a litigator’s war chest in patent and trade secret cases. An important consequence of the judicial climate shifting farther away from business methods and closer to technically complex IP is that receiving parties now face a higher burden of proof and, consequently, higher legal costs. Not only are receiving parties now required to be more diligent before filing a case, but they also end up spending thousands of additional dollars reviewing millions of lines of code to successfully formulate their infringement arguments.

Producing parties, on the other hand, have seen their legal costs rise with the increasing complexity of software. Production of complex source code, in particular, increases the effort required to collect, triage, transport and host the code during discovery, each of which comes at a financial cost. Furthermore, producing more code calls for stricter data security diligence by IT, eDiscovery executives and outside counsel, increasing both the cost and the risk for the producing party. Nevertheless, several strategies exist that can keep costs and risks under control for both parties.

Size and Scope of Code Production

Significant cost and exposure risk can be avoided simply by a diligent assessment on both sides of what source code needs to be produced to the receiving party. In a typical discovery, receiving party counsel and producing party counsel correspond, argue and negotiate at length about what constitutes relevant source code. The production process is almost always iterative, and with each iteration legal costs such as attorney hours and e-Discovery hours add up and crucial discovery time is wasted. All of this can be avoided on both sides by a good-faith meet and confer early in discovery. Such a meet and confer should ideally address which products constitute discoverable source code, whether open source modules will be produced, whether the entirety of the code will be produced (or only specific modules), and which custodians the code will be collected from. High-level details on the programming language, development platforms, file count and/or file size can also be very helpful in reducing the number of iterations of code production, and thus legal costs, for both parties.
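As an illustration of the high-level details mentioned above, the following is a minimal sketch of how a producing party's technical staff might summarize a collected code tree by file extension, file count and total size before the meet and confer. The function names and the idea of grouping by extension are assumptions made for this example, not a standard e-Discovery procedure.

```python
import os
from collections import defaultdict


def summarize_production(root):
    """Return {extension: (file_count, total_bytes)} for every file under root."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Group by lowercase extension; files without one share a bucket.
            ext = os.path.splitext(name)[1].lower() or "(no extension)"
            counts[ext] += 1
            sizes[ext] += os.path.getsize(os.path.join(dirpath, name))
    return {ext: (counts[ext], sizes[ext]) for ext in counts}


def format_summary(summary):
    """Render the summary as plain-text lines, largest total size first."""
    rows = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
    return [f"{ext}: {count} files, {size} bytes" for ext, (count, size) in rows]
```

A one-page summary of this kind can give receiving party counsel a rough sense of the languages and volume involved without exposing any of the code itself.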

Representative Versions

Technology companies are rarely one-product shops. Companies like Intel, Google, Cisco, Hewlett-Packard, etc. make and sell dozens, even hundreds, of products and services, several of which may be accused in a patent dispute. Each product or service has its own, often overlapping, source code. Further, source code undergoes many modifications throughout the development process. Software teams realize the importance of tracking and maintaining each and every version of the code for later retrieval. This is usually done through robust version control software such as Git, Perforce or Apache Subversion. Because every change is recorded in such a version control system, a codebase can encompass hundreds of versions of the same code.

It is therefore common for producing parties to produce only the most recent commercially available version of the code of each accused product at the time of production. However, collecting this most recent version often requires considerable diligence to determine which custodian has the most complete copy of the code for each product. Further, technology moves fast, and because a typical patent dispute lasts more than 24 months in most jurisdictions, the “most recent commercially available version” is often a moving target that still requires iterative productions as and when products are updated. Receiving parties may also trigger additional iterations if they require past versions to assess when a particular functionality was added, deleted or modified.

On the other end of the spectrum, some producing parties choose to reduce the pre-production diligence and simply produce the entirety of their versioning platform (such as Perforce repositories). In such cases, the code production consists of nearly all current and past versions of the code for all the accused products – which can cumulatively add up to many terabytes of data.

Designating a small, agreed set of representative versions, rather than producing only the latest version or the entire repository, can significantly reduce costs and streamline discovery for the receiving party. It also simplifies the damages assessment. For producing parties, the cost of code collection and hosting is reduced as well, and the security risks from producing too much valuable code are minimized. For example, optimization algorithms used in non-representative versions that are not critical to the case are not exposed to external experts.
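Which versions are representative is ultimately a matter for negotiation between counsel, but a starting list can be generated mechanically. The sketch below uses one possible heuristic, assumed here purely for illustration: take the latest release within each major version line. The function name, the dotted version format and the release data are all hypothetical.

```python
from datetime import date


def candidate_representatives(releases):
    """Pick the latest release in each major-version line.

    `releases` is a list of (version, release_date) pairs, where version is a
    dotted string such as "2.4.1". Returns {major: (version, release_date)}.
    """
    latest = {}
    for version, released in releases:
        major = int(version.split(".")[0])
        # Keep the most recently released version seen so far for this major line.
        if major not in latest or released > latest[major][1]:
            latest[major] = (version, released)
    return latest


# Hypothetical release history for one accused product.
releases = [
    ("1.0.0", date(2012, 3, 1)),
    ("1.2.0", date(2012, 11, 15)),
    ("2.0.0", date(2013, 6, 1)),
    ("2.1.3", date(2014, 2, 20)),
]
candidates = candidate_representatives(releases)
```

A list like this is only a starting point for discussion. If accused functionality was added or changed in the middle of a major version line, the parties would likely need to add versions on either side of that change.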

Number of Reviewers/Computers

Limiting the number of reviewers who can be given access to the source code (whether simultaneously or otherwise) is a very useful stipulation to consider when negotiating a protective order. The limit on number of reviewers can be enforced either explicitly in the protective order or indirectly through limiting the number of computers on which code is produced.

For producing parties, a limit on the number of reviewers has obvious advantages. It limits the exposure of confidential information to a finite, manageable number of experts. For the receiving party, a reasonable limit (for example, between 3 and 6 reviewers depending on the number of accused products and the size of production) provides better visibility into and control over review costs. Bear in mind, however, that lower limits on the number of reviewers can place stringent constraints on the discovery process and jeopardize deadlines. It is therefore recommended that counsel first discuss the size and scope of production in order to arrive at a well-informed and reasonable agreement on the limits.

Location Of Review

Counsel must decide on the location at which the source code review will occur when negotiating protective order stipulations. Hosting a code review requires specialized security processes and infrastructure, in addition to legal manpower for supervising the review. There are two important questions counsel should address when deciding on the location:

Who will host the source code?
Where geographically will the source code be produced?

In answering the first question, most protective orders stipulate that code be produced at the office of the producing party’s counsel, where the production can be closely monitored and where the producing party assumes most if not all expenses related to the production. In relatively fewer cases, the source code is produced at the office of the receiving party’s counsel, which minimizes the expenses and legal manpower borne by the producing party for the review. Since this also gives the producing party less control over the security and integrity of the code, this approach is recommended for cases that involve a significant portion of open source components or code that is otherwise not extremely sensitive and critical to the producing party’s core business. As a third approach, a software escrow with robust, certified security provisions can provide a neutral middle ground, with the producing party and the receiving party sharing the costs. Most escrow providers, like Iron Mountain, operate escrow sites across the world, with each site fully equipped with heavy-duty security, dedicated supervisory staff, biometric access and video surveillance.

In answering the second question, counsel should deliberate on where, geographically, it is most cost-effective for the client to produce the code. This could be a city where the producing party’s counsel are primarily located, where the receiving party’s counsel are primarily located, or where the technical experts are located. For cases where the source code is highly modularized and the entirety of the software product is not subject to discovery, it is generally recommended that the production be hosted in a city where the producing party’s counsel are primarily located so that requests for additional code and printouts can be addressed promptly. For large production sizes, counsel should consider hosting the production closer to where the technical experts are located, which can greatly reduce the costs of travel and logistics.

Electronic Devices in Review Room

Source code should preferably be hosted on standalone computers that are isolated from local and external networks. Electronic devices, in particular those with storage interfaces such as USB drives, disk drives, CD/DVD/Blu-ray media, smartphones and tablets, should be prohibited in the review room in order to enforce information security under the protective order. In addition to visually monitoring compliance with this provision, counsel may disable external device interfaces (USB) on the review computer or enclose the computer in a lock-box that admits the cables for the monitor, keyboard and mouse but prevents any other device from being connected to the computer.

Some of these restrictions constrain the receiving party’s ability and efficiency in reviewing the source code production. For example, the receiving party’s experts might want to use their laptops for taking notes in the review room, which can drastically reduce the effort required by counsel and the experts to prepare for depositions and expert reports. Counsel should consider whether a note-taking laptop can be allowed into the review room, provided certain restrictions are applied to ensure the security and integrity of the code production. Alternatively, a separate computer, also isolated from any networks, can be provided for the reviewers to take notes on.

Further, if reviewers are allowed to bring their own laptops into the review room, counsel should consider using tamper-evident tape to block the use of USB interfaces, cameras, etc. Wireless networks, if any, at the review location should be monitored and password-protected.

Code Review Tools and Software

A number of source code review tools are available to experts and counsel that can make the review more efficient and help visualize the produced code. It is often preferable that counsel discuss and agree on the list of software tools that will be available to the experts on the review computer while formulating the protective order. At a minimum, a content search tool and a code review platform should be installed on the review computers, which together can reduce the effort and length of the review by over 80%. Examples of popular review tools include:

Content Search Tools

Windows Grep
PowerGREP

Code Review Software

Eclipse SDK – supports Java and through plugins, Ada, ABAP, C, C++, COBOL, Fortran, Haskell, JavaScript, Julia, Lasso, Lua, NATURAL, Perl, PHP, Prolog, Python, R, Ruby, Rust, Scala, Clojure, Groovy, Scheme, and Erlang.

SciTools Understand – supports Ada, C/C++, Objective C, Objective C++, C#, FORTRAN, Java, JOVIAL, Delphi/Pascal, PL/M, VHDL, COBOL, PHP, JavaScript and Python.

Microsoft Visual Studio – supports C, C++, VB.NET, C# and F#.

CodeLite – supports C, C++, PHP, and JavaScript.

Xcode – supports C, C++, Objective-C, Objective-C++, Java, AppleScript, Python and Ruby.

NetBeans – supports Java, PHP, C/C++ and HTML5.

Beyond Compare – for side-by-side comparisons of file content, language-independent.

Notepad++ – enhanced text editor with code coloring, indentation and line numbers.

In addition to the specialized tools above, counsel should also consider installing productivity software such as Microsoft Office or OpenOffice and Adobe Acrobat Reader for reading documents and spreadsheets that might be part of the code production. The production might also include files with specific extensions that require specialized software. As discussed earlier, an initial conversation about what is being produced can help reduce redundant iterations and associated costs for the client.

Volume of Printouts

Counsel often wish to limit the number of pages that the receiving party can request to be printed. These limits can apply to the total number of pages, the number of duplicate copies and/or the number of consecutive pages within each file that can be requested. As an example, a protective order may prohibit the receiving party from requesting printouts of:

more than 10% of the total code produced.
more than 15 consecutive pages.
more than 250 pages per product version.

The specific numbers in the limits may change according to the respective scope of the case and the volumes of code that are produced, but in essence the limits are yet another very effective tool that counsel can use to reduce both costs and exposure of the code.
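To show how limits like the three examples above interact, here is a minimal sketch of a check that a paralegal or vendor might run before approving a print request. It assumes the limits are measured in printed pages, that the request concerns a single product version, and that each requested page range is evaluated on its own; the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PrintLimits:
    max_fraction_of_production: float = 0.10  # at most 10% of the pages produced
    max_consecutive_pages: int = 15
    max_pages_per_version: int = 250


def check_print_request(ranges, printed_so_far, total_pages_produced, limits=None):
    """Return the reasons a request exceeds the limits (empty list if it is fine).

    ranges: list of (first_page, last_page) inclusive ranges being requested.
    printed_so_far: pages already printed for this product version.
    total_pages_produced: page count of the whole production (an assumption
        about how the 10% limit would be measured).
    """
    limits = limits or PrintLimits()
    problems = []
    requested = 0
    for first, last in ranges:
        length = last - first + 1
        requested += length
        # Each range is checked separately; adjacent ranges are not merged.
        if length > limits.max_consecutive_pages:
            problems.append(f"pages {first}-{last}: {length} consecutive pages")
    running_total = printed_so_far + requested
    if running_total > limits.max_pages_per_version:
        problems.append(f"{running_total} pages exceeds the per-version limit")
    if running_total > limits.max_fraction_of_production * total_pages_produced:
        problems.append(f"{running_total} pages exceeds the share-of-production limit")
    return problems
```

For instance, under these assumptions a request for pages 1 through 20 of a file would be flagged for exceeding 15 consecutive pages even if the overall page budget had not been touched.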

Conclusion

Source code production and review are critical components of IP litigation and require deep due diligence from counsel to ensure that clients’ trade secrets are not unduly compromised, that discovery is streamlined and that the best value is provided to the client, especially for the most labor-intensive components of the case. Counsel are encouraged to evaluate all necessary aspects, including but not limited to those highlighted in this paper, while negotiating protective order provisions and preparing for a code review on their cases.

Rahul Vijh is a seasoned technology consultant and has advised AmLaw 100 firms and Fortune 500 corporations on IP licensing and litigation in technology areas such as telecommunication networks, enterprise software, search [...]

Warning & Disclaimer: The pages, articles and comments on IPWatchdog.com do not constitute legal advice, nor do they create any attorney-client relationship. The articles published express the personal opinion and views of the author as of the time of publication and should not be attributed to the author’s employer, clients or the sponsors of IPWatchdog.com.