How Vulnerable is Your Language?

At ESC last month, Green Hills’ Dan O’Dowd made the point that software is increasingly the dominant element in an embedded product: it is often the most expensive part of development; it is often the most difficult part; it is frequently late; and it may be the part of the product that creates the most problems once deployed in the field. On the positive side, it can differentiate a product from others in the market, and software has the flexibility to create multiple versions of a product running on the same base hardware. Despite this, O’Dowd pointed out, the investment in tools for software developers, both hardware and software, is tiny. This is something we will be returning to later in the year.

What he didn’t cover is that this software is increasingly used in connected devices: where embedded systems used to be stand-alone, they are now normally connected to the Internet, forming part of the Internet of Things. This makes them vulnerable to malicious attack in a way that they were not before. We are now accustomed to having firewalls and antivirus software on PCs and protecting personal and corporate networks, yet we continue to build and deploy devices that attach to the Internet without any protection. Perhaps it doesn’t matter very much if the software in our Internet-enabled toaster is damaged, but at the other end of the complexity spectrum are things like industrial control equipment. If these are hacked, the consequences could be grave and even life-threatening. Stuxnet, which targeted PLCs from Siemens, is probably not a good example here, as it reached the PLCs through Windows PCs running the Siemens engineering software, and it is alleged that the Stuxnet worm came from US/Israeli government sources. But once someone has demonstrated that it is possible to bring industrial processes to a halt for political reasons, it is probable that others will try to do it for no better reason than to demonstrate that they can.

While operating systems are often the unwitting providers of doorways into systems, the choice of language can make a large difference to the strength of a system. For many embedded projects, the choice of language is never even discussed: it is C or possibly a C variant. Why? Because that is what the software team knows how to write. But work on analysing software vulnerabilities suggests that this automatic choice may carry even more baggage than has previously been suspected.

Much of this analysis has been carried out with funding from the US Government, particularly the Department of Homeland Security and NIST (National Institute of Standards and Technology), and looks mainly at what one might regard as enterprise software. However, as embedded software becomes more complex and more connected, the research is increasingly relevant. As part of this work, various organisations compiled lists of the reasons why software had problems. Analysis of these lists produced the Common Weakness Enumeration. To quote from the web site, hosted by MITRE:

Targeted to developers and security practitioners, the Common Weakness Enumeration (CWE) is a formal list of software weakness types created to:

Serve as a common language for describing software security weaknesses in architecture, design, or code.
Serve as a standard measuring stick for software security tools targeting these weaknesses.
Provide a common baseline standard for weakness identification, mitigation, and prevention efforts.

From this analysis, workers in the field can look at different languages to see which weaknesses they permit, and at tools, such as source code analysers, to see which of these weaknesses they can find in source code. In this context, NIST produced a Source Code Analysis Tool Functional Specification, which sets out the criteria for analysis tools. In an Annex, it uses the CWE classification to list the code weaknesses in existing software that are capable of being exploited.

To quote the document:

Criteria for selection of weaknesses include:

Found in existing code today – Corresponding vulnerabilities are found in existing software applications.
Recognized by tools today – Tools today are able to identify these weaknesses in source code and identify their associated file names and line numbers.
Likelihood of exploit is medium to high – The vulnerability is fairly easy for a malicious user to recognize and to exploit.

Some typical entries are shown below.

Name	CWE ID	Description	Language(s)	Relevant Complexities
Input Validation
SQL Injection	89	Inadequately filtered input is used in an argument to a SQL command calling function.	C, C++, Java, SPARK	taint, scope, address alias level, container, local control flow, loop structure, buffer address type
Range Errors
Stack overflow	121	Input is used in an argument to create or copy data beyond the fixed memory boundary of a buffer on the stack.	C, C++	All
Code Quality
Memory leak	401	Memory is allocated, but is not released after its final use.	C, C++	scope, address alias level, container, local control flow, loop structure
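
To make the two C and C++ entries in the table concrete, here is a minimal sketch in C of what avoiding them by hand looks like. It is an illustration only, not taken from the NIST document: the names `NAME_BUF_SIZE`, `copy_name` and `name_fits` are hypothetical. The first function guards against the stack overflow pattern (CWE-121) by checking the input length before copying; the second avoids the memory leak pattern (CWE-401) by releasing its buffer on every exit path.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NAME_BUF_SIZE 16  /* hypothetical fixed buffer size */

/* The CWE-121 pattern is an unchecked strcpy(dst, input), which writes
 * past the end of dst whenever input is longer than the buffer.
 * This version checks the length first and refuses oversized input.
 * Returns 0 on success, -1 if the input does not fit. */
int copy_name(char *dst, size_t dst_size, const char *input)
{
    assert(dst != NULL && input != NULL);
    size_t len = strlen(input);
    if (len >= dst_size) {
        return -1;                /* no room for the text plus NUL */
    }
    memcpy(dst, input, len + 1);  /* copy includes the terminating NUL */
    return 0;
}

/* The CWE-401 pattern is an early return that skips free().
 * Here the buffer is released whether or not the copy succeeded.
 * Returns 0 if input fits, -1 if it does not or allocation fails. */
int name_fits(const char *input)
{
    char *scratch = malloc(NAME_BUF_SIZE);
    if (scratch == NULL) {
        return -1;
    }
    int rc = copy_name(scratch, NAME_BUF_SIZE, input);
    free(scratch);                /* single release point for all outcomes */
    return rc;
}
```

Nothing in the C language forces either check: remove the length test or the call to `free` and the code still compiles, which is why these weaknesses have to be found by review, testing or analysis tools.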

The first edition of the Specification looked at whether these vulnerabilities were present in C, C++ and Java. In the latest edition (V1.1, February 2011), the list of languages was extended to include SPARK. The rationale for this is:

There are languages that are, by design, more suitable for secure programming. We added SPARK as an example of one. Such languages entirely preclude many common weaknesses and minimize or expose others. Choosing such languages mitigates many security risks.

In the list of 21 vulnerabilities, SPARK appears only 5 times, roughly a quarter as often as the other languages. So, what is SPARK?

SPARK is a formally designed subset of the Ada language, augmented with annotations that are designed to assist in the automated proof of program properties. It is supported by UK-based Praxis (now part of the French Altran group) and forms a part of that company’s push towards safe and secure programming. The SPARK language is being deployed in a wide range of safety-critical applications.

Part of SPARK is a toolset, which includes a static checker (Examiner), an automated theorem prover (Simplifier), and an interactive proof tool (Proof Checker). Well before compilation, the developer can have a very high level of confidence in the correctness of the code and its freedom from vulnerabilities. Any Ada compiler can be used to compile SPARK, and an open source SPARK toolset, SPARK Pro, is available from AdaCore.

If SPARK is so much less vulnerable than C, why is it not more widely used? There are a number of reasons, and the one most often cited is that it is difficult to hire SPARK programmers. The best riposte to this was one I heard at a conference a year or so ago: “We hire good programmers and then train them in SPARK. In the long run it is cheaper than spending huge amounts of effort debugging software.”

There is also a strange inertia in the software community. It might be, in part, a reluctance to abandon hard-won experience in a programming language. This is exacerbated by many programmers knowing only one programming language in great depth. If you have been taught C at university, say, and all your professional experience is C-centric, then it is understandable that you are reluctant to change. But C is just one of the languages out there, and it may not be the best for a specific application.

In particular, it is clear that C contains inherent vulnerabilities that make the task of the programmer much more difficult if the end product is to be stable and resistant to attacks by malicious hackers. As more and more safety-critical systems are connected to the Internet, either directly or indirectly, there will be more and more opportunities for malware to attack those systems. Will it take a major disaster before the use of better languages and their related tools becomes widespread?
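
One small illustration of the extra burden: C performs no bounds check when an array is indexed, so every range check has to be written, and remembered, by the programmer. Ada, and therefore SPARK, can state the permitted range in the index type itself. The sketch below shows the hand-written C version; the names `SENSOR_COUNT`, `read_sensor` and `write_sensor` are hypothetical and chosen only for the example.

```c
#include <assert.h>
#include <stddef.h>

#define SENSOR_COUNT 8  /* hypothetical table size */

static int readings[SENSOR_COUNT];

/* Stores value at index. Returns 0 on success, -1 if index is out of
 * range. Without the check, readings[index] = value would be undefined
 * behaviour for a bad index, and the compiler would accept it silently. */
int write_sensor(size_t index, int value)
{
    if (index >= SENSOR_COUNT) {
        return -1;
    }
    readings[index] = value;
    return 0;
}

/* Copies the reading at index into *out. Returns 0 on success,
 * -1 if index is out of range (in which case *out is left untouched). */
int read_sensor(size_t index, int *out)
{
    assert(out != NULL);
    if (index >= SENSOR_COUNT) {
        return -1;
    }
    *out = readings[index];
    return 0;
}
```

Every caller must also remember to test the return value, which is one more thing the language will not enforce.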

In case checking for CWE weaknesses in your software seems a little abstract, code verification company LDRA announced earlier this year that its tool suite has achieved compatibility with the CWE. This covers both static and dynamic code validation tools.
