
Making sense of OWASP A08:2021 – Software & Data Integrity Failures

Dima Kotik
Application Security Engineer

The OWASP Top Ten 2021 list introduces several new categories. One of them, A08: Software and Data Integrity Failures, offers insight into the changing nature of application security and evolving data threats. Let's explore the definition, common causes, and possible solutions.

The Definition

Software and data integrity failures may sound like supply chain problems, but the category stands by itself for two reasons. First, there is a separate A06:2021 category, "Vulnerable and Outdated Components," which also covers a class of supply chain problems. The difference? A06 groups vulnerabilities inherent in the components themselves, while A08 groups vulnerabilities injected into the payloads of otherwise benign artifacts before they are delivered to the endpoint.

Second, the category remains the best fit for deserialization of untrusted data, which is currently being eclipsed by supply chain poisoning. It is, however, still common enough to hold a position in the OWASP Top Ten list, even though it no longer has a category of its own.

However, the category definition is slightly confusing because the title describes the symptom and not the root cause. It is like saying: "Fever is a top ten world disease." How did those integrity failures occur?

Altogether, software and data integrity failures lead to a program either straight up executing the attacker's code or prying open a backdoor via combined measures.

Common causes

There are ten CWEs under this category. First, we'll sort them into three buckets. Then, we'll examine the threat vectors of each.

1. Faulty assumptions

CWE-565 and CWE-784 describe lazy control checks that merely assert the presence of a particular cookie or cookie value. Even when a signature is available for validation, it is not checked. It does not help that JWTs carry unencrypted payloads; much ink has been spilled about that. The faulty assumption is that the mere presence of a cookie is sufficient for trust.
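As a minimal sketch of the alternative, the cookie can carry an HMAC signature that the server actually verifies before trusting the value. The names `SECRET_KEY`, `sign_cookie`, and `verify_cookie` and the `value.signature` layout are assumptions made for this illustration, not part of any particular framework.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical server-side secret; a real service would load it from a secret store.
SECRET_KEY = b"example-secret-not-for-production"


def _signature(value: str) -> str:
    """HMAC-SHA256 of the cookie value, as a hex string."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_cookie(value: str) -> str:
    """Return 'value.signature' for sending to the client."""
    return f"{value}.{_signature(value)}"


def verify_cookie(cookie: str) -> Optional[str]:
    """Return the value only if the signature checks out, otherwise None."""
    value, sep, sig = cookie.rpartition(".")
    if not sep:
        return None  # no signature at all: presence of a cookie proves nothing
    expected = _signature(value)
    # compare_digest avoids leaking information through timing differences.
    if not hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8")):
        return None
    return value
```

With this in place, a client that edits `role=user` into `role=admin` but reuses the old signature is rejected, because the check depends on the value and not on the cookie merely being there.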

The bulk of CWE-426 (Untrusted Search Path) vulnerabilities emerge from the assumption that the host's environment is secure. The OS PATH value is the main culprit, but the vulnerability can also be triggered with various classic path traversal and redirection exploits.
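One way to stop trusting the inherited PATH is to resolve executables against a fixed list of directories. The sketch below assumes a POSIX-style host; `TRUSTED_DIRS`, `resolve_trusted`, and `run_trusted` are hypothetical names, and the directory list would need adjusting per system.

```python
import os
import shutil
import subprocess
from typing import List, Optional

# Assumed allowlist of directories that only administrators can write to.
TRUSTED_DIRS = ["/usr/bin", "/bin"]


def resolve_trusted(name: str, trusted_dirs: List[str] = TRUSTED_DIRS) -> Optional[str]:
    """Resolve a bare command name against trusted directories, ignoring the caller's PATH."""
    if os.path.basename(name) != name:
        return None  # reject names with path components such as '../tool'
    return shutil.which(name, path=os.pathsep.join(trusted_dirs))


def run_trusted(name: str, *args: str) -> subprocess.CompletedProcess:
    """Run a command by its resolved absolute path with a minimal environment."""
    exe = resolve_trusted(name)
    if exe is None:
        raise FileNotFoundError(f"{name} not found in trusted directories")
    # The child gets a clean PATH instead of inheriting a possibly poisoned one.
    return subprocess.run(
        [exe, *args],
        env={"PATH": os.pathsep.join(TRUSTED_DIRS)},
        capture_output=True,
        text=True,
        check=True,
    )
```

This does not help if the trusted directories themselves are writable by an attacker, so file permissions remain part of the threat model.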

2. Misguided mechanisms

CWE-915, CWE-829, and CWE-830 emerge when existing controls botch responses to system dynamism. Imported functionality is in vogue. Objects are initialized or configured using values obtained across multiple boundaries and abstraction layers.

Those vulnerabilities appear to be primarily honest mistakes. It is hard to foresee, model and test for thousands of combined states an object may assume when several of its components or settings are externally sourced.
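For the CWE-915 case specifically, a common defensive pattern is an explicit allowlist of attributes that external input may set. The following is a minimal sketch under that assumption; `UserProfile`, `EDITABLE_FIELDS`, and `apply_update` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class UserProfile:
    display_name: str = ""
    email: str = ""
    is_admin: bool = False  # must never be set from request data


# Only these attributes may be populated from external input.
EDITABLE_FIELDS = frozenset({"display_name", "email"})


def apply_update(profile: UserProfile, payload: Dict[str, Any]) -> UserProfile:
    """Copy allowlisted keys onto the profile; reject the whole update otherwise."""
    unexpected = set(payload) - EDITABLE_FIELDS
    if unexpected:
        # Fail before touching the object so no partial update is applied.
        raise ValueError(f"refusing to set fields: {sorted(unexpected)}")
    for key, value in payload.items():
        setattr(profile, key, value)
    return profile
```

Rejecting the update outright, instead of silently dropping unknown keys, is a design choice: it makes probing attempts visible in logs.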

We have become prisoners of complexity. Is this an access control authorization problem? Likely. The bad news is that we are not anywhere close to solving authorization problems. Quite the opposite: we are excellent at breeding them and running into dead ends in search of a solution. It is difficult to imagine a future where mistakes like these are rare.

The best strategy is to strive for simplicity. But alas, this is what we have in 2021: a series of engineering compromises. When we cannot afford to rewrite or refactor, we externalize functionality and set ourselves up for failure.

3. Absence of a mechanism

CWE-494 is the worst kind of input validation failure. Executable or compilable code is fetched whole or as a patch. Then, it is executed without any integrity checks. The attacker can inject malware into the code at the source, in transit, and even at the endpoint cache.
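The missing check can be as small as comparing the artifact's digest against a value obtained out of band. This sketch assumes the expected SHA-256 digest is pinned or delivered over a separate trusted channel; `verify_sha256` and `load_verified` are hypothetical helper names.

```python
import hashlib
import hmac
from pathlib import Path


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True only if the SHA-256 digest of data matches the pinned value."""
    actual = hashlib.sha256(data).hexdigest().encode("ascii")
    expected = expected_hex.strip().lower().encode("utf-8")
    return hmac.compare_digest(actual, expected)


def load_verified(path: Path, expected_hex: str) -> bytes:
    """Read a downloaded artifact and refuse to return it if the digest differs."""
    data = path.read_bytes()
    if not verify_sha256(data, expected_hex):
        raise ValueError(f"integrity check failed for {path}")
    return data  # only now is it reasonable to unpack or execute the content
```

A plain digest covers tampering in transit and in the local cache. It does not cover a compromised source that publishes a matching digest next to the malicious file, which is where publisher signatures come in.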

CWE-502 is the classic deserialization of untrusted data. Parsers are fickle because they are complex. Even if they come with a promise to be secure by default or built with security in mind, that promise must be entirely disregarded. Always validate input before deserializing.
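One hedged illustration of this advice: cap the size of the raw input before parsing, use a data-only format such as JSON in place of a format that can instantiate arbitrary objects, and check the resulting shape strictly. The `parse_order` function, its field names, and the limits are all assumptions made for the example.

```python
import json
from typing import Any, Dict

MAX_BYTES = 4096  # assumed size cap, checked before the parser sees the data


def parse_order(raw: bytes) -> Dict[str, Any]:
    """Parse an untrusted order message, raising ValueError on anything unexpected."""
    if len(raw) > MAX_BYTES:
        raise ValueError("payload too large")
    # JSON yields only dicts, lists, strings, numbers, booleans, and None.
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict) or set(doc) != {"item_id", "quantity"}:
        raise ValueError("unexpected message shape")
    item_id, quantity = doc["item_id"], doc["quantity"]
    if not isinstance(item_id, str) or not item_id.isalnum():
        raise ValueError("invalid item_id")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int):
        raise ValueError("invalid quantity")
    if not 1 <= quantity <= 100:
        raise ValueError("quantity out of range")
    return {"item_id": item_id, "quantity": quantity}
```

Malformed JSON and invalid UTF-8 also surface as `ValueError` here, since `json.JSONDecodeError` and `UnicodeDecodeError` are both subclasses of it, so callers have a single failure path to handle.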

CWE-345 and CWE-353 together represent a similar problem that manifests at two different levels. The first is at the protocol level. A checksum may even be present, but it is not checked. The second is at the application level: there is no support for an integrity check for incoming data.

We could blame these kinds of vulnerabilities on unreasonable development deadlines that sometimes force developers to cut corners. More likely, though, the protection mechanism is absent because threat modeling was not done rigorously. Any of the common threat modeling systems leads us to scrutinize and protect downloaded updates and to validate all data before parsing.

Summary of common causes

Presented this way, the causes suggest that we are reaping the consequences of either the fashion for or the necessity of microservice architecture. Developers split up the services without putting in the necessary integrity checks. Previously made security assumptions do not carry over to a modern distributed stack. A threat modeling reset is mandatory.

There is another likely contributing factor. Remote procedure call (RPC) protocols are becoming much better and easier to use, and the industry is converging rapidly on the gRPC suite. Developers rely on gRPC, frameworks, and libraries much more than they ought to. Perhaps half of the faulty assumptions stem from the expectation that an RPC protocol or a trending NPM utility package already ensures data integrity. Similarly, an update coming from a familiar source is also deemed safe.

Conclusion

We should expect this category to rise higher within a few years. Supply chain poisoning is difficult to detect and prevent. Our countermeasures are, arguably, in their infancy.

Zero trust is not trivial to implement, but we can begin by setting some goals and turning up that healthy hacker skepticism. Putting less trust in our infrastructure will serve us well.

Given the blast radius of these vulnerabilities, your company should be budgeting for learning about and hedging against these types of attacks. The key to finding a solution is treating your update delivery infrastructure and data sources as if attackers have already nested in them. Build a shorter stack. Beware of externalized functionality. Pay more attention to data integrity and access control.
