The first thing that struck me in the article was this line:
TrueCrypt is a project that doesn’t provide deterministic builds. Hence, anyone compiling the sources will get different binaries, as pointed out by this article on Privacy Lover, saying that “it is exceedingly difficult to generate binaries from source that match the binaries provided by Truecrypt.”
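To make the “different binaries” point concrete, here’s a minimal sketch of what a bit-for-bit comparison would look like (the file names are hypothetical). With a non-deterministic build chain, even an honestly rebuilt binary will fail this check, which is exactly why matching the official binaries is so hard:

```python
import hashlib
import sys

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

if __name__ == "__main__":
    # Usage: python compare_builds.py official-binary rebuilt-binary
    official, rebuilt = sys.argv[1], sys.argv[2]
    a, b = sha256_of(official), sha256_of(rebuilt)
    print(f"official: {a}")
    print(f"rebuilt:  {b}")
    print("MATCH" if a == b else "MISMATCH (expected with a non-deterministic build)")
```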
The reddit comments section had a few interesting insights on that note – e.g. the idea that a download page could hypothetically be built dynamically to serve custom, maliciously modified binaries to different classes of users, adding backdoors as desired and generating fresh cryptographic hashes for the downloadable source/binary packages on the fly.
One common theme was the impracticability of “just check the source” – the above issues would render such a process largely useless, and, as several commenters pointed out, it’s simply not realistic for the average user. This reply asserts (and I disagree) that you don’t necessarily have to resort to such measures:
But that’s okay. You implicitly trust a bus driver to not drive the bus over a cliff when you get on it, so you can trust that most software writers have no malicious intent toward you. Don’t drive yourself mad reading source code for days. Look out for warning signs, exercise common sense, and realize that if somebody with enough resources really wants you specifically, they will get you.
Unfortunately this is only partially true. I had a thought about this exact problem – even for large enterprises, building and auditing source is a pain in the ass. In one of my jobs, I ran a pretty sizable team dealing with both manual code review and automated static analysis, as well as pen testing and other ways of providing some degree of assurance that there weren’t avoidable security holes (either bugs or backdoors) in both in-house and externally obtained applications.
It’s partially right to (generally) assume “no malicious intent” – but this is where the fact that most people haven’t the slightest clue how “risk” really works comes in. You perform a risk assessment every time you make a decision like the one described above; in big companies it gets worse, because most have fairly complex frameworks detailing how to do this, which often end up as formulaic processes that don’t really provide risk transparency, but rather a tick-box fig-leaf alibi that lets them say “yeah, we understand the risk of not doing this”. No, you really don’t. You understand a number within a certain set of parameters, without (usually) understanding what you don’t understand – Donald Rumsfeld’s famous “known unknowns” and “unknown unknowns”, which, in retrospect, aren’t nearly as stupid as they first sounded.
Ultimately, it comes down to some variation of probability * impact – and in any company of even moderate complexity, there’s always enough software where a hole would have massive enough impact to make testing the code unavoidable.
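As a toy illustration of why that product matters (the figures below are invented, not real estimates), a low-probability hole in a high-impact system can dominate the expected loss, which is why the code still has to be tested:

```python
# Toy probability * impact calculation (annualized-loss-expectancy style).
# Applications and numbers are made up purely for illustration.
portfolio = [
    # (application, annual probability of a serious hole being exploited, impact in $)
    ("internal HR tool",            0.02,     50_000),
    ("customer-facing payments",    0.05,  5_000_000),
    ("vendor-supplied crypto lib",  0.01, 20_000_000),
]

for name, probability, impact in portfolio:
    expected_loss = probability * impact
    print(f"{name:28s} expected annual loss ~ ${expected_loss:,.0f}")
```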
It’s a hugely expensive process – but for me, a positive side effect of the whole NSA clusterfuck is that companies have finally been forced to start systematically confronting the elephant in the room they’ve been ignoring for years: can you trust your vendors and open source? No, not really – and you’re going to have to spend a lot of money and time to do it right.
So I wonder – there are e.g. static and dynamic analysis tools out there, but most are barely usable due to the highly variable nature of code or the opacity of compiled binaries. They, like other testing tools, represent the most common approach to testing – i.e. looking at the final product. Some vendors, e.g. Veracode, take the approach of systematically evaluating vendor-provided software, basically creating a test factory that a consumer then pays to be able to trust.
However, nobody is doing what Xavier de Carné de Carnavalet, author of the original post, is trying to do – evaluating whether you’re getting software whose integrity is intact. Reading through his article shows the complexity and the cost, in time and expertise, of doing this.
But why not try to streamline the process of checking integrity itself? A hypothetical approach would be a standardized framework for ensuring compiled code integrity – starting with the already commonly used formats for source package signatures. Most code packages already ship with hashes/signatures, so that part is verifiable today. And many of the steps in the article could be automated – comparing the relevant hex strings would surely produce far fewer false positives and reports for review than the average static analysis tool. It would provide a strong additional layer of assurance for consumers of commercially built open source tools.
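As a sketch of the part that’s already automatable – checking a downloaded source package against its detached GPG signature – something like the following would be one small building block of such a framework. The file names are hypothetical, and it assumes the signer’s public key is already in the local keyring and that `gpg` is on the PATH:

```python
import subprocess
import sys

def verify_signature(signature_path, package_path):
    """Verify a detached GPG signature over a package file.

    gpg exits with status 0 only when the signature is valid for the
    given file, so the return code is a usable pass/fail signal.
    """
    result = subprocess.run(
        ["gpg", "--verify", signature_path, package_path],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr

if __name__ == "__main__":
    # Usage: python verify_package.py package.tar.gz.sig package.tar.gz
    sig, pkg = sys.argv[1], sys.argv[2]
    ok, details = verify_signature(sig, pkg)
    print("signature OK" if ok else "signature FAILED")
    print(details)
```

Of course, this only proves the package is what the publisher signed – the harder problem, which the original article tackles, is proving that the published binary actually corresponds to the published source.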