Rackmount Improv HSM

This is a follow-up to Why I don't like smartcards, which I felt like writing after it got posted on HN.

CAcert was (and is) an attempt to create a community-run CA. However, it never managed to obtain the funding to perform the audits necessary to get included in OS and browser trust lists, and with the launch of Let's Encrypt its relevance is only going to decrease further.

I recall reading, years ago, on CAcert's website about the homemade HSM-like server they rigged up. While CAcert was never professionally audited to OS/browser standards as far as I am aware, the design itself seems valid.

As I recall, they set up a server with no network connection and connected it to a network-connected server only by an RS232 serial line. They then implemented a simple protocol (and thus one with a very limited attack surface) on this serial line, allowing operations to be performed with the key stored on the non-networked server. Essentially this provides the properties of an HSM, albeit without the physical tamper-proofing. But since the physical security of the server can presumably be assured by other means, this probably works well enough for the CAcert use case. (Of course, for a CA I'd hope they have procedures in place to enforce n-man access...)
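I don't know the details of CAcert's actual protocol, but below is a minimal sketch of what the non-networked side of such a design might look like. Everything in it is an illustrative assumption on my part: the framing, the single sign operation, the use of pyserial for the serial port, and an Ed25519 key via the Python cryptography package.

    # Sketch of the key-holding side: a serial-line key service with a
    # deliberately tiny parser. Framing: 1-byte opcode, 2-byte length,
    # payload. All protocol details here are invented for illustration.
    import struct
    import serial  # pyserial
    from cryptography.hazmat.primitives.asymmetric import ed25519

    MAX_PAYLOAD = 4096   # refuse anything larger; keeps parsing trivial
    OP_SIGN     = 0x01   # the only supported operation

    key = ed25519.Ed25519PrivateKey.generate()  # in reality, loaded from disk

    def read_exact(port, n):
        # Read exactly n bytes, raising on timeout rather than truncating.
        buf = port.read(n)
        if len(buf) != n:
            raise IOError("short read")
        return buf

    with serial.Serial("/dev/ttyS0", 115200, timeout=10) as port:
        while True:
            try:
                op, length = struct.unpack(">BH", read_exact(port, 3))
                if op != OP_SIGN or length > MAX_PAYLOAD:
                    port.write(b"\x00")  # error; a real implementation
                    continue             # would also resynchronise here
                payload = read_exact(port, length)
                sig = key.sign(payload)
                port.write(b"\x01" + struct.pack(">H", len(sig)) + sig)
            except IOError:
                continue  # drop malformed frames; never crash the key holder

The appeal is that the entire parser is a handful of lines: a fixed three-byte header, a length cap, and a single opcode. That is about as small as an attack surface gets, compared with a full networking stack.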

I recall hearing that Amazon adopts a similar model to store credit card information: a “card motel”. Cards check in; they don't check out. The card storage facility supports only a few operations, probably something like: add new card, charge card, delete card, and get last four digits. So that's at least two instances of a datacentre implementation of what essentially amounts to an improvised HSM, made viable by the ability to assure the physical security of the device.
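I have no knowledge of Amazon's actual interface, so the following sketch is purely illustrative (submit_to_payment_network is a hypothetical stand-in for the outbound charging path), but it shows the asymmetry that makes the motel model work: full card numbers go in, and only opaque tokens and last-four digits ever come out.

    # Illustrative "card motel": callers hold tokens, never card numbers.
    import secrets

    def submit_to_payment_network(pan: str, amount_cents: int) -> bool:
        return True  # hypothetical stand-in for the real charging path

    class CardMotel:
        def __init__(self):
            self._cards = {}  # token -> full card number; never exposed

        def add_card(self, pan: str) -> str:
            token = secrets.token_hex(16)
            self._cards[token] = pan
            return token                    # caller stores only the token

        def charge_card(self, token: str, amount_cents: int) -> bool:
            return submit_to_payment_network(self._cards[token], amount_cents)

        def delete_card(self, token: str) -> None:
            del self._cards[token]

        def last_four(self, token: str) -> str:
            return self._cards[token][-4:]  # the only part that checks out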

This strategy could also be adopted for password verification. We see password hashes leaked from websites so often nowadays that adopting a “password hash motel” might make sense.
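As a sketch, such a motel might expose just two operations, with the hashes themselves never leaving the device; here I assume scrypt from the Python standard library's hashlib, but the interface is the point, not the hash.

    # Illustrative "password hash motel": set and verify only; the only
    # answer that ever leaves the device is yes or no.
    import hashlib, hmac, os

    class PasswordMotel:
        def __init__(self):
            self._hashes = {}  # user id -> (salt, hash); never exposed

        def set_password(self, user: str, password: str) -> None:
            salt = os.urandom(16)
            h = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
            self._hashes[user] = (salt, h)

        def verify_password(self, user: str, password: str) -> bool:
            try:
                salt, expected = self._hashes[user]
            except KeyError:
                return False
            h = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1)
            return hmac.compare_digest(h, expected)

A compromise of the web tier then yields no hash corpus to crack offline; an attacker is reduced to online guessing against the motel, which can rate-limit.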

An Aside on the Use of Ethernet for Low-Attack-Surface Interfaces

For those thinking of replicating the design, I'd go with RS232 rather than Ethernet. It's low speed, but likely good enough for many cryptographic applications, and if you use Ethernet you are passing the requests and responses through a great deal of NIC firmware, NIC driver code in kernel space, and the networking stack of Linux (or whatever OS you're using). For the level of paranoia which should be implied by the term “HSM”, Ethernet as it is implemented today simply doesn't cut it. Even if you bypassed the OS networking stack and attached a userspace program directly to the NIC (projects which facilitate extremely fast packet processing via ring buffers mapped into userspace, such as netmap and DPDK, do things like this), you're still left with the issue of the NIC firmware.

And frankly, modern NIC firmware does too much, especially on server-class machines: TCP checksum offload, reading port numbers to load-balance incoming packets across multiple hardware receive queues (receive-side scaling), and so on. In other words, this firmware violates the demarcation of its own native OSI layer and examines higher-level data (which, it seems, practically every device nominally operating at some given OSI layer does nowadays). So long as firmware is examining the payload of an Ethernet frame, however trivially, I'm worried about its vulnerability, especially if it's closed source.

For an example of how worrying NIC firmware is, here's a story about an Intel NIC which could be rendered inoperative by a packet of death. The packet didn't even need to be deliberately formed; it occurred naturally during normal network usage: see this report.

Though they're not NICs, there are also things like “DSL modems” which nominally should be neutral carriers translating between media, yet which in practice inexplicably drop traffic with certain characteristics at the IP level; one incident involved a VDSL “modem” dropping certain kinds of IP traffic(!). The real concern is that those areas of the frames they forward should not be evaluated by the modem in the first place, nor loaded into CPU registers for any purpose other than memcpy(). It seems many of these supposedly transparent devices in fact run Linux, because it's easy to do, and simply bridge the interfaces together; this of course creates the risk that the Linux networking stack, in all its featureful complexity, will evaluate parts of frames unnecessarily. For a network device operating at some given layer, in a world of buggy firmware, every unnecessary examination of a payload byte is a liability and a potential breeding ground for bugs.

On Evaluation of Bytes

Although this really would be indulging me, it would be interesting if hardware vendors started specifying in their datasheets which bytes of a frame are and are not evaluated (even to the most trivial degree), and under what conditions. A relevant question is to what extent a device traditionally considered an OSI Layer 2 device, say an Ethernet switch, but which has the capability to do Layer 3 things (for the sake of example, the ability to filter packets by IP address), continues to load those bytes of the packet into registers even when such functionality is disabled.

That is, if a “Layer 2” switch which can drop frames by source IP address is configured to make no use of that functionality, does it still load the IP address and then (hopefully, if bug-free) make no use of it? The risk of bugs would be greatly mitigated if it simply never loaded those bytes into registers in the first place.
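To make the distinction concrete in software terms (real switches do this in silicon; this toy forwarder is purely my own illustration, not any vendor's design), a strict Layer 2 device would examine nothing beyond the 14-byte Ethernet header and treat everything after it as an opaque blob to be copied:

    # Toy illustration of strict Layer 2 forwarding: only the Ethernet
    # header is ever examined; the payload is copied, never parsed.
    ETH_HDR_LEN = 14  # dst MAC (6) + src MAC (6) + EtherType (2)

    def forward(frame: bytes, ports: dict, fdb: dict) -> None:
        if len(frame) < ETH_HDR_LEN:
            return                        # drop runt frames
        dst = frame[0:6]                  # the only bytes we examine
        out = fdb.get(dst)                # forwarding database: MAC -> port
        if out is not None:
            ports[out].send(frame)        # payload forwarded opaquely
        else:
            for port in ports.values():   # unknown destination: flood
                port.send(frame)          # (a real switch would exclude
                                          # the ingress port here)

An IP filtering feature, when enabled, would add a code path that reads bytes beyond offset 14; the property I'd like vendors to document is that when it is disabled, no such path executes at all.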

The issue with, shall we say, “overcompetent” network devices and NICs is concerning enough, given how common bugs are in software and especially firmware, that ideally I'd like to see network device vendors commit in published specifications (so, ultimately, legally, trade-wise) to not evaluating higher-layer payloads if, and only if, functionality requiring them is specifically enabled.

This is, of course, a fantasy. Pushing for open source, and thus auditable, NIC firmware is a more likely solution (which is not to say that it's likely to happen, but it is still orders of magnitude less improbable).