In the future, even your RAM will have firmware; and the subject of POWER10 blobs
Industry trends: serial memory attachment. One of the most interesting things likely to happen in computing in the near future is the adoption of serial memory interfaces. For pretty much the entire history of modern computing, RAM has been attached to a system via a high-speed parallel interface. Making parallel interfaces fast is hard and requires extremely rigorous control of timing skew between pins, so the routing of PCB traces between a CPU and RAM slots must be done with great precision. At the speeds of modern parallel RAM interfaces like DDR4, what is theoretically a digital interface must in practice be treated as an analog one (to the point that part of a CPU's DDR4 controller is called the “PHY”). Moreover, the maximum distance between a CPU and its RAM slots is extremely tight; the positioning of RAM slots on a motherboard is largely constrained by these physical considerations. For these reasons you have never seen anything like the flexibility with RAM attachment that you get with, for example, PCIe or SAS, which are serial architectures supporting cabling and even switching, allowing entire additional chassis of PCIe and SAS devices to be attached to a system via cables.
Parallel RAM attachment methods like DDR4 and DDR5, by comparison, are both inflexible and pushing the physics to the limit (it is unclear to me whether there will even be a DDR6). Due to the complexity of the analog concerns when running parallel interfaces at such high speeds, the size of a “PHY” IP block gets larger and larger with each successive iteration of DDR, taking up more room on a CPU's silicon die. For a CPU with eight memory channels, the amount of space taken up by DRAM controllers and PHYs is now substantial.
For these and other reasons, the move to serial DRAM attachment is being considered by industry, most notably by IBM. The idea is that a multi-lane serial interface, not unlike PCIe, would be used to attach DIMMs rather than a parallel interface. These are being called Differential DIMMs or “D-DIMMs”. Typically, these DIMMs would contain a serialization/deserialization chip sitting between the serial interface and DDR4 or DDR5 RAM; the parallel interface still exists, but only as an internal detail of the DIMM.
There are a number of advantages to this approach. First of all, it completely decouples CPUs from the type of DRAM they are designed to work with. A system accepting D-DIMMs produced today could be used with DDR4 D-DIMMs and then be upgraded to use DDR5 D-DIMMs, possibly even without any host firmware changes. The interface technology used by the actual DRAM chips themselves becomes an internal implementation detail, in much the same way that your computer doesn't care whether your graphics card uses GDDR6 or HBM2.
Moreover, adoption of D-DIMMs is expected to increase memory bandwidth for two reasons. Firstly, since a multi-lane serial interface requires far fewer pins than a parallel interface, package constraints become far less of a factor in determining how many memory channels can viably be broken out from a CPU. Secondly, the silicon footprint consumed by the serial lanes needed to achieve a given memory bandwidth is expected to be much smaller than the footprint needed to obtain the same performance from a DDR4 memory controller, leaving more space to be allocated either to more serial memory lanes, or to other functionality such as more cores. As I understand it, POWER10 (announcement slides here; recommended reading) is slated to have memory bandwidth equivalent to 16 DDR4 channels, while spending only a fraction of the silicon floorspace and package pins that would be required for 16 DDR4 controllers.
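The scale of the pin saving can be sketched with some back-of-the-envelope arithmetic. All of the figures below (per-channel pin counts and per-channel bandwidth) are illustrative assumptions on my part, not vendor specifications, but they show why a serial interface frees up so much of the package:

```python
# Rough comparison of CPU package signal pins needed to reach the same
# memory bandwidth via parallel DDR4 vs. a multi-lane serial interface.
# All figures are illustrative assumptions, not vendor specifications.

DDR4_CHANNEL_GBPS = 25.6   # DDR4-3200: 3200 MT/s * 8 bytes per transfer
DDR4_CHANNEL_PINS = 120    # assumed CPU-side signals: data, ECC, cmd/addr, clocks

SERIAL_CHANNEL_GBPS = 25.6 # assumed: one x8 serial channel at 25.6 GT/s
SERIAL_CHANNEL_PINS = 32   # 8 lanes * 2 directions * 2 wires (differential pair)

target_gbps = 16 * DDR4_CHANNEL_GBPS   # "16 DDR4 channels" of bandwidth

ddr4_pins = 16 * DDR4_CHANNEL_PINS
serial_channels = round(target_gbps / SERIAL_CHANNEL_GBPS)
serial_pins = serial_channels * SERIAL_CHANNEL_PINS

print(f"target bandwidth:  {target_gbps:.1f} GB/s")
print(f"DDR4 signal pins:   {ddr4_pins}")    # 1920
print(f"serial signal pins: {serial_pins}")  # 512
```

Under these assumed figures, serial attachment needs roughly a quarter of the signal pins for the same bandwidth, which is the kind of saving that makes 16-channel-equivalent bandwidth plausible on a single package.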
Another advantage of decoupling CPUs from the type of memory they are used with is that users could exercise choice in the kind of memory they use. For example, you could use DDR4 D-DIMMs with your system, or GDDR6 D-DIMMs for higher performance; the host system need not care. Until now this was not possible, because GDDR6 cannot be socketed (i.e., placed on a DIMM) while maintaining adequate signal integrity to the host; but since with D-DIMMs the GDDR6 interface runs only between the GDDR6 memory chips and the serialization chip on the same D-DIMM, this obstacle is removed. HBM2 D-DIMMs could presumably also be produced, though this would likely require the HBM2 memory to be integrated into the same package as the serialization chip.
Moreover, consider that with the emergence of NVDIMMs there is growing demand to attach nonvolatile storage devices via RAM interfaces. If the number of NVDIMMs which can be installed is limited to the number of DIMM slots in a server (minus those used for DRAM), this is very limiting, especially when compared with the ability to attach arbitrarily many traditional nonvolatile storage devices via PCIe or SAS expanders. Because a differential interface can be retimed, switched and most likely transmitted over cables, this creates the possibility of having RAM expander chassis attached to a system via cables, just as PCIe and SAS devices can be today, potentially enabling arbitrarily large amounts of DRAM and/or NVDIMMs to be attached to a system.
IBM appears to be leading this industry move with their POWER10 announcement. Specifically, IBM has announced that their POWER10 CPU will have an OMI serial memory interface, not a DDR4 or DDR5 interface. OMI (OpenPOWER Memory Interface) is a multi-lane serial memory attachment standard which has been released by IBM as an open standard. Clearly in anticipation that people may want to attach arbitrarily large amounts of e.g. NVDIMM storage to their POWER10 systems, IBM has also announced that these POWER10 CPUs will support a physical memory address space of up to 2 PiB. I doubt you could physically fit 2 PiB of NVDIMMs in one server chassis, so it seems likely we'll see RAM expander chassis from IBM once POWER10 systems start shipping.
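As a quick sanity check on that number, 2 PiB corresponds to a 51-bit physical address space; the module count below assumes hypothetical 512 GiB modules purely for illustration:

```python
# 2 PiB physical address space: how many address bits that implies, and how
# many modules it would span (512 GiB per module is an illustrative assumption).

PIB = 2 ** 50                                # one pebibyte, in bytes
phys_space = 2 * PIB                         # 2 PiB = 2**51 bytes

address_bits = phys_space.bit_length() - 1   # exact, since 2 PiB is a power of two
modules = phys_space // (512 * 2 ** 30)      # hypothetical 512 GiB modules

print(address_bits)   # 51
print(modules)        # 4096
```

Even at 512 GiB per module, filling that address space would take thousands of modules, far beyond what any single chassis can hold.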
(Those familiar with IBM systems will know that OMI didn't come out of nowhere. IBM has long had a habit of designing its servers with CPUs that do not attach to DRAM directly, but rather indirectly via memory buffer chips, which IBM calls Centaurs. In fact, POWER9 was unusual in that it marked the first time IBM shipped a POWER CPU supporting the direct attachment of DIMMs; POWER8, for example, requires all memory to be attached via Centaurs. Centaurs have the advantage of allowing more memory to be attached (the number of pins required for a DRAM channel limits how many channels can realistically fit onto a CPU package) and also provide an L4 cache. OMI and the D-DIMM standard, which as far as I know descend directly from the Centaur design, thus form the current iteration, and the first non-proprietary one, of a long history of IBM experimentation with indirect memory attachment.)
Necessary implications for firmware. It should be noted that this decoupling of host CPUs from specific DRAM technologies would appear to necessarily imply that D-DIMMs will have their own firmware. Since the host doesn't know anything about the underlying memory technology used — which could be DDR4, DDR5, GDDR6, or even a memory technology invented after the manufacture of the system — the initialization and training routines for the memory chips and parallel memory controllers used inside the D-DIMMs must necessarily be handled by the D-DIMM itself. In short, RAM modules will cease to be “passive” devices.
This has serious implications from the owner control perspective. In short, freeing a system and ensuring that all firmware running on it is open source software will now require RAM modules to be audited, replaced, or at least have their firmware reflashed. If D-DIMMs become commonplace, or an industry standard as prevalent as DDR4 is now, those wishing to use blob-free systems may need to go to even more trouble than they currently do. In the best case, vendors would ship open source firmware with their memory modules, but experience suggests such optimism is unlikely to be rewarded. In the second worst case, it may fall to individuals to replace firmware blobs via a long and arduous process of reverse engineering, if this happens at all.
Moreover, because RAM modules will now have firmware, they could also be infected with malware; a decidedly unthrilling new frontier. Short of the CPU itself, there is probably no component in a system whose compromise is less desirable, or more alarming. Even worse, the typical IT vendor is likely to respond to this security concern with “secure boot” functionality, which nine times out of ten will probably turn out to involve keyfusing chips to only run firmware signed by said vendor. I mentioned the second worst case above; the worst case is that not only do all RAM modules come to have proprietary firmware, but that they universally come to enforce vendor code signatures too, meaning that they cannot be deblobbed with any amount of effort. Moreover, in the wake of the devastating SolarWinds hack, in which a vendor was compromised precisely because of its control over its customers' systems, vendor-controlled keys have never been a greater liability to security.
Will Raptor ship a POWER10 system? Most people reading this article are probably aware of Raptor Computing Systems, a company producing POWER9 mainboards which are RYF-certified and free of any firmware blobs1.
Some comments were made recently by Raptor which have attracted attention:
“POWER10 just announced by IBM. Note we will NOT have POWER10 systems available in 2021, and cannot discuss further until P10 official release. POWER9 on Talos II / Blackbird will remain the best option for owner-controlled computing through at least 2022.” ([via Twitter](https://nitter.net/RaptorCompSys/status/1295364416469377026#m))
“We'd also like to point out that the delay is not on Raptor's side. That said, we have other interesting OpenPOWER HW in the pipleline [sic]...watch this space in 2021 for more info, and be sure to pick up a POWER9 system or @Integricloud VPS if you don't already have one!” ([via Twitter](https://nitter.net/RaptorCompSys/status/1295365443755155459#m))
On the POWER9-focused blog Talospace, a Raptor employee posted:
“There will probably be some exciting announcements for POWER hardware late this year / early next. Not POWER10 yet (IBM made some very poor choices regarding POWER10 that currently block our products and that we continue to work to resolve) but POWER overall is looking quite healthy for the future. For now, POWER9 is definitely the best way to go to get an open, owner-controlled, powerful system with long term support and tons of distro choices!”
“We're keeping it a bit under wraps at the moment while negotiations etc. are carried out, but suffice it to say any POWER10 systems from competitors in the interim will not meet the normal Raptor standards due to the causative IBM decisions.”
These comments have caused substantial discussion and speculation (Talospace, Phoronix). However, less attention has been drawn to some comments by an employee on the Raptor forums, which shed substantially more light on the matter:
“Basically, POWER10 is an academic curiosity at the moment due to some really bad management-level decisions at IBM. It's not unrecoverable, but we do want to set expectations of POWER9 remaining the highest performing POWER product for the 2021 / early 2022 timeframe. If, and this is a fairly large "if", anyone else besides IBM ships a POWER10 system in that timeframe, the reasoning behind our decision should become quite clear.
The best way to help owner controlled computing at the moment is to keep buying and using POWER9. The other announcements we have (which are not IBM dependent) should be very exciting, but POWER9 will remain king in terms of raw performance for now.
TLDR: POWER10 is not off the table by any means; we have every intention of creating a POWER10 product line, but there are complex negotiations in play to reach the point where those POWER10 products will be up to our high open firmware / open systems standards. Such negotiations always take time, hence the delays, though COVID19 has stretched them out several times longer than normal such that public product announcements are now being affected. Buying from another vendor in the interim, even if one exists, will not help and you definitely won't like what you actually get in the end from another vendor in comparison to the normal Raptor standards.”
A further comment adds:
“Nothing is signed, but IBM has failed to release required firmware components as open source software. We can't go much further into details other than to say we're working the problem, but until it is resolved we will not be able to manufacture a POWER10 system, nor would we recommend usage of the POWER10 processor.
There's another area where IBM made a very poor choice, but that will become more obvious once IBM releases more data on their own systems using the POWER10 device. That issue is also being worked by Raptor.”
In other words, the nature of the problem is that firmware blobs have somehow crept into the POWER10 design; this would definitely preclude fulfillment of “the normal Raptor standards”, and explains Raptor's current position. The only question that remains is: what is the nature of these blobs?
Investigation. Many have hypothesised that the answer to this question lies in POWER10's transition to the use of OMI. I believe I can now corroborate this hypothesis:
There is an OpenPOWER repository on GitHub, open-power/ocmb-explorer-fw, which has little to explain it.
Its README states:
“Source code for firmware running on Explorer OCMB
This repository contains the source code for the firmware that executes on the microprocessor embedded with the Explorer OMI Connected Memory Buffer (OCMB) chip.
Note that a full binary cannot be generated from this source. The corresponding binary image can be found in the release section.”
This is clearly talking about the firmware for an OMI chip as might be found on a D-DIMM. However, contrary to the README above, the repository contains no source code. Aside from the above README, there is an Apache 2.0 licence file and a file called “README-LICENSE-Microchip PM8596 Explorer Firmware_v3.pdf”.
PM8596 is the name of an OMI interface chip produced by Microchip. In fact, as far as I can tell, it is the only OMI interface chip being produced by anyone, anywhere. My strong suspicion is that IBM essentially transferred its Centaur IP to Microchip to have it produce this part and try to bootstrap an OMI ecosystem beyond just IBM.
Since it appears POWER10 is to be OMI-only in terms of RAM attachment (see the POWER10 announcement slides), we can probably infer that this is the only chip in existence which can be used to create POWER10-compatible memory modules (D-DIMMs).
Moreover, the licence file mentions something very interesting:
“Synopsys DDR Firmware
Licensee also agrees not to modify the Synopsys DDR Firmware. The binary Synopsys DDR Firmware may be used only in connection with Microchip integrated circuits that implement the Synopsys DDR PHY IP. Licensee will maintain the copyright, trademark, and other notices that appear on the Synopsys DDR Firmware, and reproduce such notices on all copies of the Synopsys DDR Firmware.”
This seems to me to be a smoking gun.
If you are familiar with work done in the FOSS community to procure devices with fully open firmware, the name Synopsys may ring a bell here. Synopsys is a company which sells IP blocks2. One of their IP block products is a DDR4 memory controller and PHY solution, which appears popular and which is found in a number of SoCs. It is however a known fact that Synopsys's DDR4 PHY requires a Synopsys-issued firmware blob to operate; therefore, chips which use the Synopsys DDR4 PHY inevitably have firmware blobs in their boot path3.
Conclusions. From the above it seems to me all but confirmed that:
- POWER10 must be used with the Microchip PM8596.
- The Microchip PM8596 uses a Synopsys DDR PHY to interface with DRAM.
- Initializing a Synopsys DDR PHY requires the use of a Synopsys-issued firmware blob.
In short, this has grave consequences for the viability of POWER10 as an owner-controlled, blob-free platform: it seems basically confirmed at this point that use of POWER10 would require use of these blobs, meaning that POWER10 now inherits the same firmware blob issues that plague many other platforms.
It is true that Microchip's product page for the PM8596 lists “open source firmware” among its bullet points, but this doesn't quite add up to me. Microchip would have no right to publish the source code of Synopsys's firmware blobs, and as I understand it licensees of Synopsys's DDR PHY do not receive source code to the blob anyway. To my knowledge, previous attempts to reason with Synopsys on this have proven wholly futile. My guess is that if any “open source” code does materialise for the PM8596, it will contain the Synopsys firmware blob. (Moreover, the fact that the above GitHub repository has yet to receive any source at all, but rather simply posts binaries in the releases section, is not encouraging.)
I think the information above is pretty conclusive, and in my view this is most likely what explains Raptor's current position. Given that the core premise of Raptor is to ship systems with 100% open-source, auditable firmware, any firmware blob involved in the boot process would be a dealbreaker for shipping POWER10, let alone one involved in something as security-sensitive as DRAM.
1. Aside from Broadcom NIC firmware, which is behind the IOMMU and in any case after extensive adventures in reverse engineering now has a working open source replacement. Raptor is currently testing this firmware (and as I understand it, is expected to switch to shipping all new systems with this firmware once they have done so). ⏎
2. “IP blocks” is a term used in the silicon industry to mean a licensable component of a silicon chip; it is essentially analogous to the term “library” in the software industry. In other words, Synopsys sells ready-made components which can be integrated into a chip design. ⏎
3. Probably the most notable SoC containing a Synopsys DDR4 memory controller and PHY is the NXP i.MX8M series, which was for example selected by Purism to form the heart of their Librem 5 phone. This was a strange choice for a product intending to have fully open source firmware, due to the aforementioned Synopsys firmware blob. Purism's response to this problem was an [astonishing blog post](https://puri.sm/posts/librem5-solving-the-first-fsf-ryf-hurdle/) in which it openly boasted about how it was going to game the FSF's RYF criteria to be able to ship this blob.
Someone did inform me informally that the RYF program intends to have none of this and will not certify the Librem 5, but this is just a rumour. Make of it what you will. ⏎