This month’s CES saw the introduction of top-speed DDR5 memory from SK Hynix. Micron and other vendors are also reportedly sampling similar devices. You can’t get them through normal channels yet, but since you also can’t get motherboards that take them, that’s not a big problem. We hear Intel’s Xeon Sapphire Rapids will be among the first platforms to take advantage of the new technology. But that raises the question: what is it?
Broadly speaking, there are two primary contenders for a system’s RAM: static and dynamic. There are newer technologies like FeRAM and MRAM, but the classic choice is between those two. Static RAM is just a bunch of flip-flops, one for each bit. That’s easy because you set it and forget it, then later you read it. It can also be very fast. The problem is that a flip-flop usually takes at least four transistors, and often as many as six, so there are only so many of them you can pack into a given area. Power consumption is often high, too, although modern devices can do pretty well.
So while static memory is popular in microcontrollers and other small devices, a PC or a server can’t practically pack in gigabytes of static memory. Dynamic memory just uses a little capacitor to store each bit. You still need a transistor to gate the capacitor on and off a common bus, but you can pack the cells in densely. Unfortunately, there’s a big problem: the capacitors discharge pretty quickly, so you have to refresh the memory periodically or it forgets. For example, a typical DDR2 module needs a complete refresh every 64 milliseconds.
Real devices arrange the capacitors in rows and columns, both to save space and to allow a refresh to operate on an entire row at once. That means a device with, say, 4096 rows needs to refresh one row every 15.6 microseconds to cover every row within the 64 millisecond window. The refresh itself takes just a few nanoseconds.
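The arithmetic is easy to check. A quick sketch, assuming the 64 millisecond retention window and 4096 rows from the example:

```python
# Back-of-envelope check of the per-row refresh interval quoted above.
# Assumption: one full pass over all rows inside the retention window.

RETENTION_MS = 64.0   # whole array must be refreshed within this window
ROWS = 4096           # rows in the example device

per_row_us = RETENTION_MS * 1000.0 / ROWS   # convert ms -> us, spread over rows
print(f"one row every {per_row_us:.1f} us")  # -> one row every 15.6 us
```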
In a typical array, there are row and column buses. Each capacitor connects to an FET that can switch it onto or off of a column bus; the gate of the FET connects to a row line. Asserting a row line selects an entire row of FETs at once. The long column bus has some capacitance and resistance, so it takes a little precharge time before the signal is stable, and then a multiplexer picks the bit off the correct column. Writing is just the reverse. You can play with a simulated line of dynamic memory if you want to get a feel for it.
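To get an intuition for why the refresh matters, here is a toy Python model of one row of cells. The leak rate and sense threshold are invented for illustration and don’t correspond to any real device:

```python
# Toy model of one row of DRAM cells with a simple linear charge leak.
# LEAK and THRESH are hypothetical values chosen to make the effect visible.

LEAK = 0.02      # fraction of full charge lost per time step (made up)
THRESH = 0.5     # sense amplifier decides 1 vs 0 at half charge

class DramRow:
    def __init__(self, bits):
        # a stored 1 is a full capacitor (1.0), a 0 is empty (0.0)
        self.charge = [1.0 if b else 0.0 for b in bits]

    def tick(self):
        # capacitors leak toward zero whether we use them or not
        self.charge = [max(0.0, c - LEAK) for c in self.charge]

    def read(self):
        # sensing is destructive in real DRAM: sense, then write back
        bits = [c > THRESH for c in self.charge]
        self.charge = [1.0 if b else 0.0 for b in bits]  # implicit refresh
        return bits

row = DramRow([1, 0, 1, 1])
for _ in range(20):          # 20 ticks of leakage, no refresh
    row.tick()
print(row.read())            # still intact: [True, False, True, True]

row2 = DramRow([1, 0, 1, 1])
for _ in range(30):          # wait too long: charge falls below threshold
    row2.tick()
print(row2.read())           # data lost: every 1 has decayed to a 0
```

Reading (or refreshing) in time restores full charge; waiting too long loses the bits for good.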
That’s dynamic RAM, or DRAM. But what about SDRAM? SDRAM is a dynamic RAM with a synchronous interface to a memory controller. The interface lets you queue up several commands at once, and the controller handles all the logic of driving the rows and columns and can even do the refresh step automatically. The controller buffers both commands and data to achieve higher bandwidth than is possible with many other technologies.
SDRAM goes back to 1992, and by 2000 had driven most other forms of DRAM out of the market. JEDEC, an industry group, standardized the interface for SDRAM in 1993, so you generally don’t have problems using different brands of memory.
Normal SDRAM can accept one command and transfer one word of data per clock cycle. Eventually, JEDEC defined a double data rate, or DDR, standard. This still accepts one command per cycle but reads or writes two words in that same clock cycle: one data word on the rising edge of the synchronous clock and the other on the falling edge. In practice, this means the internal read for one command is two words, which allows the internal memory array to cycle at half the external data rate. So if the bus moves 400 megatransfers per second, the internal array only needs to produce one two-word chunk per 200 MHz clock.
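The resulting peak transfer rate is simple to compute. A sketch, assuming a standard 64-bit (8-byte) module bus:

```python
# Peak transfer rate for a double-data-rate interface: two words per I/O
# clock, one on each clock edge. A 64-bit (8-byte) module bus is assumed.

def ddr_peak_gbs(io_clock_mhz, bus_bytes=8):
    transfers_per_sec = io_clock_mhz * 1e6 * 2   # both clock edges carry data
    return transfers_per_sec * bus_bytes / 1e9   # bytes/sec -> GB/s

print(ddr_peak_gbs(200))    # 200 MHz I/O clock -> 3.2 GB/s (first-gen DDR)
print(ddr_peak_gbs(1600))   # DDR4-3200 (1600 MHz I/O clock) -> 25.6 GB/s
```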
This worked so well that they invented DDR2, which reorganized the RAM to be four words wide internally and then sends or receives those four words in a burst. Of course, the internal clock speed didn’t rise to match, so latency increases. DDR3 doubled the internal fetch width again, to eight words, with a corresponding increase in latency.
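One way to see what the growing prefetch buys: the memory core only has to cycle at the external data rate divided by the prefetch depth. A sketch using a few representative speed grades:

```python
# The DRAM core runs at (data rate / prefetch), fetching `prefetch` words
# side by side each core cycle. Speed grades below are representative picks.

def core_clock_mhz(data_rate_mts, prefetch):
    return data_rate_mts / prefetch

for gen, rate, prefetch in [("DDR-400", 400, 2),
                            ("DDR2-400", 400, 4),
                            ("DDR3-800", 800, 8)]:
    print(gen, core_clock_mhz(rate, prefetch), "MHz core clock")
```

Note how the core clock stays in the same slow 100-200 MHz ballpark across generations; the prefetch is what lets the external bus keep speeding up anyway.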
DDR4 was a departure. It did not double the internal memory bus but interleaved access between internal memory banks to increase throughput. A lower voltage also allows higher clock speeds. DDR4 appeared around 2012 although it didn’t reach critical mass until 2015.
It sounds like memory bandwidth is on the rise, right? Sort of. The increase in bandwidth has tracked — more or less — with the rise in multicore processors. So while the raw bandwidth has increased, the bandwidth per core in a typical machine hasn’t changed much in a long time. In fact, with the rapid expansion of core count in a typical CPU, the average is in decline. So it is time for a new standard.
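The per-core squeeze is easy to illustrate with rough numbers. The machine configurations below are illustrative guesses, not survey data:

```python
# Divide peak memory bandwidth by core count to see why raw bandwidth
# gains feel smaller than they look. Configurations are illustrative only.

configs = [
    ("single-core machine, DDR2 @ 8.5 GB/s", 8.5, 1),
    ("quad-core machine, DDR3 @ 17 GB/s", 17.0, 4),
    ("16-core machine, DDR4 @ 25.6 GB/s", 25.6, 16),
]
for label, gbs, cores in configs:
    print(f"{label}: {gbs / cores:.2f} GB/s per core")
```

Total bandwidth triples, but the share available to each core shrinks by a factor of five.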
Now we have DDR5, which JEDEC began defining in 2017 and finalized in 2020. According to reports, a DDR5-3200 SDRAM will have 1.36 times the bandwidth of a DDR4-3200, and later speed grades will go higher. We hear, too, that the prefetch size will double again, at least optionally.
Type   Peak bandwidth   Voltage   Prefetch   Year
SDR    1.6 GB/s         3.3 V     1          1993
DDR    3.2 GB/s         2.5 V     2          2000
DDR2   8.5 GB/s         1.8 V     4          2003
DDR3   17 GB/s          1.5 V     8          2007
DDR4   25.6 GB/s        1.2 V     8          2012
DDR5   32 GB/s          1.1 V     8/16       2020
As you can see from the table above, bandwidth is up by a factor of 20 over the original SDR memory in under three decades. Not bad. The 16-word prefetch is especially interesting, since it lets the chip fill a typical CPU cache line in a single burst.
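The cache-line fit falls out of DDR5’s channel layout: the 64-bit module bus is split into two independent 32-bit subchannels, so a 16-transfer burst on one subchannel moves exactly 64 bytes, the size of a typical x86 cache line:

```python
# Why a 16-word burst maps neatly onto a CPU cache line. DDR5 splits the
# 64-bit module bus into two independent 32-bit subchannels.

SUBCHANNEL_BITS = 32
BURST_LENGTH = 16          # transfers per burst with the 16n prefetch
CACHE_LINE_BYTES = 64      # typical x86 cache line

burst_bytes = SUBCHANNEL_BITS // 8 * BURST_LENGTH
print(burst_bytes)                       # -> 64
print(burst_bytes == CACHE_LINE_BYTES)   # -> True: one burst, one line
```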
There are some other advantages, too. For example, if you’ve ever tried to interface an SDRAM with your circuitry or FPGA design, you’ll appreciate the loopback mode. If you are a memory hog, the top size for modules is now 64 GB, up from 16 GB.
By the way, there is an LP-DDR5 spec, a low-power version for things like cellphones. That specification was released last year, and we haven’t heard of a big rush of products in that area yet. The LP-DDR4 spec allowed two frequency scaling points so you could trade speed for power; LP-DDR5 has three. Then there are the GDDR standards (up to GDDR6, the last time we checked) for graphics processing and other high-performance applications. For perspective, LP-DDR5 clocks in at 6.4 Gb/s per I/O pin, while GDDR6 boasts hundreds of GB/s of total bandwidth, depending on the bus width.
SO WHAT NOW?
Unless you are running a busy server or something else that loads up all the cores of your CPU, you aren’t going to feel much practical difference between DDR4 and DDR5. But then again, who doesn’t want good benchmarks?
Besides, for a typical workstation load, the real trick is to have enough RAM that you don’t need to hit the disk very often. This is especially true if you have a rotating platter disk, which is notoriously slow. The time spent reading and writing RAM turns out not to be the long pole in the tent for real-world performance. With solid-state disks this isn’t as bad as it used to be, but typical solid-state drive throughputs are still a bit slower than DDR3, even though faster drives are probably on the horizon. So unless you are doing very intense multicore workloads, you are probably better off with 32 GB of DDR3 than 4 GB of DDR5, because the larger memory keeps you from falling back to far slower storage as often.
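This “capacity beats speed” argument is just the classic average-access-time calculation. A toy model with rough, assumed latency numbers:

```python
# Toy average-access-time model. Latencies are rough orders of magnitude,
# assumed for illustration, not measurements of any particular hardware.

RAM_NS = 100.0           # ballpark DRAM access
SSD_NS = 100_000.0       # ballpark solid-state disk access

def avg_access_ns(ram_hit_rate, miss_penalty_ns):
    # hits are served from RAM; misses have to page in from the disk
    return ram_hit_rate * RAM_NS + (1 - ram_hit_rate) * miss_penalty_ns

# Plenty of RAM (99.9% of accesses hit) vs. too little RAM (90% hit):
print(avg_access_ns(0.999, SSD_NS))   # roughly 200 ns on average
print(avg_access_ns(0.900, SSD_NS))   # roughly 10,000 ns on average
```

A modest drop in hit rate swamps any plausible difference in raw RAM speed, which is why the extra gigabytes usually win.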