Just as we’ve seen some interesting new CPU ideas coming out of China, some of which are clearly intended to chase Intel at least in the long term, the old guard has shipped the last examples of its once-vaunted pretender: Intel is no longer distributing its Itanium series CPUs.
In a practical sense, the end came some time ago. The last Itaniums were the 9700 series, codenamed Kittson, which was rumoured as far back as 2012 and announced officially in 2017. The fact that it remained a shipping product until recently – the best part of five years after launch, and nearly a decade after it was first mooted – suggests that Itanium had long been lagging behind the otherwise brisk CPU development of the 2010s. Really, though, the writing had been on the wall for even longer than that. HP had been Intel’s only major Itanium customer for a while, and even before Kittson, in 2011, confidence in Itanium was so low that HP had to sue Oracle into upholding an agreement between the two companies that Oracle would continue supporting the CPU.
Itanium existed for much the same reason as reduced instruction set (RISC) CPUs: to head off an anticipated performance bottleneck in CPU design. The ideas behind Itanium were based on research done by HP in the late 80s. While RISC tries to make the CPU simpler and therefore faster, the thinking which spawned Itanium was called EPIC, for Explicitly Parallel Instruction Computing, in which a single processor executes more than one instruction at once. If that sounds a lot like the modern approach of multi-core CPUs, it actually works at a much more fundamental level – on individual instructions within a single core – though it shares at least some of the same theoretical concerns.
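As a rough illustration of the idea – a minimal sketch in C, not real Itanium code – the two additions below read and write entirely separate variables, so they could safely be carried out at the same time, and under EPIC it would be the compiler’s job to say so explicitly:

#include <stdio.h>

int main(void) {
    int a = 1, b = 2, c = 3, d = 4;
    int x = a + b;  // these two additions share no data,
    int y = c + d;  // so they could execute simultaneously
    printf("%d %d\n", x, y);
    return 0;
}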
The problem with doing lots of things at once is that the work done by one instruction might depend on the outcome of a previous instruction. Often that’s easy to determine. For instance, lots of software contains loops, to be repeated a certain number of times. To do that, the computer keeps a counter of how many times the code has looped: each time around, it runs whatever code is inside the loop, adds one to the counter, then compares the counter with the desired number of loops, going round again if it needs to. In many modern programming languages, it looks something like this, where we multiply the numbers a and b together by adding b to an accumulator a times, counting the loops with the number i:
a = 10;      // how many times to loop
b = 5;       // the number to add each time
result = 0;  // the accumulator
for (i = 0; i < a; i = i + 1) {
    result = result + b;  // add b once per trip round the loop
}
In this example, we multiply 10 by 5, and the result ends up being 50.
This means that every time we run the loop, we know for a fact that we’ll have to add one to the loop counter (i = i + 1) and then compare it with the desired number of loops (i < a). Ideally, we could actually do that addition and comparison at the same time as we’re running the code inside the loop. That’s fine, until the code inside the loop decides, for example, to modify the number of times the loop has to run, by changing a or i. Yes, coders, I know it’s bad form to do that, but it’s just an example. At that point we can’t check i < a while the loop code is running, because we don’t know what the value of a or i will be.
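To make that concrete, here’s a deliberately contrived variation on the earlier example – a sketch for illustration only. Because the loop body can now change a, the i < a comparison has to wait for the body to finish before it can be checked:

#include <stdio.h>

int main(void) {
    int a = 10, b = 5, result = 0;
    for (int i = 0; i < a; i = i + 1) {
        result = result + b;
        if (result > 20) {
            a = 7;  // the body changes a, so the i < a test now depends
        }           // on this iteration's code having finished first
    }
    printf("%d\n", result);  // prints 35, not the 50 the original loop gave
    return 0;
}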
For the approach proposed by EPIC to work, this sort of situation has to be detected and handled before the software ever runs. For any modern CPU, that’s part of the job of the compiler, the piece of software that turns the somewhat-understandable language we see in the example above, written by a human, into a list of processor instructions, which looks like a column of very large numbers. The example we’ve used here is very simple, but in real-world software this sort of analysis is notoriously difficult, and a great deal of theoretical computer science has gone into working out how data flows through code.
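One classic reason the analysis is so hard is pointers: the compiler often can’t prove that two operations touch different pieces of memory. In the C sketch below – the function double_into and everything in it is invented purely for illustration – each trip round the loop is only independent of the others if the two pointers never overlap:

#include <stdio.h>

// Hypothetical example: double each of the n source values into dst.
void double_into(int *dst, int *src, int n) {
    for (int k = 0; k < n; k = k + 1) {
        dst[k] = src[k] * 2;  // if dst and src overlap, one iteration
                              // changes the input of a later one, so the
                              // compiler can't safely run them in parallel
    }
}

int main(void) {
    int data[5] = {1, 2, 3, 4, 5};
    double_into(data + 1, data, 4);  // overlapping destination and source
    for (int k = 0; k < 5; k = k + 1) {
        printf("%d ", data[k]);  // prints 1 2 4 8 16, not the 1 2 4 6 8
    }                            // that independent iterations would give
    printf("\n");
    return 0;
}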
Whether this issue is really what made Itanium a lame duck is debatable. Donald Knuth, the great computer scientist, once described that sort of compiler as “basically impossible to write,” although it’s worth pointing out that the quote comes from a statement in which Knuth speaks out stridently against multi-core processors in general. Of course, EPIC and Itanium are not the same thing as a multi-core Intel 64 or ARM chip, and the problems aren’t exactly the same, but there are points of comparison, and plenty of people would argue with Knuth’s wider analysis. Certainly, though, Itanium was complicated to make, and compiling efficient code for it proved difficult.
The result was that the first Itanium limped to market in 2001, three years late and a let-down in terms of performance. Subsequent models improved things hugely, and there were Itanium workstations from the likes of SGI to augment the core server market, but the biggest competitor to Intel’s Itanium was often Intel’s own x86, which made little commercial sense. When AMD released the Opteron in 2003, bringing the 64-bit x86-64 instruction set to servers and workstations while maintaining backward compatibility with 32-bit x86 code, Intel was more or less forced to introduce the compatible Intel 64 extension with 2004’s Xeon releases. Itanium spent the rest of its life struggling to differentiate itself meaningfully from big Xeons, and by 2008 HP was paying Intel nearly half a billion dollars to keep making CPUs for its Itanium servers.
It’s hard to take any pleasure in something ending, and there’s a grain of truth in the suggestion that the x86 architecture was perhaps not an ideal long-term bet. Whether Itanium was ever a realistic answer to that problem is obscured by its performance problems and the way it was marketed; certainly, now, we’ll never know.