Monday, 26 August 2013

Setting HSAIL: AMD explains the future of CPU/GPU cooperation

AMD HSAIL


AMD’s heterogeneous system architecture (HSA) initiative has been a long-running effort, stretching back to the company’s first talk of “Fusion” processors in 2007. Today, at the Hot Chips computing technology conference, the company gave a talk laying out the details of what its HSA Foundation has designed and the language that powers the technology, dubbed HSAIL (HSA Intermediate Language).
It’s best to start with a basic overview of the problem. Despite the popularity of OpenCL and Nvidia’s investment of hundreds of millions of dollars into its Tesla products and CUDA software, the task of moving work from the CPU to the GPU, performing it, and bringing the results back again is still a giant headache. The simplest explanation for the problem is this: for most of computing history, the trend was to move tasks to the Central Processing Unit, which would then perform them. Gaming is virtually the only workload that has resisted this tendency (moving a GPU on-die is not the same thing as programming a game to run on the CPU).
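To make that headache concrete, here is a minimal C++ sketch of the round trip described above. The gpu_* names are hypothetical stand-ins, stubbed out so the example is self-contained; a real driver stack (OpenCL, CUDA, and so on) would move the data across PCIe instead. The point is the pattern: copy in, launch, copy back, every single time.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical device-buffer handle and API stubs, standing in for a real
// GPU runtime. Here the "device memory" is just another host allocation.
struct GpuBuffer { std::vector<float> storage; };

GpuBuffer gpu_alloc(std::size_t n) { return GpuBuffer{std::vector<float>(n)}; }

void gpu_copy_in(GpuBuffer& dev, const float* src, std::size_t n) {
    std::memcpy(dev.storage.data(), src, n * sizeof(float));
}

void gpu_copy_out(float* dst, const GpuBuffer& dev, std::size_t n) {
    std::memcpy(dst, dev.storage.data(), n * sizeof(float));
}

void gpu_launch_scale(GpuBuffer& dev, float k) {
    for (float& x : dev.storage) x *= k;  // stand-in for the GPU kernel
}

int main() {
    std::vector<float> data(1 << 20, 1.0f);

    GpuBuffer dev = gpu_alloc(data.size());        // 1. allocate a separate device copy
    gpu_copy_in(dev, data.data(), data.size());    // 2. ship the data to the GPU
    gpu_launch_scale(dev, 2.0f);                   // 3. run the kernel
    gpu_copy_out(data.data(), dev, data.size());   // 4. ship the results back

    // Every hop costs latency and bandwidth, which is why small or
    // frequently bouncing workloads rarely benefit from offload today.
    return 0;
}
```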
SoC design
After decades of moving workloads towards the CPU, designing a system that moves them back out again and makes the GPU an equal partner is a complex undertaking. The hardware side of HSA compatibility addresses this problem by specifying a number of capabilities that a combined CPU-GPU system must have in order to leverage heterogeneous compute: the CPU and GPU must share a common set of page table entries, both must be able to take page faults within the same address space, the system must be able to queue commands for execution on the GPU without requiring the OS kernel to perform the task, the GPU must be capable of switching tasks independently, and both devices must be able to address the same coherent block of memory.
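By way of contrast with the copy-based flow shown earlier, here is a hedged sketch of what those hardware guarantees are meant to enable: the GPU works directly on the same pointer the CPU allocated, with no staging copies. The hsa_dispatch_scale function is a hypothetical placeholder, stubbed on the CPU here, not an actual HSA runtime call.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a GPU kernel queued from user mode and operating on shared,
// coherent memory. Imagine the loop body running on the GPU's compute units.
void hsa_dispatch_scale(float* data, std::size_t n, float k) {
    for (std::size_t i = 0; i < n; ++i) data[i] *= k;
}

int main() {
    std::vector<float> data(1 << 20, 1.0f);

    // No gpu_alloc, no copy_in, no copy_out: the same page table entries and
    // virtual addresses are visible to both devices, either one can take a
    // page fault, and the memory they touch stays coherent.
    hsa_dispatch_scale(data.data(), data.size(), 2.0f);
    return 0;
}
```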
HSAIL is designed to address the software side of the equation.

HSAIL: It’s not an API

This is a big enough point of confusion that I want to address it early on. HSAIL is an intermediate language: it is generated from the programmer’s source code and translated to the hardware vendor’s native ISA at runtime. It’s the secret sauce that allows multiple vendors, including Imagination, ARM, AMD, and Qualcomm, to benefit from the technology even though they each have very different GPU hardware. The idea is that you write code in your language of choice (C++ AMP, OpenCL, Java, and Python are all listed), and that code is then compiled to target HSAIL and run on whatever GPU is integrated into the system.
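As a rough illustration of that flow, the sketch below models the two-stage path: a front end produces portable HSAIL once, and a vendor finalizer maps it to the native ISA of whatever GPU is present at runtime. Both functions here are hypothetical stand-ins, stubbed so the example runs; they are not the real compiler or finalizer interfaces.

```cpp
#include <iostream>
#include <string>

// Stand-in for a front-end compiler: lowers a kernel to vendor-neutral HSAIL.
std::string compile_kernel_to_hsail(const std::string& kernel_name) {
    return "hsail(" + kernel_name + ")";
}

// Stand-in for a vendor finalizer: maps HSAIL to a device's native ISA.
std::string finalize_for_device(const std::string& hsail, const std::string& isa) {
    return isa + " <- " + hsail;
}

int main() {
    // One HSAIL blob, produced once from the programmer's source language...
    const std::string hsail = compile_kernel_to_hsail("vector_add");

    // ...finalized at runtime for whichever GPU the system actually contains.
    for (const std::string isa : {"GCN", "Adreno", "Mali", "PowerVR"}) {
        std::cout << finalize_for_device(hsail, isa) << "\n";
    }
    return 0;
}
```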
HSA Stack
The advantage of HSAIL, according to AMD, is that it won’t require programmers to learn whole new languages. If you’re familiar with OpenCL, use OpenCL. There may still be some overlap between HSAIL’s capabilities and what OpenCL 2.0 supports, but HSAIL is explicitly designed to simplify GPU programming in some critical ways. It also opens up the possibility of accelerating languages like Java on the GPU, though again, this does require that Java itself be capable of mapping well to a graphics card. This hydra has a number of heads.
The central idea, as shown above, is that an HSAIL-capable hardware block doesn’t need to be x86-compatible, based on GCN, or tied to any other specific architecture. That means Imagination can run code just as well as Qualcomm, provided each company writes its own drivers. Either way, the burden doesn’t fall on the programmer, and that’s a major advantage.

What about gaming?

This is a rather complex question. What we call “gaming” is actually an incredibly complex flow of data between the CPU, GPU, main memory, and attached storage. CPUs and GPUs have always had to communicate, but for most of history that communication has been asynchronous and fast in only one direction. Historically, CPU-GPU communication has been rather lopsided, as the chart below demonstrates.
AMD Llano bandwidth
That chart shows Llano’s bandwidth when the CPU and GPU access various types of memory. It’s lopsided because it reflects the imbalance between typical CPU and GPU bandwidths to different parts of main memory. These connections can be beefed up and their latency reduced; the point of showing them here is that they broadly illustrate the status quo developers have been working with for decades. Games have historically been designed to run well in a particular type of configuration. HSA has the potential to change that, but software development always lags hardware.
CPU - GPU collaboration
That doesn’t mean HSA won’t be important or that it can’t boost game performance. One of the areas AMD highlights is that, historically, while GPUs have been used to accelerate and improve game physics, most of that processing has been strictly cosmetic. Nvidia’s PhysX allows for gorgeous displays of additional eye candy, but that eye candy doesn’t affect the actual gameplay. AMD’s presentation makes the point that in-game physics is fundamentally a compute problem, and HSA can conceivably be leveraged to create much stronger experiences.
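As a loose illustration of why that matters, the sketch below has gameplay logic reading physics results in the same frame they are produced, the kind of feedback loop that shared, coherent memory makes cheap. The gpu_step_physics function is a hypothetical stand-in running on the CPU; a real implementation would dispatch the work to the GPU and read the results from the same shared buffer.

```cpp
#include <vector>

struct Body { float x, y, vx, vy; };

// Stand-in for a GPU physics kernel writing into shared memory.
void gpu_step_physics(std::vector<Body>& bodies, float dt) {
    for (Body& b : bodies) { b.x += b.vx * dt; b.y += b.vy * dt; }
}

// CPU-side gameplay rule that depends on the physics results, not just on
// cosmetic particles: did anything cross the finish line this frame?
bool crossed_finish_line(const std::vector<Body>& bodies, float line_x) {
    for (const Body& b : bodies)
        if (b.x > line_x) return true;
    return false;
}

int main() {
    std::vector<Body> bodies(1024, Body{0.0f, 0.0f, 1.0f, 0.0f});

    bool finished = false;
    for (int frame = 0; frame < 600 && !finished; ++frame) {
        gpu_step_physics(bodies, 1.0f / 60.0f);        // "GPU" updates positions
        finished = crossed_finish_line(bodies, 5.0f);  // CPU reacts the same frame
    }
    return finished ? 0 : 1;
}
```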
There are going to be benefits to HSA in gaming, but data suggests that these benefits may take time to emerge — physics engines have to be designed to pass data back and forth, HSAIL needs to ship, and moving to a new programming model is going to take time. For now, the focus is on using HSAIL for compute tasks, which is why most of the companies that have announced HSA support are focused on high performance computing.
Making GPUs more programmable and usable has implications for mobile and for complicated tasks like facial recognition and natural language processing. AMD’s goal with HSAIL and HSA is to provide a common framework that can accelerate a wide range of tasks, but the road to gaming adoption may be more complex than it is for other areas, where CPUs and GPUs have virtually no history of data sharing and the goal is simply to let the GPU be leveraged at all.
