With DirectX 12 coming soon with Windows 10, VR technology ramping up from multiple vendors, and the Vulkan API already debuted, it’s an exceedingly interesting time to be in PC gaming. AMD’s GCN architecture is three years old at this point, but certain features baked into the chips at launch (and expanded with Hawaii in 2013) are only now coming into their own, thanks to the improvements ushered in by next-generation APIs.
One of the critical technologies underpinning this argument is the set of Asynchronous Command Engines (ACEs) that are part of every GCN-class video card. The original HD 7900 family had two ACEs per GPU, while AMD’s Hawaii-class hardware raised that to eight.
AMD’s Graphics Core Next (GCN) GPUs are capable of asynchronous execution to some degree, as are Nvidia GPUs based on the GTX 900 “Maxwell” family. Previous Nvidia hardware based on Kepler, including the GTX Titan, was not.
What’s an Asynchronous Command Engine?
The ACE units inside AMD’s GCN architecture are designed for flexibility. The chart below explains the difference: instead of being forced to execute a single queue in a predetermined order, even when it makes no sense to do so, tasks from different queues can be scheduled and completed independently. This gives the GPU some limited ability to execute tasks out of order. If the GPU knows that a time-sensitive operation that needs only 10ns of compute time is in the queue alongside a long memory copy that isn’t particularly time-sensitive but will take 100,000ns, it can pull the short task, complete it, and then run the longer operation.
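To make the 10ns versus 100,000ns example concrete, here is a toy sketch in Python. It is not how a GPU scheduler is actually implemented; the `Task` class, the `urgent` flag, and both function names are hypothetical and exist only to show how long the short task waits under each policy.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    duration_ns: int
    urgent: bool = False  # hypothetical flag marking time-sensitive work


def finish_times_in_order(tasks):
    """Run tasks strictly in submission order; return {name: finish time in ns}."""
    clock = 0
    finished = {}
    for task in tasks:
        clock += task.duration_ns
        finished[task.name] = clock
    return finished


def finish_times_urgent_first(tasks):
    """Let the scheduler pull urgent tasks ahead of the rest.

    sorted() is stable, so tasks keep their submission order within each group.
    """
    reordered = sorted(tasks, key=lambda t: not t.urgent)
    return finish_times_in_order(reordered)


# The long memory copy was submitted first, the short compute job second.
tasks = [
    Task("memory_copy", 100_000),
    Task("short_compute", 10, urgent=True),
]

in_order = finish_times_in_order(tasks)
reordered = finish_times_urgent_first(tasks)

print(in_order["short_compute"])   # 100010
print(reordered["short_compute"])  # 10
print(reordered["memory_copy"])    # 100010
```

In this simplified model the memory copy finishes only 10ns later than it otherwise would, while the time-sensitive task finishes roughly 10,000 times sooner.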
The point of using ACEs is that they allow the GPU to process and execute multiple command streams in parallel. In DirectX 11, this capability wasn’t really accessible: the API was heavily abstracted, and multiple developers have told us that multi-threading support in DX11 was essentially broken from Day 1. As a result, there’s been no real way to tell the graphics card to handle graphics and compute in the same workload.
AMD’s original GCN hardware may have debuted with just two ACEs, but AMD claims that it added six ACE units to Hawaii as part of a forward-looking plan, knowing that the hardware would one day be useful. That’s precisely the sort of thing you’d expect a company to say, but there’s some objective evidence that Team Red is being honest. Back when GCN and Nvidia’s Kepler were going head to head, it quickly became apparent that while the two companies were neck and neck in gaming, AMD’s GCN was far more powerful than Nvidia’s GK104 and GK110 in many GPGPU workloads. The comparison was particularly lopsided in cryptocurrency mining, where AMD cards were able to shred Nvidia hardware thanks to a more powerful compute engine and hardware support for some of the operations used in SHA-256 hashing.
When AMD built Kaveri and the SoCs for the PS4 and Xbox One, it included eight ACEs in those chips as well. The thinking behind that move was that adding more asynchronous compute capability would allow programmers to use the GPU’s computational horsepower more effectively. Physics and certain other types of in-game calculations, including some of the work that’s done in virtual reality simulation, can be handled in the background.
AMD’s argument is that with DX12 (and Mantle / Vulkan), developers can finally use these engines to their full potential. In the image above, the top pipeline is the DX11 method of doing things, in which work is mostly being handled serially. The bottom image is the DX12 methodology.
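The difference between the two pipelines can be sketched with a little arithmetic. The snippet below is an idealized model, not a measurement: the per-queue workloads in `frame_work_us` are made-up numbers, and the overlapped case assumes the queues never compete for the same execution units, which real hardware does not guarantee.

```python
# Hypothetical per-frame workloads in microseconds, one list per queue type.
frame_work_us = {
    "graphics": [4000, 3500],  # e.g. shadow pass, main pass
    "compute": [1500, 1000],   # e.g. physics, post-processing
    "copy": [800],             # e.g. texture upload
}


def serial_frame_time(queues):
    """Serial model: every task waits for all the tasks ahead of it."""
    return sum(sum(tasks) for tasks in queues.values())


def overlapped_frame_time(queues):
    """Idealized concurrent model: queues run side by side,
    so the longest queue alone sets the frame time."""
    return max(sum(tasks) for tasks in queues.values())


serial = serial_frame_time(frame_work_us)
overlapped = overlapped_frame_time(frame_work_us)

print(serial)      # 10800
print(overlapped)  # 7500
```

The overlapped figure is a best case. In practice graphics and compute work share the same shader hardware, so the real gain depends on how much of the GPU the graphics queue leaves idle.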
Whether programmers will take advantage of these specific AMD capabilities is an open question, but the fact that both the PS4 and Xbox One have a full set of ACEs to work with suggests that they may. If developers are writing the code to execute on GCN hardware already, moving that support over to DX12 and Windows 10 is no big deal.
Right now, AMD has only released information on the PS4’s use of asynchronous shaders, but that doesn’t mean the Xbox One can’t use them as well. It’s possible that the DX12 API push that Microsoft is planning for that console will add the capability.
AMD is also pushing ACEs as a major feature for its LiquidVR platform, a fundamental capability that it claims will give Radeon cards an edge over their Nvidia counterparts. We’ll need to see final hardware and drivers before drawing any such conclusions, of course, but the compute capabilities of the company’s cards are well established. It’s worth noting that while AMD did have an advantage in this area over Kepler, which had only one compute and one graphics pipeline, Maxwell has one graphics pipeline and 32 compute pipes, compared to just eight AMD ACEs. Whether this affects performance in shipping titles is something we’ll only be able to answer once DX12 games that specifically use these features are on the market.
The question, from the end-user perspective, obviously boils down to which company is going to offer better performance (or price/performance ratio) in the next-generation DX12 API. It’s far too early to make a determination on that front — recent 3DMark 12 benchmarks put AMD’s R9 290X out in front of Nvidia’s GTX 980, while Star Swarm results from earlier this year reversed that result.
What is clear is that DX12 and Vulkan are reinventing 3D APIs and, by extension, game development in ways we haven’t seen in years. The new capabilities of these frameworks are set to improve everything from multi-GPU configurations to VR displays. Toss in features like 4K monitors and FreeSync / G-Sync support, and it’s an exciting time for the PC gaming industry.