HDR and the dwindling technical superiority of PC gaming

There has been a rather interesting shift over the past few years that has gone largely unnoticed by tech and gaming enthusiasts: I don’t believe that PC gaming represents a clearly superior technical proposition compared to console gaming anymore.

Three big changes took place to alter the technical gaming landscape: HDR, W-OLED, and the Xbox One X. To make my case, I will first elaborate on each of these.

HDR is deeply underappreciated

High dynamic range display essentially consists of two components: greater contrast, and new math. ST 2084 is the SMPTE’s standardized form of PQ, Dolby’s perceptual quantizer, an electro-optical transfer function (EOTF) that maps video levels to light output and replaces traditional gamma. If you don’t know what any of these terms mean but would like to learn more, see Dolby’s white paper. In short, the function encodes color data far more efficiently than gamma ever could across a much wider luminance range, and it is built on real-world modeling of human perception, allowing a larger spectrum of colors to be reproduced from an otherwise identical source image.
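
To make the new math concrete, here is a minimal Python sketch of the PQ curve using the constants published in ST 2084. It is purely illustrative (not production color-management code), but it shows how the encoding allocates signal range across an absolute 0–10,000 nit scale.

```python
# Minimal sketch of the ST 2084 (PQ) transfer function.
# Constants are taken directly from the SMPTE ST 2084 specification.

M1 = 2610 / 16384          # 0.1593017578125
M2 = 2523 / 4096 * 128     # 78.84375
C1 = 3424 / 4096           # 0.8359375
C2 = 2413 / 4096 * 32      # 18.8515625
C3 = 2392 / 4096 * 32      # 18.6875
PEAK_NITS = 10000.0        # PQ encodes absolute luminance up to 10,000 nits

def pq_eotf(code_value: float) -> float:
    """Decode a normalized PQ signal (0..1) to display luminance in nits."""
    e = code_value ** (1.0 / M2)
    y = max(e - C1, 0.0) / (C2 - C3 * e)
    return PEAK_NITS * y ** (1.0 / M1)

def pq_inverse_eotf(nits: float) -> float:
    """Encode absolute luminance in nits to a normalized PQ signal (0..1)."""
    y = (nits / PEAK_NITS) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2

# Roughly half of the signal range covers the old 0-100 nit SDR window,
# leaving the other half for highlights all the way up to 10,000 nits.
print(round(pq_inverse_eotf(100), 3))   # ~0.508
print(round(pq_inverse_eotf(1000), 3))  # ~0.752
```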

To reduce our focus to the most significant problem that HDR addresses: content such as movies and video games has been effectively constrained to 100 nits of maximum luminance for several decades. Considering that simply walking outside will showcase a distribution of luminance values well into the thousands of nits for reflected surfaces that are perfectly safe to observe, you can begin to appreciate how severely this historical constraint has restricted all of our media and computing interfaces. To quantify the benefits of high dynamic range in combination with support for wider color gamuts on modern displays, color volume is one measure for expressing the expanded range of colors able to be realized.

In other words, HDR provides much more than a leap in contrast alone. It is the single biggest improvement to image quality since the transitions to HD and to high-DPI “Retina” displays. And it does not apply to video alone, but to any digital display output, including photos and application UI.*

PC monitors have fallen behind for good

There is, however, a tragic downside to the HDR era: proper HDR is currently impossible on the PC, and will be for years. This comes down to two factors: power and OLED economics.

Over the past half-decade, OLED TVs quickly superseded plasma TVs in almost all technical regards, thanks to key (heavily patented) technologies that LG Display was able to bring to market with its W-OLED IGZO panels. Briefly, W-OLED displays use white OLED emitters with an RGBW subpixel structure: color filters convert the white emission into the red, green, and blue subpixels, while the unfiltered white subpixel allows for high light emission.

W-OLED technology provides much greater luminance and efficiency, decreases the risk of burn-in, and altogether greatly improves panel yields for production. The alignment precision required for patterning is far easier to achieve than for standard OLED fabrication, enabling the production economics to be viable for even very large panel sizes.
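
To illustrate the signal side of the RGBW structure described above, here is a deliberately simplified, hypothetical sketch of the classic white-extraction approach. It is not LG Display’s actual (proprietary) pixel pipeline, which involves far more sophisticated gamut mapping, power limiting, and aging compensation.

```python
# Hypothetical, heavily simplified illustration of driving an RGBW subpixel
# layout: pull out the common (achromatic) component of each color so the
# efficient, unfiltered white subpixel carries most of the luminance, while
# the filtered R/G/B subpixels supply only the remaining color.

def rgb_to_rgbw(r: float, g: float, b: float) -> tuple:
    """Split a linear RGB triplet (0..1) into R, G, B, W drive levels."""
    w = min(r, g, b)               # achromatic part of the input color
    return r - w, g - w, b - w, w  # filtered subpixels carry the remainder

# A near-white highlight ends up rendered almost entirely by the white
# subpixel, which is where W-OLED gets its brightness headroom.
rgbw = rgb_to_rgbw(0.95, 0.90, 0.92)
print(tuple(round(x, 2) for x in rgbw))  # (0.05, 0.0, 0.02, 0.9)
```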

While there was some early hope of bringing good OLED (and even IGZO OLED) panels to laptops, those attempts effectively ended in failure. The economics simply do not work. Worse, it is brutally hard to increase maximum luminance on monitors compared to TVs, as TVs are able to draw far greater amounts of power.

That sadly hasn’t stopped a recent sham push by monitor vendors and VESA to peddle “HDR” PC displays to consumers, but I assure you that every last one of these products is effectively a fake, and the overall marketing effort is stupid nonsense.

Meanwhile, the console space has seen widespread HDR adoption by developers, often to magnificent effect. Despite most games doing HDR poorly in various ways, it’s generally a massive improvement to image quality, one that happens to be effectively free from a performance perspective. It’s glorious.

Microsoft has changed the game

With the Xbox One X, Microsoft has pushed like hell to re-establish its technical dominance over Sony, and it has hugely succeeded in doing so. Despite a CPU microarchitecture that is best described as horrendously uncompetitive, the One X’s GTX 1070-level graphics, abundant memory bandwidth, and fixed hardware target with low-level developer access have combined to set a new benchmark in extracting pure pixel performance from a console. Locked 4K30 and even 4K60 on the One X are far more common than you might think.

I’ve watched hundreds of hours of Digital Foundry and other performance analysis videos, and One X games to date almost always perform outstandingly well. In the cases with consistent performance drops, all of those games nonetheless support variable refresh and are future-proofed to run with greatly improved performance on the next Xbox. You basically can’t lose.

This is because Microsoft’s vGPU mastery has brought one of the key advantages of the Windows PC platform, incredible backwards compatibility support in software, to the console space for the first time. And its amazing efforts haven’t stopped there. (If you’re wondering why any company would go to such lengths to resurrect old, existing games, it’s because game preservation is an enormous problem.)

Despite using a fairly terrible SoC, the Switch has also, to an extent, demonstrated the ever-shrinking gap between the performance of cost-optimized chipsets and silicon designed for powerful PCs. Mark my words: with its ARM efficiency advantage, Nintendo is well-poised to embarrass Microsoft and Sony on hardware technical capabilities over the coming years. Though the less constrained GPU die sizes, memory bandwidth, and especially bills of materials of the latter two companies’ consoles will continue to ensure advantages over Nintendo’s specs, Nintendo will trend ever closer to the x86 platforms in raw performance.

The PC’s remaining performance advantages

Some factors influencing graphics fidelity remain unchanged. Brute-force, high-quality anti-aliasing is still a major advantage in the PC’s favor. 120fps console gaming is currently limited to a literal handful of Xbox One X games, which require variable refresh to eliminate stutter. And a precious few PC games, such as Battlefield V, really do still take advantage of PC hardware superiority these days.

You will also always be able to drop huge sums of money on the best-binned, heavily overclocked CPU, with a completely overkill dual-GPU setup and liquid cooling to keep a sky-high power budget in check. This will only get more and more expensive and wasteful, however, with the recent death of Moore’s Law. (It’s not really the lack of competition that has NVIDIA’s GPU pricing soaring through the roof.)

Finally, I won’t neglect to acknowledge that the PC has recently gained one hell of a trick in its favor: real-time ray tracing. The current, very early best case is demonstrated by Battlefield V, which can just barely manage 1440p60 on an ~$800 GPU. Even though it’s early days, the ray tracing revolution has indeed already begun.

Weighing the tradeoffs for the gamer

But I don’t believe that higher average frame rates and Ultra settings, which tend to be quite poor in terms of performance efficiency, are worth quite as much as is commonly held. I do, however, think that generally better overall image quality thanks to HDR and OLED panels, more consistent frame-times (through micro-optimization), and freedom from toiling over graphics and motherboard settings weigh the accessibility of high-end graphics fidelity heavily towards console gaming for the overwhelming majority of consumers.

If you truly value stable performance, it’s almost sufficient to note that optimizing settings to get a modern AAA PC game to run with extremely consistent frame delivery is generally an *enormous* pain requiring a great deal of research, and even then Windows will try to ensure your misery. Heaven forbid you use a workstation CPU with NUMA.

(There’s also more that can be said about the immaturity of Direct3D 12 and Vulkan drivers, and how wildly behind Intel and AMD are competitively on efficiency and hardware platform features compared to the ARM players, but these are secondary matters.)

For any PC gamers who think my overall argument is nonsense, genuinely, show me your frame-time data on a non-monster setup. I’m quite confident that 99+% of PC gamers are not playing AAA games at nearly locked frame rates of 30, 60, or 120fps, but I’m always open to more comprehensive data analysis.

(PC gaming pro-tip: always set a frame rate cap of exactly 120, 60, or 30fps, or use whatever trick you need to achieve the equivalent with VSync on.)
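
For the curious, the toy sketch below shows roughly what such a cap does under the hood: after each frame is rendered, the loop waits out the remainder of a fixed 1/60 s budget so frames are delivered on an even cadence rather than whenever the GPU happens to finish. Real limiters (in-game or driver-level) are far more precise, but the principle is the same.

```python
# Toy illustration of a fixed 60fps frame-rate cap via frame pacing.
import time

TARGET_FPS = 60
FRAME_BUDGET = 1.0 / TARGET_FPS  # ~16.7 ms per frame

def run_capped(render_frame, num_frames: int) -> None:
    next_deadline = time.perf_counter() + FRAME_BUDGET
    for _ in range(num_frames):
        render_frame()
        # Sleep off most of the remaining budget, then spin briefly for accuracy.
        remaining = next_deadline - time.perf_counter()
        if remaining > 0.002:
            time.sleep(remaining - 0.002)
        while time.perf_counter() < next_deadline:
            pass
        next_deadline += FRAME_BUDGET

# Example: a fake 5 ms "render" still gets paced out to ~16.7 ms per frame.
run_capped(lambda: time.sleep(0.005), num_frames=120)
```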

Lastly, if you haven’t seen it yet and are interested in this topic, take a look at my console gaming optimization reference guide. I put quite a lot of free time into it.

* How well even tech enthusiasts can distinguish HDR from non-HDR content on mobile devices when not displayed side-by-side is another lengthy topic.

Oculus Research’s presentation at Display Week 2018

Douglas Lanman of Oculus Research gave an awesome talk about reactive displays at the Society for Information Display’s Display Week 2018. Watch it if you want to learn why the future of displays is VR and wearable technology.

MLPerf benchmark suite for machine learning announced

This is awesome to see. The state of deep learning benchmarking is still dreadful, and I think most observers don't know even the most basic details about how it should be done properly. For more, see Wave Computing's press release.

Unfortunately there are some big names not yet listed among the supporters. That includes NVIDIA, unsurprisingly.

Update

NVIDIA (and many, many other companies) joined later, thankfully.

About Moore’s Law — it’s dead

I've been waiting for someone of sufficient stature to publicly convey this. If you’re not sure what all this means, look at the graph.

While Intel’s 10nm was the canary in the coal mine, it has taken a couple of years for the industry to fully grasp the sheer wall it has hit, and how the other foundries would hit it just the same. Cannon Lake’s extreme delay and Apple’s middling A10X and A11 single-threaded performance improvements (despite what it did with the latter's core) were leading indicators.

We're still getting shrinks, but they aren't timely enough to double transistor count every two years anymore.

While there are other areas that can be advanced, we really need materials breakthroughs to be able to push per-core performance again. Until then, we’re mostly stuck.

"Spectre/Meltdown Pits Transparency Against Liability: Which is More Important to You?"

All hardware is degrees of broken. I've unfortunately found, however, that many vendors are happy to advertise their silicon as fully functional despite shipping broken implementations or disabling IP or features outright.

And in the case of the Meltdown and Spectre vulnerabilities, most modern CPUs were deliberately designed with what ultimately proved to be a poor balance between security and performance in their speculative execution implementations.

"HiSilicon Kirin 970 — Android SoC Power & Performance Overview"

If you want to learn about the state of mobile chipsets, AnandTech’s power and performance overview of HiSilicon’s Kirin 970 is a good place to start. It’s not intended to be a comprehensive overview of the SoC, but that also makes it easier to read than such a piece would be.

While you may not care about Huawei or Chinese silicon vendors — you should! — it’s important to follow HiSilicon’s SoC implementations and results, as they, among others, serve as a barometer of the state of the ARM ecosystem and of mobile SoCs in general. The company's best effort to date was the Kirin 950, which was a very well-implemented chipset.

Real analyses also can and should convey the things that really matter when it comes to SoCs: interconnect implementation, memory access latency, whether the power management works at all, etc. These are some of the things that most often go wrong, or are the most challenging to implement well. The average mobile observer probably only thinks about CPUs and GPUs, but they’re not even remotely the only things that matter.

Andrei is the only person writing publicly who knows how to measure power properly at the rails (or who even publishes fuel gauge figures), so you can trust these power figures, unlike virtually everything else you may find on the internet. His figures also provide a good overview and recap of the state of mobile chipsets in recent years.

This is also the best public data we have on 10nm at the moment, and the results echo what I could gather about the node early on (see: Updates). I particularly appreciate his compiling SPEC2K6 for readers' benefit, since that’s a genuine pain to do.

“An increase in main memory latency from just 80ns to 115ns (random access within access window) can have dramatic effects on many of the more memory access sensitive tests in SPEC CPU. Meanwhile the same handicap essentially has no effect on the GeekBench 4 single-threaded scores and only marginal effect on some subtests of the multi-threaded scores.”

This section as well as the other commentary on benchmarks should sound very familiar to subscribers. SPEC is not exactly the best benchmark in the world in terms of real-world representativeness (read: understatement), but it’s the best we’re going to get publicly.

I should note that interpreting benchmark results beyond the basics is tough. You need to really know what’s going on at a low level. Deep learning benchmarking is also complicated, and I’m not a machine learning researcher, so I’ll refrain from commenting on that.

The upcoming SoC to watch right now is Exynos 9810. Samsung’s System LSI division really needs to deliver following recent disappointments that failed to live up to the solid Exynos 7420.

Lastly, if you’re hoping for greatness from 7nm, I would argue that it would probably be better to start accepting that Moore’s Law is dead. I’m not the person to write about that, though.

About Fuchsia

I've mentioned this before for subscribers, but I may as well say it here: it seems obvious to me that we'll see Fuchsia/Zircon devices this year.

Think months, not years.

David Kanter on Intel's 22FFL process

I heard Intel say many promising things about 22FFL at TechCon 2017, but it has quite a lot to prove when it comes to SoC processes and winning clients for Custom Foundry.

In-depth public analysis of foundry technology is rare, so I'm grateful that David has written about this important topic.

Treble and Fuchsia

If you had to combine Android and Fuchsia, how would you do it?

This article is available for subscribers on Patreon.

Ending subscriptions

I have decided to end subscriptions for Tech Specs. Existing subscribers for this month will be refunded at the end of the month, and all subscriber articles will still be accessible until the end of the month.

Tech Specs will not necessarily end as a blog, but as of now I don’t know what its future will hold. I want to reenter the tech industry for full-time work, and understandably employers generally don’t allow employees to blog. If I can continue to write in some capacity, I will, but I hope you will understand that it is unlikely to be at the same level of depth or weekly commitment.

I deeply, deeply appreciate the support of all my subscribers to date. If not for your support, I would have stopped blogging many months ago, and I honestly kept going as long as I could justify. The comments and questions on subscriber articles were especially great, so my thanks for all of those. Please do feel free to continue sending me any questions or comments via email or Twitter. I always try to respond eventually.

I know $10 a month was not cheap, and I debated launching new pricing options for many months. Ultimately, unless offering cheaper options would have increased the number of subscribers by an enormous multiple, it would have done very little to move the needle. There was no realistic path towards being able to justify the continued time and especially financial expense. I do feel like I could have eventually made the numbers work, at great effort, though it would have taken many years. That kind of time is unfortunately something I do not have.

Public writing was not something I had anticipated doing much of; it came about through the benefit of circumstance. While my writing is not very good, I aimed for an intermediate level of technical depth that was hopefully not too difficult to follow. It’s hard to say if that was the right call. My one regret is that I didn’t make time to write any introductory articles, or even a technical article or two.

Above all, I’m sorry to disappoint everyone.

I will, however, write at least one more article for subscribers on one particular topic. I’ve put it off for a long time because it requires much more research than anything I’ve written about to date. You can probably guess the topic.

My thanks to everyone for reading, and hopefully this is not the end.

"Evolution of the img tag: Gif without the GIF"

Everyone loves GIFs, but they're technically awful. This replacement implementation is awesome. The WebKit/Safari team's work has been absolutely amazing in recent years, especially with regard to energy efficiency improvements.

Also:

Aside from not having a formal standard, animated WebP lacks chroma subsampling and wide-gamut support.