The Tech Specs Podcast — Episode 2

For the second Tech Specs Podcast, I'm joined by Josh Ho, whose writing you may be familiar with from AnandTech's mobile coverage. Topics include: four months of new devices, Apple’s SoCs, benchmarks, technology misconceptions, and explaining things on Twitter.

You can download the Tech Specs Podcast on iTunes and Google Play Music.

What happened with the A10X?

New 10.5” and 12.9” iPad Pros were widely anticipated going into this year’s WWDC. It had been 19 months since the launch of the original 12.9” iPad Pro, and nine months since the release of the A10. Big updates were due for the iPad line.

Apple first went through the feature and display improvements of the new models as expected. (Going back to last year, I had a pretty good hunch that ProMotion would be announced.) Then it highlighted the usual specs. CPU performance has increased 30%. GPU performance has increased 40%. This is in comparison to… the A9X. Huh?

This is not what silicon people were expecting to hear. Apple pushes performance like crazy, and it was supposed to be releasing a 10nm SoC. These numbers are not impressive by Apple’s standards. Comparing to the A9X and even older SoCs is deliberate marketing framing to make the numbers sound more impressive than they are.

Could Apple actually have pushed purely for efficiency this time? Given its history of CPU designs, this seemed unlikely, though still possible. Apple also neglected to mention any efficiency advances, which it would likely tout in that case. Most surprising of all, Apple made no mention of 10nm, even indirectly. This was deeply suspicious. Given the deliberate vagueness of the performance figures, however, it was hard to make much of the situation.

First, though, I need to apologize for an incredibly stupid error in my thinking about this SoC. I expected an A11X, as it looked like Apple had been working on a 10nm design*, and Apple essentially always pushes performance. If you’re familiar with mobile silicon, it was logical to expect that it was waiting to launch a new SoC on the bleeding-edge node. And if you don’t think Apple would be that aggressive on timing or performance, recall that the A9X’s release was successfully brought forward by half a year, a massive accomplishment.

I knew to expect that the A11 would use a 64-bit-only CPU, which allows for a more area- and performance-efficient core design. I didn’t, however, make the rather obvious connection between dropping 32-bit support in iOS 11 and the near-simultaneous release of Apple’s first CPU without 32-bit support. That is to say, any Apple CPU released before iOS 11 would necessarily need to retain 32-bit support, because iOS 10.3 still needed to run on the 32-bit A6 and A6X, so it would have made little sense to release a new microarchitecture before the fall. That the new iPad Pros were seemingly delayed a couple of months and released just a few months before the A11 is irrelevant, if unfortunate from Apple’s perspective.

Based on what I know about 10nm, though, there seem to be two likely possibilities. The first is that this was Apple’s plan all along. The A10X would deliberately not push the envelope for whatever reasons. There would be no new CPU microarchitecture, aside from maybe some small improvements. The addition of a third CPU cluster and the necessary logic and tuning to make it all work, however, would be far from trivial. And a lot of work would also go into the new GPU, of course.

The second possibility is that the original 10nm A10X design had to be canceled once it was clear that yields were not going to meet Apple’s targets.

There is a further wrinkle to the picture. If the A10X’s cache sizes really have changed significantly, it would perhaps lend credence to one or the other possibility.

Either way, everything I’ve heard about the A10X points to it still being a 14nm TSMC design, so my best guess is that the original design for the A10X was indeed canceled. This would not be terribly unusual in the world of silicon design, but it would be a rare public setback for Apple’s silicon team. If this is really what happened, the iPad Pro’s intended SoC would have effectively been sacrificed to ensure adequate supply of the A11 for the iPhone 8. For obvious reasons, that would be the correct course of action.

The bigger implication is that the signs really don’t bode well for 10nm. Final clock speeds for the Snapdragon 835 and Exynos 8895 were really low compared to theoretical expectations. Yields are terrible, and the node is looking as bad as 20nm did. It could possibly be even worse, if TSMC and Samsung Foundry are struggling with issues similar to those that Intel first grappled with during its 14nm ramp-up. Moore’s Law is slowing down because of fundamental physics, not a lack of engineering willpower.

Further details are required, so please consider none of this confirmed. This is merely my best guess as to what happened. Unfortunately the details of these things often never come to light publicly.

And to be absolutely clear: I am not saying that Apple did anything wrong or is trying to be misleading in any way. It still has the fastest mobile CPU by a country mile. 10nm is just not going well.
 

* Discovered by Ashraf Eassa of Motley Fool.

4D Toys

I couldn't not share this. Enjoy.

Expectations for WWDC 2017

Apple has some obvious priorities to address this year at its Worldwide Developers Conference (WWDC). Firstly, it needs to significantly redesign iOS for the upcoming OLED iPhone. There are major technical considerations for both hardware and software that Apple has to deal with for its OLED transition. These considerations extend far past a dark mode for apps, which I would also anticipate. Apple has notably already dealt with OLED for the Apple Watch. I don’t think it will try to match iOS with watchOS aesthetically, but perhaps their overall appearances will be a little closer. And I highly doubt iOS will change from rendering black text on white backgrounds for maximum readability.

iOS’s dominant white and blue are pretty much the worst colors for an OLED user interface, though, so something more akin to watchOS’s use of black, grey, and green is required (to some degree). Apple could perhaps even selectively use some red tones. For further context on colors and OLED design considerations for energy efficiency, lifetime, and text legibility, Brandon Chester and I discussed some of these topics at length beginning at 55:03 on the first Tech Specs podcast. I will also publish some further thoughts on the reasoning behind the UI changes at a later date.
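To make the intuition concrete, here is a minimal toy model of why white and blue backgrounds are costly on OLED: each subpixel’s power scales roughly with the light it emits, and blue emitters are the least efficient per nit. The function and its efficiency weights are illustrative placeholders I made up to encode only that ordering, not measured values for any real panel.

```kotlin
// Toy model: relative OLED power for a flat UI color (sRGB channel values in 0..1).
// The kR/kG/kB weights are illustrative placeholders; blue is weighted heaviest
// because blue emitters are the least efficient per nit of output.
fun relativeOledPower(r: Double, g: Double, b: Double, gamma: Double = 2.2): Double {
    val kR = 1.0
    val kG = 0.8
    val kB = 1.7
    return kR * Math.pow(r, gamma) + kG * Math.pow(g, gamma) + kB * Math.pow(b, gamma)
}

fun main() {
    println("white UI  ~ %.2f".format(relativeOledPower(1.0, 1.0, 1.0))) // worst case
    println("blue tint ~ %.2f".format(relativeOledPower(0.2, 0.4, 1.0)))
    println("green UI  ~ %.2f".format(relativeOledPower(0.0, 0.8, 0.0))) // watchOS-style
    println("black UI  ~ %.2f".format(relativeOledPower(0.0, 0.0, 0.0))) // near zero
}
```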

Secondly, Apple needs to deliver on deep learning. Despite what Apple says externally, its executive leadership seems to have been caught off guard internally by the sudden, massive progress in AI brought about by the deep learning revolution. Backchannel’s exclusive-access piece did not instill confidence; instead it conveyed marketing desperation by conflating all areas of machine learning into one thing. Ignoring the technical errors in the article, it doesn’t matter that Apple has been using machine learning since the 80s; all that people should care about is its current competitiveness in deep learning. In contrast, when Google mentions machine learning, as far as I know it is always referring to deep learning.

Depending on how you prefer to segment technology, many would argue that deep learning is the most important advance in software since either the touchscreen smartphone's user interface or the internet. In contrast to almost every other technology industry buzzword or phrase of the month, having an “AI-first” strategy is actually genuinely meaningful. Last year I argued on Twitter that Apple needed to show a suite of deep learning-powered services, or it was going to look really behind. That is thankfully exactly what it did. At this year’s WWDC, Apple needs to demonstrate a continuing wave of progress company-wide on AI. If you want to get a sense of how seriously Google has invested in internal training on deep learning to remain the market leader, see Google’s own exclusive-access Backchannel piece from a couple of months prior to Apple’s.

These articles are sometimes published months after interviews are granted to journalists, but there was one thing in particular I noted at the time of the Apple piece. Craig Federighi was quoted as saying, “We don’t have a single centralized organization that’s the Temple of ML in Apple.” Only days before the article was published, it was reported that Apple had acquired Turi, which formed the basis of its new machine learning division, an obvious necessity for internal tooling and research. Perhaps I am reading too much into this, but that timing might suggest Apple’s deep learning strategy was still in flux at the time of the interview.

It may not seem fair, but this year Apple has to continue to prove it can keep up to some extent with the market leaders. If not, its competitive positioning in AI might come to resemble its mastery of the cloud: perpetually years behind. If it sounds like I am being negative about Apple, believe me I’m not. The company is ridiculously competent at almost everything, but it shouldn’t be graded on a curve on server infrastructure or AI. And by “AI,” think of deep learning algorithms, not futuristic images of omniscient assistants from science fiction.

Thirdly, Apple needs to ship a ton of iPad-specific software improvements. I know these are definitely coming, and I suspect Apple will deliver in spades. The Split View multitasking UI is one example that everyone agrees must be replaced; an icon grid with greater information density could help. Adding drag and drop functionality seems really likely. Using the iPad as a drawing tablet or secondary display for the Mac has been rumored multiple times. And while it would require extensive OS-level engineering to bring about securely, multi-user support also seems like a strong possibility.

Brandon mentioned to me last fall that he thought Apple would transition to a 10.5” iPad display size in 2017 in order to switch to using two regular size classes in portrait mode, which would make sense. I eventually saw supply chain rumors that the new iPad would be 10.5”, so that display size with an iPad mini-sized UI seems pretty likely.

Unsurprisingly, I am hoping that iOS 11 will also provide a major performance revamp for the platform. Don’t expect any miracles, but it would be nice if there were at least less Gaussian blur. There's ample opportunity for Apple to continue to tune how iOS works, to better fit its CPU and GPU architectures and make the most of their microarchitectural advantages.

Apple also clearly needs to showcase a significantly improved Siri. (I’ve always been hesitant to describe digital assistants as “AI.”) I’ll leave hardware rumor reporting to the press, but my guesses would be that the A11X iPad Pros, spec-bumped MacBook and MacBook Pros, and the Siri speaker all get announced on Monday. We may or may not see the rumored iOS-wide voice command accessibility this year, but the speaker will probably depend entirely on a more capable Siri. My one prediction is that the Siri speaker has nothing to do with mesh Wi-Fi. I think Apple would prefer to ship something useful like an 802.11ad-capable router.

Siri never really worked for me personally, and never understood my voice at all on the Apple Watch, until iOS 10 and watchOS 3 were released. Siri is much better now, but its word error rate is still higher than Google's. Where Apple is definitely market-leading is API design, which is criminally under-appreciated as a competitive advantage. SiriKit is probably the best overall voice assistant API, but it’s also the most ambitious in terms of flexibility. Continued expansion of the deliberately limited API surface is required.

I’m also hoping to see broader deployment of differential privacy and similar experimental technologies. Apple is still going to have to pay the efficiency tax and perform deep learning inference on device with non-ideal hardware, until it can ship more appropriate silicon. My sentiments on security are the same, given last year’s political battle between Apple and the FBI.

I’m not sure if Apple will ever make its secret iOS VR framework public, but if it does, it will probably wait until at least the iPhone 8 announcement. Quality VR basically requires OLED.

Apple needs to continue to make writing functional smartwatch apps much, much easier with watchOS’s API, while still preserving energy efficiency. tvOS deserves a better and more performant multitasking implementation, like the original one. iOS 11 will probably drop 32-bit support, to significantly reduce memory usage. Apple at some point should ship improvements for family management of content and media, especially within the Photos app.

From a developer point of view, I wonder if UITableView might be deprecated. Auto Layout seems to be a performance killer, so some sort of magically more efficient way of arranging layouts would be nice, however difficult to conceive. There is also unending room for improvement in macOS security and the Mac App Store, but one shouldn’t hope for too much.

Lastly, this is the last year that I will hold out hope for a swipe keyboard. It would be an enormous improvement for one-handed use and accessibility. Maybe it could even work in a floating window on the iPad? Please, Apple?

Adaptive-Sync for mobile devices

One of the most exciting display features in recent years has been variable refresh. While gaming monitors have long offered higher refresh rates, NVIDIA pioneered variable refresh PC monitors with G-Sync in 2013. By controlling the refresh rate from the display’s end of the chain, G-Sync significantly improved perceptual smoothness by removing stutter within a certain frame rate range, eliminating tearing entirely, and reducing input lag. Here is a video of the original G-Sync demo explaining its benefits.

AMD soon followed with its competing FreeSync, and VESA then added a royalty-free implementation called Adaptive-Sync to the DisplayPort 1.2a specification. Adaptive-Sync was actually first introduced in 2009 for displays using internal video signaling, as part of the eDP specification, which mobile devices use for their integrated displays.

Following the rollout of panel self-refresh, I have been eagerly anticipating the adoption of Adaptive-Sync in mobile display stacks. There hasn’t been a real reason to include it, however, because everything in mobile OSes targets 60fps.

The reason for this is simple: going past 60Hz increases display energy consumption. Driving higher frame rates also requires greater compute and thus further increased power draw. Additionally, there is an important technological distinction between IGZO and LTPS displays: the former is generally found only in larger mobile panels, while the latter’s greater efficiency has led it to dominate the market for small panels for smartphones and tablets.

I won’t explain display backplanes here, but IGZO transistors do provide certain advantages, such as faster switching due to a steeper subthreshold swing. IGZO displays thus make higher refresh rates more easily achievable than they are for LTPS panels in mobile and laptops. And although a display stack running at higher refresh rates will burn more power, variable refresh can significantly mitigate this increase.

You may have noticed that the iPad Pros and 2016 MacBook Pros are advertised as supporting variable refresh rates. (This will be a feature of the iPhone 8, too.) This is actually a different implementation from that of PCs and desktop monitors. What Apple is instead doing is driving the display refresh rate down to a constant 30Hz when the screen is displaying static content. This provides a significant improvement to battery life.

Over the past several years mobile displays have been produced with increasingly higher peak brightness and wider color gamuts. Where do you go from here? Higher refresh rates.

Ordinarily, whether you are targeting a 90 or 120Hz refresh rate, this would require quite an improvement in system-wide performance. Android and iOS both target 60fps, but neither actually manages to run at a consistent 60fps (and with smooth frame pacing) pretty much ever, excepting a small number of extremely performant apps.

iOS used to run quite consistently at 60fps across all of its system apps prior to iOS 7, but then pervasive Gaussian blur, increasingly complex UI layouts, and other factors led to a persistent and significant decline in performance. The iPhone 5 running iOS 6 was the last mobile device to truly run at 60fps, in other words.

Thus, without a major focus on improving system performance, don’t hold your breath for iOS to achieve consistent 60Hz frame delivery anytime soon. That would require a serious OS-wide code review effort, and it’s not a given that Apple cares enough.

Even if that were to happen, some things are basically impossible. Assume that Apple wants to target a 120fps device refresh rate, a nice even multiple of 60 and 24fps. There is absolutely no chance that it can magically get every first- and third-party iOS app to render within the 8.3ms render window (1/120 of a second).
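For reference, the per-frame render budget is just the reciprocal of the refresh rate. Here is a minimal sketch of the arithmetic behind the 8.3ms figure, using only the refresh rates discussed above:

```kotlin
// Per-frame render budget in milliseconds: budget_ms = 1000 / refresh rate in Hz.
fun renderBudgetMs(refreshHz: Double): Double = 1000.0 / refreshHz

fun main() {
    for (hz in listOf(60.0, 90.0, 120.0)) {
        println("%.0f Hz -> %.1f ms per frame".format(hz, renderBudgetMs(hz)))
    }
    // 60 Hz -> 16.7 ms, 90 Hz -> 11.1 ms, 120 Hz -> 8.3 ms
}
```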

As you might have guessed, there is another way — Adaptive-Sync.

For Adaptive-Sync to work, the display controller (part of the SoC) must support it, and the display driver IC (DDIC) and the panel itself must be capable of higher refresh rates. I believe now is the time for mobile implementations to finally arrive. A large tablet in particular has the energy capacity to spare, so much so that some battery cell volume in the iPad Pros was replaced with larger speaker cavities, partly to save weight.

Note that using Adaptive-Sync to avoid dropping frames is not as good as hitting a target frame rate in the first place. Rendering smoothly at 55fps is still inferior to hitting a steady 90fps delivery target, for example, because you are simply seeing fewer frames. But you do achieve the same benefits as with PC implementations: stutter is eliminated, and input lag is reduced. (There is already no tearing on mobile.)

For Android, I think Google could even in theory eventually disable triple buffering should Adaptive-Sync ever become ubiquitous in mobile. Only a frame of latency would be gained back, but it would be a clean win. There is no reason to expect this to ever happen, though.

In conclusion, I hope to soon see tablets and other devices that support Adaptive-Sync. It’s a killer feature that would make a huge difference for both input responsiveness and motion image quality.

 

Update

I somehow missed that Qualcomm has its own variable refresh implementation called Q-Sync for Snapdragon 835's display controller. Qualcomm didn't mention it to me at CES, so I have no idea about the details. I will try to find out more, but it sounds like it may be a proprietary implementation since it seemingly requires Q-Sync-specific support from display driver vendors. I'm a bit skeptical about adoption. Thanks to Naseer for the heads-up!

Thoughts on Google I/O 2017

There were an enormous number of announcements at this year’s Google I/O. In particular, it felt like the largest set of Android development announcements since 2014. Rather than attempt to comprehensively summarize the event, here are some scattered thoughts.

Android

The biggest Android-related news was clearly the surprise adoption of Kotlin. The main reason it was a surprise was the tone the Android team conveyed in response to requests for Kotlin support at last year’s I/O. While many team members used Kotlin, they seemed to suggest that Java was going to remain the platform language for the foreseeable future.

I’m not a programmer so my opinions on languages are invalid, but everything I’ve ever read about Kotlin makes it seem like a pretty good language. I have no idea how well it performs, though. Because Kotlin was designed for seamless Java interoperability from its inception, it can pretty much immediately replace Java for any Android developer now that it is officially supported in the Android SDK. There will be some gaps in tooling, but it’s as easy a developer transition as they come. The next step will be transitioning APIs to Kotlin. I don’t think many members of the Android team will miss Java.
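For anyone who hasn’t looked at the language, here is a minimal, self-contained sketch of the kind of Kotlin that replaces a pile of Java boilerplate: a data class, a nullable type, and null-safe chaining with a default. The class and values are invented purely for illustration.

```kotlin
// A data class gets equals/hashCode/toString/copy for free; the nullable
// `soc` field must be handled explicitly, and the compiler enforces it.
data class Device(val name: String, val soc: String?)

// Null-safe chaining with the Elvis operator instead of manual null checks.
fun socLabel(device: Device?): String = device?.soc ?: "unknown SoC"

fun main() {
    val devices = listOf(Device("Tablet A", "A10X"), Device("Tablet B", null))
    for (device in devices) {
        println("${device.name}: ${socLabel(device)}")
    }
}
```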

The app lifecycle has always been a nightmare on Android. The new architecture model proposed by the Android team surely has to be an enormous improvement, but it understandably necessitates new classes. The team is notably proposing more of a reactive-style model instead of a model-view-viewmodel (MVVM) paradigm, to quote Yigit Boyar. We’ll see how that turns out.
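As a rough illustration of what “reactive-style” means here, below is a plain-Kotlin sketch, not the actual Android framework API, of a lifecycle-aware value holder: observers register once, and updates are only delivered while the owner is active, so torn-down UI never receives stale callbacks. All names are hypothetical.

```kotlin
// Plain-Kotlin sketch of a lifecycle-aware observable value holder.
// This illustrates the pattern only; it is not the Android API.
class ObservableValue<T> {
    private val observers = mutableListOf<(T) -> Unit>()
    private var latest: T? = null

    // Mirrors the owner's lifecycle: deliveries only happen while active.
    var active: Boolean = false
        set(started) {
            field = started
            if (started) latest?.let { value -> observers.forEach { it(value) } }
        }

    fun observe(observer: (T) -> Unit) { observers.add(observer) }

    fun post(value: T) {
        latest = value
        if (active) observers.forEach { it(value) }
    }
}

fun main() {
    val userName = ObservableValue<String>()
    userName.observe { println("UI shows: $it") }

    userName.post("initial")   // owner not started yet: nothing is rendered
    userName.active = true     // owner starts: the latest value is delivered
    userName.post("updated")   // subsequent updates flow through immediately
    userName.active = false    // owner stops: further updates are held back
    userName.post("ignored")
}
```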

Android O is a major performance release, probably the most significant one since 5.0 Lollipop. One of the key efforts by the Android team for O was the elimination of binder lock contention. The result is a “massive improvement in jank immediately after boot as services all start up.” Running the O beta on my Nexus 6P, I was amazed at how obvious and appreciable a difference this makes. For a thorough description of Binder by Dianne Hackborn, see here.

Significant improvements in OS boot time are also claimed, with the Pixel smartphone starting up in half the time on O. App startup times have also been sped up by various optimizations. I suspect all those app racing videos on the internet played a role in spurring this effort, though I would strongly caution you not to assume those videos are representative benchmarks.

Also of major note is that Android O features an optional new rendering pipeline, which you can enable from developer settings. Nothing has been said about it, but it is based on Skia. I don’t know anything about graphics libraries, but both Android and Fuchsia have used Skia from day one so I have no idea what the new renderer entails. Perhaps it’s a Vulkan rewrite or a more fully hardware-accelerated pipeline, but I’ve found no information yet online. If anyone knows more, please reach out.

Regarding runtime improvements, ART has switched from using a mark-and-sweep algorithm to a concurrent copying garbage collector. Claimed results are less time spent reclaiming and allocating memory and lower overall usage. I know very little about garbage collection, so I wonder what the tradeoffs are. You should watch the ART presentation if you want to learn more about the new collector and the many other improvements. I do know, however, that Josh's desired simple optimization was unsurprisingly not implemented.

You may be surprised to learn that ART has also at long last added support for automatic vectorization. I won’t explain SIMD utilization here, but I may write an entire article about this topic in the future.

One nice addition that will indirectly improve performance and battery life is that the Play Developer Console will now flag apps that perform poorly in regard to wake locks, wakeups, and jank. Google also said that wake locks will be restricted in future releases of the platform, so developers be warned. These restrictions should be deeply appreciated by users.
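For developers still holding wake locks for deferrable background work, the direction Google keeps pushing is to hand that work to JobScheduler and let the platform batch wakeups. Below is a minimal sketch; the job ID, constraints, and SyncJobService class are hypothetical placeholders, not anything Google has specified.

```kotlin
import android.app.job.JobInfo
import android.app.job.JobParameters
import android.app.job.JobScheduler
import android.app.job.JobService
import android.content.ComponentName
import android.content.Context

// Hypothetical service that performs the deferred work.
class SyncJobService : JobService() {
    override fun onStartJob(params: JobParameters): Boolean {
        // Kick off work on a background thread, then call jobFinished(params, false).
        return true // work is still running asynchronously
    }

    override fun onStopJob(params: JobParameters): Boolean = false // do not retry
}

// Schedule the work with constraints instead of holding a wake lock.
fun scheduleSync(context: Context) {
    val job = JobInfo.Builder(42, ComponentName(context, SyncJobService::class.java))
        .setRequiredNetworkType(JobInfo.NETWORK_TYPE_UNMETERED) // only on unmetered networks
        .setRequiresCharging(true)                              // only while charging
        .build()

    val scheduler = context.getSystemService(Context.JOB_SCHEDULER_SERVICE) as JobScheduler
    scheduler.schedule(job)
}
```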

Because Android releases are not developed in public, we still know extremely little about Project Treble. Aside from some vague high-level comments, the most specific information given was that "device-specific vendor components" have been sandboxed in some manner. Reducing the bring-up costs of Android updates for higher-end vendors like Qualcomm should be a huge help for the overall ecosystem, but I am skeptical that it will have much impact on a company like MediaTek, which has little financial incentive to provide updates in general. Treble also does not change the fact that Linux does not have a stable driver ABI. I should point out that it sounds like the transition was a miserable technical slog for many members of the Android team, so thanks to them for their efforts.

With the announcement of Android Go, I immediately wondered if the platform's memory profile had changed. The last time there was a significant increase in RAM requirements for Android was the 5.0 release. There is no Android Compatibility Definition Document for O yet, so it is unclear if the minimum memory requirements will be changing (Section 7.6.1). Based on the ART session, however, overall memory usage should be lower in O.

Much to my surprise, graphics drivers are now updatable through the Play Store. This is not a small detail, and I suspect it was a benefit of Project Treble. Google is also now offering Android developers the equivalent of Apple’s App Slicing. Within security, tamper-resistant secure elements (à la Apple’s “secure enclave”) are now supported in O.

As anyone could predict, deep learning was a huge focus at I/O. Google’s TensorFlow happens to be the most popular deep learning library. While it has been available on Android since launch (and was later made available on iOS), Apple managed to provide a GPU-accelerated mobile framework before Google, with convolutional neural network kernels available in Metal Performance Shaders on iOS 10. The lighter-weight TensorFlow Lite was thus a big (and much needed) announcement, although developers will also at least be able to leverage vendor-specific acceleration libraries through Qualcomm’s Snapdragon Neural Processing Engine SDK. In the near future, TensorFlow Lite will leverage the new Android Neural Networks API to allow developers to accelerate their AI algorithms on GPUs or DSPs.

I won’t beat around the bush: the changes from Android 5.0 through O have made the platform much more similar to iOS overall. I think the Android team has prioritized the right areas of improvement, even if that means shipping fewer obvious consumer-facing features.

Everything else

Johnny Lee confirmed that Google’s VPS (Visual Positioning Service) is marketing's branding of the combination of area learning and point clouds stored in the cloud. Standalone Daydream VR headsets use stereo visual odometry and 3D SLAM to provide positional tracking, with drift correction based on area learning. The combination of hardware and algorithms is marketed as WorldSense, which is of course based on a specialized version of Tango.

Google also showed off some amazing VR technology called Seurat, which renders extremely high-fidelity graphics on mobile in real time via unknown means. The technology could be anything, but it isn’t magic. For similarly impressive demos, check out OTOY’s “eight-dimensional” holographic light field rendering. (Update: Seurat is indeed supposedly some form of light field rendering. This Disney Research paper was released simultaneously.)

Within deep learning, Google stated that it was working on CTC-based sequence models for natural language processing, with “a whole bunch of new implementations" coming soon.

Lastly, I was wrong about the Flutter sessions. There were numerous memes, but none of the animal variety. I apologize for the error.

"Understanding Color"

Romain Guy did not disappoint with his presentation on color at I/O. Color is so complicated that there’s no way to give a succinct introduction to it on Android in 40 minutes, but Romain did about as well as anyone possibly could.

I would say this is an intermediate level presentation, but if you're interested in learning about color for the first time it will probably feel like an advanced one. Don't worry about all the concepts you don't understand yet. I knew basically nothing two years ago before learning all the material in this talk. Even though color is a massive topic, the individual concepts are relatively easy to grasp. Give it a shot, and you can definitely learn all of the major points over time.

Virtual teardown of a Samsung Galaxy S8

To learn more about the silicon content of a Galaxy S8, I decided to identify many of the components in a US AT&T model. The following list is not intended to be particularly comprehensive. If any of the ICs listed below are outdated, there are likely newer revisions of these components in the phone.

Sensors:
ams TMD4906 optical module — ambient light sensor (ALS) + proximity sensor + IR LED
STMicroelectronics LSM6DSL SiP with accelerometer + gyroscope
STMicroelectronics LPS22HB barometer
Asahi Kasei Microdevices AK09916C magnetometer
Maxim Integrated MAX86907E heart rate monitor
Sealed Sensor Connector (SSC) System - maybe from TE Connectivity?
Semtech SX9320 grip sensor (an “ultra-low power capacitive Specific Absorption Rate (SAR) controller” for detection of RF attenuation by the user’s hand)

MSM8998 SoC:
ARM CoreLink MMU-500 system memory management unit
Qualcomm Adreno Venus 3XX video decoder/encoder (annoyingly marketed as a “VPU”)
Qualcomm WCD9341/9340 Aqstic (Tavil?) audio codec

Miscellaneous:
Broadcom BCM4361 Wi-Fi combo chip in a Murata module (Yes, Broadcom is still making mobile Wi-Fi combos.)
Toshiba THGBF7G9L4LBATRA 64 GB UFS 2.0 NAND + controller
Sony IMX333 and IMX320 BSI CMOS image sensors
Synaptics Namsan fingerprint sensor
Silicon Mitus SM5720 PMIC?
Maxim Integrated MAX98506 digital audio codec (DAC) + headphone amplifier (Hilariously, there are people selling this online as a USB charging controller for some reason.)
NXP PN553 NFC controller
Texas Instruments DRV2624 haptic driver
RichWave RTC6213N single-chip broadcast FM radio tuner
Xilinx XC4000 FPGA? (not sure)
Xilinx XC5000 FPGA? (not sure)
NXP PCAL6524 GPIO expander
Trustonic Trusted Execution Environment (TEE)
Possibly a Microchip USB controller?
There might be an Xceive XC2028 TV tuner in Korean GS8 models.

Used in system bring-up:
ARM CoreSight STM-500 System Trace Macrocell

Though they’re not terribly useful to publish, here are also some web benchmark results for reference:
JetStream 1.1 (Samsung browser): 75.710 +/- 0.26588
JetStream 1.1 (Chrome): 67.077 +/- 0.56466
Kraken 1.1 (Samsung browser): 2,342.0ms +/- 0.9%
Kraken 1.1 (Chrome): 2,837.6ms +/- 0.5%
Octane 2.0 (Samsung browser): 12,541
Octane 2.0 (Chrome): 11,322
WebXPRT 2015 (Samsung browser): 166 +/- 4
WebXPRT 2015 (Chrome): 158 +/- 3

Note that the beta of the Samsung browser is testing faster at the moment.

If anyone who works with silicon has any corrections, please let me know. I will have more to say on the GS8 in the future. For now I recommend following AnandTech’s technical hardware coverage.

Expectations for Google I/O 2017

The following notes are not intended to be comprehensive or predictive, but are merely some stray thoughts:

I have far fewer expectations this year than in previous years, as most of what I was anticipating was already introduced in the Android O Preview. Namely, this included finally making JobScheduler mandatory, and no longer allowing apps to register broadcast receivers for implicit broadcasts in their manifests.

I’m expecting the following topics to be discussed during the opening keynote: deep learning, deep learning, and deep learning. The machine learning sessions and office hours should be comically over capacity.

Regarding Project Treble, we have basically no details right now. I’m hoping the Android team will elaborate on it, at least during the Fireside Chat.

I was expecting the usual ART optimization improvements, and this year seems fruitful. The new garbage collector should be exciting. Josh Ho also suggested an obvious, albeit minor win to me last year: performing full ahead-of-time (AOT) compilation instead of profile-guided optimization (itself a form of AOT compilation) while on power for apps that have not run yet. The purpose would be strictly to improve first run performance and energy usage. I'm not expecting this to be mentioned, if it was even added.

Yigit Boyar is going to introduce some improvements for Android app architecture. As part of this, the new approach to managing app lifecycles was previously teased and will now be announced in detail.

Romain Guy is going to present on color. You would not believe how complicated the topic is. Color management on Android is still a work in progress, so even though not everything will ship this year, I have faith that everything (such as HDR) will fundamentally be addressed eventually.

I have no idea if anything will be announced at I/O, but at some point UHD content will be launched on Google Play. I’ve seen that the Chromium team is working on HDR support, which will be ready before color management is added to the browser. We may thus see HDR content limited to sRGB before UHD content is launched. I expect Google to push Dolby Vision. Either way, I don’t expect any Android devices will fully support HDR this year.

For anyone interested in Fuchsia, there are two talks on Flutter. I expect at least 50% of the slides to feature animal memes of some kind. Wise developers should embrace Flutter as the future of mobile development on Google’s platforms.

Lastly, I’m personally curious to learn more about Android Things.

Initial thoughts on the design of the Surface Laptop

Yesterday Microsoft held its Education event, where it unveiled Windows 10 S and the Surface Laptop. While I will have more to say another time, I first want to discuss the intriguing new hardware.

With the new Surface Laptop, Microsoft seems to have substantially improved the overall quality of its hardware. I would argue that its hardware to date hasn’t been truly competitive at the high end of the market, but I would still recommend Microsoft’s devices over those from every other Windows OEM due to their superior hardware-software integration. Based on my experience with the Surface Pro 4, you are going to get a much smoother, more polished experience for criteria like trackpad performance by running software tailored by Microsoft.

The company is comparing its new laptop directly to the 13" MacBook Pro, particularly emphasizing how the Laptop weighs 0.26 pounds less than the Pro. Part of the weight difference is due to the Laptop's Alcantara surface, which I find to be the most interesting engineering decision. This material choice trades off structural rigidity and thermal dissipation efficiency for lower weight and greater comfort.

It is critical to note, though, that the Laptop only offers 15W U-series Core CPUs from Intel, while the 13" MacBook Pro also offers 28W CPUs for its more expensive configurations. In other words, the Surface Laptop has been aimed at a lower TDP, and thus lower performance, target than the 13" Pro. An eventual 15” Surface Laptop with H-series CPUs now seems likely, and many would be excited by such a product. Microsoft’s concession to its OEM partners is that it is once again only competing at the very high end of the market.

First, the bad news. The Laptop features one “full-size” USB Type-A port and one Mini DisplayPort, but no Type-C ports. At this point, Microsoft’s affinity for legacy ports and eschewing of any and all progress in connector standards is comical. Enterprise usage isn’t even a real concern, so there’s really no excuse.

I also strongly recommend not buying the base configuration with only 4GB of RAM. That makes the real starting price $1,299, in my opinion.

To be frank, there have been plenty of issues with Microsoft's hardware in the past. While it hasn't skimped on battery capacity in its recent products, the company has never shipped particularly spatially efficient computers with its Surface line. For the Surface tablets this philosophy went so far as to deliberately utilize empty space internally in order to optimize weight distribution, a design decision I found bizarre.

Previously questionable design tenets seem to have been abandoned with the Surface Laptop, however, and I think Microsoft's willingness to somewhat normalize its designs has resulted in its most compelling device to date, at least on paper. And as it acknowledged on stage, that's exactly what its fans have wanted it to do: just make a normal laptop.

In the grand tradition of PC vendors harmlessly creating new marketing terms for laptop audio, Microsoft has branded its speaker implementation as Omnisonics, not to be confused with ambisonics. Since it could not cut holes into the Alcantara surface, it has instead placed its two speakers behind the keyboard, radiating sound out through the keycap edges. The result is going to be mediocre audio quality from the speakers. On the plus side, the Surface Laptop uses Dolby Audio Premium DSP algorithms and meets whatever hardware requirements the license entails.

Since the Laptop is not a tablet, I can’t think of a single good reason why the aspect ratio of the display should be 3:2 instead of 16:9 or 16:10. Panos Panay stated that Microsoft wanted to maximize display area (optimizing closer to 1:1), but this strikes me as absurd for a product that never changes orientation. I’m particularly not a fan of the aspect ratio because it makes the laptop lid more likely to wobble. (This is one of the main reasons the Surface Book form factor did not work.)

Microsoft ships the most accurate displays of any Windows OEM, but while its panels have been very bright, they haven’t been competitive on efficiency. (Seriously, please never source Panasonic panels.) Based on the rest of the system design, though, I’m hopeful that the Laptop features an efficient display. Although there are not yet many Win32 apps in the Windows Store, keep in mind that High DPI support for Win32 apps is still effectively nonexistent.

Now for the good news. One of the best things about the Surface Laptop is that Microsoft has learned from the Surface Studio color calibration debacle. Unlike with the Studio, the Laptop’s display is correctly calibrated to sRGB, because there is no system-wide color management in Windows. It’s great to see Microsoft improving like this. The company continues to individually calibrate its displays, a practice going back to the Surface 3. That is a big deal, and it should result in great color and greyscale accuracy.

Microsoft didn’t exactly ship Kaby Lake-U early, but it can tout it as a significant advantage over Apple, as Kaby Lake provides the rough equivalent of a 300MHz CPU frequency boost over Skylake. The one battery life claim made, up to 14.5 hours of video playback, also emphasizes the advantage of Kaby Lake's addition of HEVC hardware decode, which Skylake was sorely lacking.

The entire rest of the laptop basically looks great, as do the four available colors. One thing that many people have missed, though, is that the GPU and DRAM frequencies for the priciest SKUs are lower than those for the 13" MacBook Pro.

(Sidenote to Microsoft: please make the technical specifications listed in your Fact Sheets more directly accessible to prospective buyers. They are important. Compare to Apple. I would also criticize Apple’s minimal consumer disclosure, mind you.)

The combination of lower clock speeds, the Alcantara, and the size of its single fan makes me somewhat skeptical about the energy and thermal efficiency of the case design. I would expect conservative DVFS tuning. Public testing will have to wait on a review by AnandTech. Panay did, weirdly, seem to suggest that the keyboard feels warm during normal use.

Even though much of this article has been criticism and concerns, overall I have a very positive impression of the product. Microsoft is clearly on a roll, and the Surface Laptop appears to be its best hardware to date. While the device is particularly aimed at college students, especially since MacBooks have traditionally done well at US universities, I think it will sell well to a broad audience.

LG Display rumored to be investing in mobile OLED production

The Electronic Times recently reported that Google wants to invest ~$880 million in LG Display for future production of OLED displays. If this rumor is true, I suspect the potential strategic investment would not just be for securing displays for future Pixel devices, but for helping LG Display to seriously re-enter the mobile OLED market. The company previously sold flexible OLEDs to LG Electronics for its G Flex and G Flex 2 smartphones in 2013 and 2015, respectively, but I am not aware of any other smartphones ever using LGD OLED.

To date Samsung Display has been far ahead of everyone else in mobile OLED due to its vastly greater investment in the segment. LG Display could probably make reasonable OLED displays, though, if it had the financial incentive to make major investments in smaller panels. It has already proven its OLED capabilities with its Apple Watch displays, however difficult they were to make, and of course its leading W-OLED panels for the TV market.

This rumored change in strategy would probably be more about industrial design demands than display quality considerations. Several vendors, including Google, want to be able to compete with Samsung’s and Apple’s (upcoming) bleeding-edge smartphones, which strongly associate OLED displays with high-end industrial design. While OLED is not necessary at all for creating a design with minimal bezels, some or even most of these vendors likely require OLED because they want to bring curved displays to market, sacrificing some image quality in the process.

To be clear, there are many advantages (and some disadvantages) to working with OLED displays over traditional LCDs from an industrial design point of view, which I won't fully enumerate here. One of the major differences, while it sounds obvious, is that OLEDs do not have LCMs (liquid crystal modules).

No matter what, it's not at all clear that investing in OLED over LCD long term would be a smart move, and most display suppliers remain skeptical of the former. OLED is better overall now, but it has strong downsides in terms of lifetime, costs (due to lower yields), and various quality deficiencies such as severe off-angle color shifting and chromatic aliasing. LCD meanwhile constitutes the lion's share of the market. microLED won't come to market for years, but it has greater potential than OLED should its production become economically feasible.

If vendors bothered to pay for high quality displays, we would see smartphones other than Samsung’s with correctly calibrated, leading-edge OLEDs. Perhaps a vendor or two other than Apple may one day do that. For now, given Samsung Display’s massive lead, I remain skeptical that anyone can compete with it on quality over the next few years.

"Samsung introduces HDR10+ format to combat Dolby Vision"

Samsung: "Would you like another hole in your head?"
Consumer: "Yes, Samsung. Yes, I would."

We weren't necessarily going to have an HDR format war. But Samsung pushing yet another HDR standard could possibly precipitate one.

The only party that benefits from HDR10+ is Samsung.

Google retires Octane

A couple of weeks ago I caught wind of some web benchmark being killed. The first thing I did was check Octane, but the test harness was still unchanged.

Yesterday Google announced that Octane has been retired. It claims the deprecation is due to Octane being over-optimized against for years, sometimes to the detriment of real-world application performance. You can still run the benchmark, but the page notes that it is no longer being maintained.

This looks really bad. Octane was far from the worst benchmark, and it had been optimized against for years by pretty much everyone anyway. It did not suddenly become outdated overnight. (This does not in any way mean that benchmarks are somehow useless or unnecessary.)

If Google is working on a new browser benchmark, great. But it's hard not to suspect that Google killed Octane because Edge now beats Chrome on it, a result Microsoft promotes the moment you launch Edge in the Windows 10 Creators Update.

Why the new iPads are delayed

After Apple's rather surprising admission of fault yesterday about not updating the Mac Pro, I would like to address another area where the company happens to be blame-free. By now I have read every possible conspiracy theory under the sun about why Apple hasn't shipped [insert_your_desired_device_here]. Some of the most recent speculation is that Apple didn't announce new high-end iPads because some upcoming iPad-specific software is not ready yet. This is probably nonsense.

Anyone who follows mobile silicon knows how simple the current situation is in all likelihood: the iPads are delayed because Apple can't yet ship the A11X (Fusion) in sufficient volumes at its desired quality metrics (final clock speeds, etc.). More succinctly, Apple has not yet introduced an A11X iPad because 10nm is a bit of a disaster. And despite what you may read, a 10nm tablet SoC would be an A11X, not an A10X.

Everything I have heard points to both Samsung Foundry and TSMC suffering very poor yields currently, in the realm of 30-40%. 10nm is just a shrink node, but it turns out that shrinking transistors is excruciatingly challenging these days because of pesky physics. And if 10nm ends up being an outright bad node, we've seen this leaky transistor nightmare before.

If you're not familiar, 10nm from Samsung Foundry and TSMC is not at all the same as Intel's 10nm. Their 10nm is actually very comparable to Intel's 14nm, with nearly equivalent density. All node names are marketing nonsense these days anyway. 14nm was particularly egregious given the reuse of 20nm's BEOL, and TSMC didn't even call it 14nm simply because "four" sounds like "death" in Mandarin; "16nm" doesn't really exist.

Internal delays are still real delays. With yesterday as an extreme exception, Apple doesn't like to talk about products until just before they're ready to ship. When it does talk in advance, even off the record, things can go wrong, and forward-looking statements can go unfulfilled. It doesn't suffer a negative marketing impact by keeping internal delays internal, but much more importantly, it also doesn't realize the greater profits it would have earned had it been able to ship on time. The A11X delay hurts its bottom line.

The situation really is probably that simple. It's not that Apple suddenly feels like it can and should wait longer between iPad refreshes (19 months now for the 12.9" iPad Pro). And despite Intel's newly constant delays, it is often not actually to blame for Your Theoretical New Mac of Choice not being released. This is a broader topic I may address another time.

On Tizen and buffer overflows

"'It may be the worst code I've ever seen,' he told Motherboard in advance of a talk about his research that he is scheduled to deliver at Kaspersky Lab's Security Analyst Summit on the island of St. Maarten on Monday. 'Everything you can do wrong there, they do it. You can see that nobody with any understanding of security looked at this code or wrote it. It's like taking an undergraduate and letting him program your software.'"

Eh, it's Tizen. I already expected this.

"One example he cites is the use of strcpy() in Tizen. 'Strcpy()' is a function for replicating data in memory. But there's a basic flaw in it whereby it fails to check if there is enough space to write the data, which can create a buffer overrun condition that attackers can exploit. A buffer overrun occurs when the space to which data is being written is too small for the data, causing the data to write to adjacent areas of memory. Neiderman says no programmers use this function today because it's flawed, yet the Samsung coders 'are using it everywhere.'"

...

Sometimes reblogging takes the form of a desperate prayer that people will finally care about how unbelievably bad things are.

ARM's big announcements

While ARM's big.LITTLE has evolved from its initial simplistic CPU migration and cluster migration iterations to simultaneous multi-processing (global task scheduling) and then to energy-aware scheduling, the strict segmentation between big and LITTLE clusters has remained non-ideal. For example, the efficiency of shared memory access among CPUs and the speed of task migration have been significant downsides. Yesterday ARM announced the future of multi-core CPU design for its IP ecosystem, a series of technologies collectively branded DynamIQ, which has major implications for SoC design.

I believe the reason DynamIQ took so long to announce, and was probably a ton of work to bring about, is the number of interconnected systems that had to be redesigned in concert. And it was probably harder still to get them all working together well and efficiently. New interconnect designs and features are necessary to make the newly possible cluster designs work, and there is a new memory subsystem design for which no details have yet been provided. IP blocks such as accelerators can also now be plugged into these fabrics through a new dedicated low-latency port.

One of the biggest takeaways is that it will finally be possible to combine different core designs together within a cluster. If all of the enabling cache design and scheduling are well-implemented, such CPU designs could theoretically realize significant performance and efficiency improvements. Hopefully a DynamIQ system will not be too much harder to implement for silicon vendors, but I wouldn't assume it will be easy. ARM will really have to make it workable with its stock IP at the very least.

It's hard to say much more about DynamIQ, as ARM is still holding back most of the important details, which I would not really be qualified to talk about anyway. There are other announcements such as new CPU instructions for deep learning, which I personally care less about but which are still very important for many designs, such as IoT systems without a DSP or GPU. Since the Cortex-A53's successor is likely coming at some point, depending on ARM's design goals, I wonder if the first DynamIQ systems will be based entirely on that new core.