How WWDC 21 Foreshadows AR Glasses
14 June 2021
Rumours of Apple's AR (or VR) glasses have been picking up over the last 12 months, and in particular, the AR version still feels like a futuristic impossibility to me. How are they going to get all that technology into a sleek pair of glasses?
I've always imagined they would look something like Tim Cook's glasses, allowing for an "and I've been wearing them this entire time" type reveal in a keynote.
Ignoring the hardware leap needed, the other thing I have struggled to understand is how the glasses are going to be used in day-to-day life. Why are they going to be useful?
There are obvious examples like overlaying the weather when you look up to the sky, but I've struggled to imagine much beyond that. Until WWDC 2021, that is.
When I think about AR glasses, I feel like there are two high-level requirements:
- To transform the inputs (including other devices like the watch) into a context.
- To provide utility based on that context.
With those two requirements in mind, let's take a look at some of the WWDC 21 features that could be foundational technologies for a future AR device.
Live Text
Live Text turns any text in an image (or in the visual view of the hypothetical glasses) into digital data that can be processed, categorised and understood.
- Critical to helping understand the wearer's context.
- When abroad, auto-translate features could replace foreign text with a translation into your own language to help you understand.
- App overlays (contextual information) could be triggered based on the context of the text. For example, when looking at a recipe, a recipe app can bring up a formatted summary. The list of potential contextual overlays is endless.
Visual Look Up
Visual Look Up is Apple's attempt at taking machine learning image classification to the next level. Whereas previously they could detect a building, dog, or flower in an image, they can now detect the exact landmark, dog breed and type of flower.
- You could choose which map or travel app you want to provide utility for landmark information. It could bring in other metadata like ticket prices, opening times, reviews.
- Object recognition could allow you to see the latest prices or auctions for a Fender Telecaster guitar, for instance. I see a future where apps can declare which 'things' they can provide contextual utility for, and the user picks their preferred app.
- For objects with more specific app needs, App Clip Codes or QR codes would be able to give the glasses more information about what type of app or contextual overlay is relevant.
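The "apps declare what they can handle, the user picks a favourite" idea above can be sketched as a small registry. This is purely a hypothetical illustration in Python, not an Apple API: `OverlayRegistry`, the category strings and the app names are all invented for the example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OverlayRegistry:
    """Hypothetical registry mapping recognised categories to apps."""

    # category -> apps that declared support, in registration order
    providers: dict[str, list[str]] = field(default_factory=dict)
    # category -> the app the wearer explicitly picked
    preferred: dict[str, str] = field(default_factory=dict)

    def register(self, app: str, categories: list[str]) -> None:
        """An app declares which categories it can provide overlays for."""
        for category in categories:
            self.providers.setdefault(category, []).append(app)

    def set_preferred(self, category: str, app: str) -> None:
        """The wearer picks their preferred app for a category."""
        if app not in self.providers.get(category, []):
            raise ValueError(f"{app} has not registered for {category}")
        self.preferred[category] = app

    def resolve(self, category: str) -> str | None:
        """Return the app that should handle a recognised category."""
        apps = self.providers.get(category, [])
        if not apps:
            return None  # nothing registered: show no overlay
        # Fall back to the first registered app if no preference is set.
        return self.preferred.get(category, apps[0])


# Made-up apps and categories, for illustration only.
registry = OverlayRegistry()
registry.register("MapsApp", ["landmark"])
registry.register("TravelGuide", ["landmark"])
registry.register("AuctionWatch", ["guitar"])
registry.set_preferred("landmark", "TravelGuide")
```

Falling back to the first registered app is just one possible default; a real system might instead ask the wearer the first time a new category is recognised.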
Audio
Alongside visual input, audio is the other key environmental input that can be used to help understand the context.
- Knowing who is speaking to you, and being able to isolate that conversation.
- Improved isolation of your voice will be important for controlling glasses with voice. I imagine being able to whisper to them, to avoid the 'talking out loud' social taboo.
- Identifying different types of sounds (vehicles, alarms, animals). One WWDC session explains that there are now over 300 categories of sounds built into iOS that developers can recognise with very little work.
- Spatial audio all the things. In particular, for VR this will help provide a more immersive experience.
- What about isolating the person who is speaking to you in a foreign language, reducing the volume of their voice, and replacing the audio with a real-time translation into your preferred language?
Maps
Navigation has always been an obvious use case for AR glasses. The latest features announced this year support this.
- Recognition of buildings combined with accurate maps will help identify the context the wearer is in. Are they inside a cinema currently? Are they walking down a high street full of shops?
- Understanding where they are and what they are doing allows for improvements in navigation, such as the subway exit example Apple showed off.
Notification & Focus
The biggest worry about having a computer strapped onto your eyes all day is information overload, and I can't help but feel the tweaks to notification priorities and Focus are a gentle nod to the future.
- The notification summary would be very useful on the glasses, prompting you to pull out your phone if there is anything of interest.
- Only time sensitive notifications would show on the glasses.
- Focus mode would further reduce the amount of 'information noise' hitting your glasses.
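Those three rules can be sketched as a simple routing function. This is a hypothetical model in Python, not Apple's implementation: the `Level` values mirror the notification interruption levels introduced in iOS 15, while `route`, the `focus_allowed` parameter and the exact way Focus is treated are my own assumptions.

```python
from __future__ import annotations

from enum import IntEnum


class Level(IntEnum):
    """Mirrors the iOS 15 notification interruption levels."""

    PASSIVE = 0
    ACTIVE = 1
    TIME_SENSITIVE = 2
    CRITICAL = 3


def route(level: Level, sender: str, focus_allowed: set[str] | None = None) -> str:
    """Decide where a notification goes: "glasses" or "summary".

    focus_allowed is None when no Focus is active; otherwise it is the
    set of senders the current Focus lets through (an assumption).
    """
    # Anything below time sensitive waits for the notification summary.
    if level < Level.TIME_SENSITIVE:
        return "summary"
    # An active Focus filters further; critical alerts always get through.
    if focus_allowed is not None and level != Level.CRITICAL:
        if sender not in focus_allowed:
            return "summary"
    return "glasses"
```

For example, `route(Level.TIME_SENSITIVE, "Calendar")` would go to the glasses, but the same notification would drop into the summary while a Focus that only allows "Messages" is active.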
I've pulled out the features and capabilities that I think would be foundational for Apple's upcoming AR glasses, and I've skipped a few obvious ones (ARKit, RealityKit 2 and Object Capture, just to name a few). For every feature, service or product Apple announces, ask yourself: could this have come from research and development into AR glasses? Or how is this going to be foundational for the feature set of AR glasses? You'll start seeing the breadcrumbs everywhere.