Exploring UI Fundamentals for visionOS

Last updated: Jul 4, 2023

Draggable Elements

There is no hover state on visionOS.
And while there have never been hover states on mobile either, there’s a difference this time.
The input method in visionOS, eye tracking, physically allows for hover states: the device knows your selection (where you’re looking) before you perform the action. Apple just decided not to expose your eye position to apps and websites for privacy reasons.
But the OS has that information.
This is interesting, because it means apps and websites can’t design their own hover states to indicate what interactions are possible for the element you’re hovering over. But the system can automatically apply a system style, which it does with a subtle glow.
And since the system knows not just what you’re looking at but also whether that element is clickable, scrollable, draggable, and so on, there’s an opportunity to define how the system UI could visually indicate the type of interaction possible for the object you’re looking at.
The equivalent of macOS cursor styles for a gaze-based input method.
On macOS, the cursor can change its appearance to communicate what kind of interaction the object beneath it supports, dragging for example. Since there is no traditional cursor on visionOS, the only other place to display such an indication is the affected element itself.
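As a side note, this privacy model is already visible in SwiftUI’s public API. In the sketch below (assuming a visionOS SwiftUI target), the app only declares that a view should receive the system hover treatment; the OS renders the glow itself and never tells the app where the user is looking.

```swift
import SwiftUI

struct GazeAwareCard: View {
    var body: some View {
        Text("Draggable Card")
            .padding(32)
            .background(.regularMaterial, in: RoundedRectangle(cornerRadius: 16))
            // Opt this custom view into the system's gaze highlight.
            // The glow is drawn by the OS; the app receives no eye data.
            .hoverEffect(.highlight)
    }
}
```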
Here’s how this could look.

--- video concept here ---

To indicate that a UI element is draggable, the system could display a drag handle below the element you’re looking at, similar to the one beneath every window in visionOS.
Once you look directly at this drag handle and tap and hold your fingers, the element would move with your eyes, which act as the cursor. Once you spread your fingers again, the element would be placed in its new position.
This UI would automatically be applied by the system to every draggable element you're looking at.
In addition to using the drag handle, elements should most likely also allow the user to drag them by looking at and dragging the element itself. In the future, this might even become the more common method. But I can imagine that a new input method like the one in visionOS might have to help users get accustomed to certain patterns by displaying dedicated controls.
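There’s no public API for a system-provided handle like the one I’m describing, but an individual app can approximate the pattern today. Here’s a minimal SwiftUI sketch, with an illustrative capsule standing in for the handle; note that with the public gesture API the element follows your pinching hand rather than your eyes, which is exactly why gaze-driven dragging would have to come from the system.

```swift
import SwiftUI

struct DraggableCard: View {
    // Committed position plus the in-flight drag translation.
    @State private var position: CGSize = .zero
    @GestureState private var dragOffset: CGSize = .zero

    var body: some View {
        VStack(spacing: 8) {
            Text("Card")
                .padding(40)
                .background(.regularMaterial, in: RoundedRectangle(cornerRadius: 16))

            // Illustrative stand-in for the handle, echoing the bar
            // below every window in visionOS.
            Capsule()
                .fill(.secondary)
                .frame(width: 60, height: 6)
                .hoverEffect(.highlight)
                .gesture(
                    DragGesture()
                        .updating($dragOffset) { value, state, _ in
                            state = value.translation
                        }
                        .onEnded { value in
                            // Commit the translation once the pinch is released.
                            position.width += value.translation.width
                            position.height += value.translation.height
                        }
                )
        }
        .offset(x: position.width + dragOffset.width,
                y: position.height + dragOffset.height)
    }
}
```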

Text Input

In visionOS, users can look at an input field and start talking to enter text. Based on Apple’s presentation, it looks like this will always replace any text that was previously inside the input field.
I wanted to explore whether it would be possible to allow voice input wherever the caret (the blinking line) currently sits, so that dictation could also be used for composing longer pieces of text, potentially switching back and forth between keyboard and voice input on the fly.
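To make the idea concrete: UIKit’s text input API already exposes the piece an at-caret dictation mode would need. The sketch below leaves out the speech recognition side entirely (insertDictation is my own hypothetical helper) and only shows inserting recognized text at the current selection instead of replacing the whole field.

```swift
import UIKit

/// Hypothetical helper: insert dictated text at the caret rather than
/// replacing the field's entire contents. The speech recognition side
/// (e.g. SFSpeechRecognizer) is omitted; this only shows the insertion.
func insertDictation(_ dictatedText: String, into textView: UITextView) {
    // selectedTextRange is the caret (an empty range) or the current
    // selection; replacing it mirrors how typed input behaves, so
    // keyboard and voice input could be mixed freely.
    guard let caret = textView.selectedTextRange else { return }
    textView.replace(caret, withText: dictatedText)
}
```

Since visionOS runs UIKit apps, the same machinery should carry over there.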
Here’s how that could work.

--- video concept here ---

What do you think? Let's discuss!

Originally Published: Jul 4, 2023