LeapMotion and Chill: Netflix Gesture Control

Back for another installment of HCI Assignments 2018 (if you have any other ideas for what I should call these, let me know). If you'd like to read about my comparatively more difficult chat bot journey from the last assignment, check it out here. If not, no hard feelings; I've got another one ready for ya.

The Assignment: do a conceptual design and early user testing of the Leap Motion Controller as a non-VR, non-sign-language input device for an existing desktop or mobile app.

As a primer, check out this video from the Leap Motion creators showing the various ways it could be implemented.

Pretty cool right? Yeah, I think so, too.

According to its creators, the Leap Motion Controller is "the most advanced hand tracking" device on Earth, using infrared LEDs and expertly developed software to create a magical experience where your hands directly control your device. Not only is it paving the way for future virtual and augmented reality, but it opens a door to a whole space of interaction and experience designs.

Before receiving the assignment, Professor Li brought his own Leap Motion Controller into class for us to play around with. This was a great opportunity for us to interact with the device and learn both its capabilities and its limitations. I naturally get overexcited by new technologies, but I definitely noticed some of the controller's limitations. It is meant to lie flat on a horizontal surface and, because of this, it actually has trouble along the axis perpendicular to that surface. For example, the device had trouble detecting the two hands as separate when they overlapped, and it couldn't tell the difference between thumb up and thumb down when the hand was held perpendicular to it. In addition, some of the smaller gestures, like pinch-in or pinch-out, have to be performed at an angle where the finger motions, and even the fingers themselves, can be distinguished; performing a 'pinch' perpendicular to the device (moving your fingers apart vertically) makes the gesture difficult to detect. After noting these limitations, we proceeded to tackle the assignment.
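To make the occlusion problem concrete, here is a minimal sketch of how you might flag those troublesome poses, assuming the classic Leap Motion v2 Python SDK (`import Leap`); the 0.5 and 0.3 thresholds are my own guesses, not values from the SDK:

```python
import Leap

def check_tracking_quality(controller):
    """Poll one frame and flag the hand poses the device struggles with."""
    frame = controller.frame()
    for hand in frame.hands:
        # confidence drops toward 0 when fingers occlude each other,
        # e.g. a pinch performed edge-on to the device
        if hand.confidence < 0.5:
            print("Low-confidence pose; gestures may be misread")
        # palm_normal points out of the palm; when it is nearly horizontal,
        # the hand is edge-on to the device and self-occluding
        if abs(hand.palm_normal.y) < 0.3:
            print("Hand held perpendicular; thumb up/down is ambiguous")

controller = Leap.Controller()
check_tracking_quality(controller)  # in a real script, wait for is_connected first
```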

As with any design problem, my partner Kevin and I started by defining the problem we were trying to solve. Combining Professor Li's guidelines with our interaction with the device, we came up with some general focus points:

  1. The app's functions/layout should be ideal for gesture integration (i.e. it should make sense for gestures to be used with the app)
  2. The gesture designs should complement use of the desktop/mobile app, not make it harder
  3. The gestures should be appropriate for the task and easy to understand/intuitive
  4. The gestures should be appropriate for interaction with the Leap Motion Controller
We also felt that we needed to focus on who we were designing for, but it felt like we were just translating functions that were already designed for everyone into something gesture-based. Unlike the past assignments, this one challenged us to design intuitive gestures to trigger existing functions, rather than to redesign the app for a specific target demographic. If anything, we decided to make the gestures as intuitive as possible so that everyone who gets the chance to use them would be able to.

With our sights aligned, we began the ideation process, starting with potential apps. Our initial thoughts brought us to Maps (Google or Apple) or Tinder. These were definitely viable options, as their interfaces afford gestural integration; however, I pushed to come up with apps that challenged us to go beyond the usual zoom, swipe, and rotate gestures. That's where Netflix comes into the picture.


Screenshot of the interface for the Windows 10 Netflix app.

Netflix's interface in general is great for gestural integration, giving us lots of space for creativity in the gesture design. On my laptop, for instance, scrolling is mapped to two-finger swipe gestures, but there are other functions, like video controls, that aren't gesture-based and very well could be.

With the app chosen, we played around with the interface and decided which functions were worth turning into gestures, as well as which gestures should trigger them. Some of the functions, like captions and settings, didn't immediately suggest an intuitive gesture, so we set those aside while we tackled the more intuitive ones. After about an hour, we compiled the following list of function-gesture combinations:

Main Menu

  • Swipe left and right - scroll horizontally through a row of titles (i.e. through titles under "New Releases")
  • Swipe up and down - scroll vertically through the sections of titles
  • Pinch out (Zoom in)/Pinch in (Zoom out) - make a selection/exit a selection
  • "Gun shot" - play selection
  • Draw search icon - open search (an additional interface would appear to allow the user to easily enter search queries; I pictured something similar to the WrisText interface. Since we don't have the time/resources to create the interface, we won't be able to test the actual usability of this feature.)
Video Control
  • Clockwise/Counterclockwise circle gesture - fast forward/rewind
  • "Stop motion" (Palm out) - pause video
  • "Gun Shot" - play video
  • Turning motion with three fingers (like turning a knob) - change volume
I kinda wish we had videotaped our ideation process because, as we moved from function to function, there was a lot of gesturing happening in tandem with our internal ideation and external conversation as we tried to find intuitive gestures that everyone would understand. The easiest gestures to design were the ones that were already culturally accepted elsewhere: the scroll functions are a cultural norm in the age of touchscreens, and the stop motion is a cultural signal for "stop." The others, though not as straightforward, felt intuitive to us. For example, the pinch gestures are reminiscent of zooming in to get a closer look at something and then zooming out when you are done looking at it. The circle gestures were based on the rewind arrow in the video interface: it is an arrow going around a circle in the counterclockwise direction, hence our choice of gesture. Since fast-forward is the opposite of rewind, we figured its gesture should be the opposite as well, a circle drawn clockwise. The volume gesture was skeuomorphic in nature, in that we pictured the user turning a dial as they would on a radio or old-school television. Lastly, the "gun shot" gesture was a bit of an outlier, chosen more as a fun, creative idea than an intuitive one. We ditched it in later iterations.
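Conveniently, the circle direction is something the Leap SDK can already distinguish. A minimal sketch, again assuming the v2 Python SDK (the clockwise test is lifted from the SDK's sample listener):

```python
import Leap

def circle_to_action(gesture):
    """Map a circle gesture's direction to fast-forward or rewind."""
    circle = Leap.CircleGesture(gesture)
    # If the finger's direction is within 90 degrees of the circle's normal,
    # the circle is being drawn clockwise (test from the SDK sample code).
    if circle.pointable.direction.angle_to(circle.normal) <= Leap.PI / 2:
        return "fast-forward"   # clockwise
    return "rewind"             # counterclockwise

controller = Leap.Controller()
controller.enable_gesture(Leap.Gesture.TYPE_CIRCLE)
for gesture in controller.frame().gestures():
    if gesture.type == Leap.Gesture.TYPE_CIRCLE:
        print(circle_to_action(gesture))
```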

After creating this initial list, we were ready to perform the first user test. For all of the user tests, we used a USB wireless mouse to simulate the users controlling the interface with their gestures (a.k.a. Wizard-of-Oz-ing). I also used my phone as a stand-in for the controller, marking where the users should perform their gestures. For those who were unfamiliar with the device, we showed them the intro video from the beginning of this post.

Our friend Leo, another avid user of Netflix who is very familiar with the app's interface, agreed to help us out for the first test. We understood that what seemed intuitive to us might not actually be as intuitive for other people, so we used this opportunity to learn more about other people's intuitions. Instead of giving Leo our list of gestures to perform, we asked him to perform a list of tasks (i.e. select a title in "New Releases") with whatever gesture he felt would accomplish it. Here is a video of this user test:



We had trouble getting the app to work, so we used the web version; however, the two are quite similar, and we were more interested in the thoughts behind his gestures. After the test, we noticed some differences between our gestures and the ones Leo performed.

  1. To select a video, Leo put his hand out and 'guided' the cursor. When the cursor was over the title, he pushed his hand forward as if to 'press' on the selection. The idea of guiding the cursor was something we just didn't consider at all in our original design.
  2. While watching a show, Leo performed the stop motion to pause the video and then performed it again to play it. As I noted earlier, the "gun shot" gesture was more of a playful idea, and it became very clear that it was an unintuitive design choice.
  3. To control the volume, Leo put his hand flat above my phone and raised/lowered it to increase/decrease the volume instead of turning a dial.
This test revealed some valuable information that we hadn't really considered before. First, not all of the functions need gestures: instead of making gestures for settings and captions, we could just let the user guide the cursor to those menus and push forward to select them. After selecting, a user should be able to exit a selection, so we made "pull back" the gesture for "exit selection." This made sense to us since Netflix's controls seemed to come in pairs (i.e. play/pause, select/exit, fast-forward/rewind). That led to the second realization: some of the gestures could be combined, which would reduce the number of gestures the user has to remember. While a show or movie is playing, performing the stop motion pauses the video, and performing it a second time resumes it. Finally, the gestures should be as simple as possible: the volume control could be as simple as raising and lowering the hand rather than something more intricate like turning a dial.
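One way to implement that pairing is a simple toggle, debounced so a single palm-out doesn't register twice. A sketch, with `send_to_netflix` as a purely hypothetical stand-in for whatever would drive the real player:

```python
import time

def send_to_netflix(action):
    """Hypothetical stand-in for whatever would drive the actual player."""
    print("->", action)

class PalmOutToggle:
    """One gesture, paired functions: palm-out alternates pause and play."""
    def __init__(self, debounce_s=0.8):
        self.playing = True
        self.last_fired = 0.0
        self.debounce_s = debounce_s

    def on_palm_out(self):
        now = time.time()
        # ignore repeat detections while the hand is still held up
        if now - self.last_fired < self.debounce_s:
            return
        self.last_fired = now
        self.playing = not self.playing
        send_to_netflix("play" if self.playing else "pause")

toggle = PalmOutToggle()
toggle.on_palm_out()   # pauses
time.sleep(1.0)        # outlast the debounce window
toggle.on_palm_out()   # resumes
```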

That last point about simplicity also speaks to our fourth focus point: the gestures should be appropriate for interaction with the Leap Motion Controller. I wasn't able to test our original volume gesture with the controller itself, but I suspect the device would have a hard time detecting it. A lot of the dial-shaped volume controls we are used to seeing are mounted vertically, so I assume the mental model of a volume dial matches this image. If that's true, then users would have trouble with the gesture because the thumb would most likely block the other fingers from view, making it hard for the device to even detect the initial shape of the hand, let alone the motion it is making.

Here we come to the second iteration of our gesture design:

 Main Menu 

  • Swipe left and right - scroll horizontally through a row of titles (i.e. through titles under "New Releases")
  • Swipe up and down - scroll vertically through the sections of titles
  • Push forward/Pull back - make a selection/exit a selection or video
  • Draw search icon - open search (an additional interface would appear to allow the user to easily enter search queries, something similar to the WrisText interface)
Video Control
  • Clockwise/Counterclockwise circle gesture - fast forward/rewind
  • "Stop motion" (Palm towards the screen) - play/pause video
  • Raise/lower hand - increase/decrease volume
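The raise/lower volume gesture also happens to be the friendliest one for the device, since palm height is just the y-coordinate of the tracked palm position. A sketch of how the mapping might look, again assuming the v2 Python SDK; the height band is an assumed comfortable range, not a measured one:

```python
import Leap

# Assumed comfortable operating band above the device, in millimeters
MIN_HEIGHT_MM, MAX_HEIGHT_MM = 100.0, 350.0

def palm_height_to_volume(frame):
    """Map the palm's height above the controller to a 0-100 volume."""
    if frame.hands.is_empty:
        return None
    # palm_position is in millimeters relative to the device; y is height
    y = frame.hands[0].palm_position.y
    fraction = (y - MIN_HEIGHT_MM) / (MAX_HEIGHT_MM - MIN_HEIGHT_MM)
    return round(100 * max(0.0, min(1.0, fraction)))

controller = Leap.Controller()
volume = palm_height_to_volume(controller.frame())
if volume is not None:
    print("volume ->", volume)
```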
After tweaking the design, we found another user to test with: my friend Phoebe, a senior who is also a chronic Netflix user. She was unfamiliar with the Leap Motion device, so we showed her the Introduction to Leap Motion video. In this test, we actually wanted to see how intuitive our design was, so we gave Phoebe some time to look over the controls and familiarize herself with them. Then we asked her to perform a series of tasks, again simulating control using the USB wireless mouse. Here is a video of her interaction:



We were quite happy with the outcome of this user test. The only thing I found really interesting was that instead of "pulling back" with a flat hand, she used an upright hand (sort of a stop signal) while pulling back to exit the selection. Otherwise, she had no trouble performing the gestures to navigate the interface.

No changes were made after Phoebe's test, but we wanted to run it one more time to make sure we didn't miss anything. The lovely computer whiz, William, walked into the room a few seconds after Phoebe had finished and graciously donated his time to our cause. We prepared him the same way we did Phoebe and asked him to perform the same tasks. Here is his interaction with the interface:



William took a second to learn the interface. Instead of pulling back to exit, he guided the cursor to the 'X' to close the selection; not too long after, though, he started performing the designed gesture. When he got to the search function, he also performed our gesture, but he posed a great question at the end: why is that necessary if you could just move the cursor to select it? He had a point. And if done haphazardly, drawing the icon could be registered as a series of simple swipes. Because of this, I decided to remove that gesture altogether, but kept the idea that a separate interface would come up to input letters so that the flow of using the app with gestures is not interrupted.

Taking into consideration the final two user tests, here is the final iteration of our gesture concept for Netflix:



 Main Menu 

  • Swipe left and right - scroll horizontally through a row of titles (i.e. through titles under "New Releases")
  • Swipe up and down - scroll vertically through the sections of titles
  • Push forward/Pull back - make a selection/exit a selection or video
  • When search is opened, an additional interface would appear to allow the user to easily enter search queries (something like the WrisText interface)
Video Control
  • Clockwise/Counterclockwise circle gesture - fast forward/rewind
  • "Stop motion" (Palm towards the screen) - play/pause video
  • Raise/lower hand - increase/decrease volume
The latter two user tests also showed that the gestures are learnable and intuitive. Both Phoebe and William performed the gestures with general ease and with little confusion about what they should do. William quickly picked up on the pull-back gesture, and I bet that, given more time with the system, Phoebe would have learned that the "correct" way to perform it is with a flat hand. Our goal was to design a system of gestures that worked with Netflix and worked for all of its users. The only feature we didn't test was the WrisText-esque typing interface, but the rest of the gestures appear to hit the mark: they are all fairly simple, so users from all walks of life can perform them, and there are few enough of them that the system is easy to learn.

The one thing that came up in all of the user tests and seemed troublesome was controlling how far the fast-forward gesture sends the user. The rewind default is 10 seconds, but there is no set increment for the fast-forward feature. If we actually implemented this, I would set the fast-forward to something like 30 seconds or give the user the ability to scrub through the episode's or movie's scenes.
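The SDK even offers a natural handle for "how far": a circle gesture's `progress` property counts completed revolutions, so each full circle could skip one fixed increment. A sketch under those assumptions (the 30-second step is my pick, not Netflix's):

```python
import Leap

FF_STEP_S = 30   # assumed fast-forward increment (not Netflix's actual value)
RW_STEP_S = 10   # matches the rewind default noted above

def seek_seconds(gesture):
    """One full revolution of the circle gesture = one skip increment."""
    circle = Leap.CircleGesture(gesture)
    turns = int(circle.progress)  # progress counts completed revolutions
    clockwise = circle.pointable.direction.angle_to(circle.normal) <= Leap.PI / 2
    return turns * (FF_STEP_S if clockwise else -RW_STEP_S)
```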

Overall, the assignment was pretty interesting. It allowed us to challenge our assumptions about intuition with gestures and to consider the various ways in which hand gestures could be implemented to control our electronic devices. I am fairly happy with the outcome of the gesture designs and I am excited to see where the future of gesture interactions takes us!
