Building UIs in Figma with hand movements
Charlie Gerard (@devdevcharlie)


Published: Jan 20, 2022

Post originally shared on my blog.

Since the release of the latest version of the MediaPipe handpose detection machine learning model, which allows the detection of multiple hands, I've wanted to try using it to create UIs. Here's the result of a quick prototype built in a few hours!

Before starting this, I also came across two projects mixing TensorFlow.js and Figma: one by Anthony DiSpezio to turn gestures into emojis and one by Siddharth Ahuja to move Figma's canvas with hand gestures.

I had never made a Figma plugin before but decided to look into it to see if I could build one to design UIs using hand movements.

The first thing to know is that you can't test your plugins in the web version, so you need to install the desktop version while you're developing.

Then, even though you have access to some Web APIs in a plugin, access to the camera and microphone isn't allowed for security reasons, so I had to figure out how to send the hand data to the plugin.

The way I went about it was to use Socket.io to run a separate web app that handles the hand detection and sends specific events to my Figma plugin via WebSockets.

Here's a quick visualization of the architecture:

A web app runs the hand detection with TensorFlow.js and sends events such as pinch or zoom to the Express server with Socket.io running on port 8080. The Figma plugin listens on the same port for the zoom event and, when it's received, triggers different actions in the Figma UI.

Gesture detection with TensorFlow.js

In my separate web app, I am running TensorFlow.js and the hand pose detection model to get the coordinates of my hands and fingers on the screen and create some custom gestures.
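
To give an idea of the setup around it, here's a rough sketch of what the detection loop can look like, assuming the @tensorflow-models/hand-pose-detection package with the TF.js runtime and a video element that is already streaming the webcam (the exact setup in my app may differ slightly):

import "@tensorflow/tfjs-backend-webgl";
import * as handPoseDetection from "@tensorflow-models/hand-pose-detection";

const video = document.querySelector("video");

const init = async () => {
  // Create a detector for the MediaPipe Hands model, allowing up to 2 hands
  // so that left and right hand gestures can be combined.
  const detector = await handPoseDetection.createDetector(
    handPoseDetection.SupportedModels.MediaPipeHands,
    { runtime: "tfjs", maxHands: 2 }
  );

  const detect = async () => {
    // Each detected hand exposes its `handedness` ("Left"/"Right") and named
    // `keypoints` with x/y coordinates, which the gesture code below relies on.
    const hands = await detector.estimateHands(video);

    // ...gesture detection goes here (see the "zoom" sample below)...

    requestAnimationFrame(detect);
  };

  detect();
};

init();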

Without going into too much detail, here's a code sample for the "zoom" gesture:

let leftThumbTip,
    rightThumbTip,
    leftIndexTip,
    rightIndexTip,
    leftIndexFingerDip,
    rightIndexFingerDip,
    rightMiddleFingerDip,
    rightRingFingerDip,
    rightMiddleFingerTip,
    leftMiddleFingerTip,
    leftMiddleFingerDip,
    leftRingFingerTip,
    leftRingFingerDip,
    rightRingFingerTip;

let palmLeft = false;
let palmRight = false;

if (hands && hands.length > 0) {
  hands.forEach((hand) => {
    if (hand.handedness === "Left") {
      //---------------
      // DETECT PALM
      //---------------
      leftIndexTip = hand.keypoints.find(
        (p) => p.name === "index_finger_tip"
      );
      leftMiddleFingerTip = hand.keypoints.find(
        (p) => p.name === "middle_finger_tip"
      );
      leftRingFingerTip = hand.keypoints.find(
        (p) => p.name === "ring_finger_tip"
      );
      leftIndexFingerDip = hand.keypoints.find(
        (p) => p.name === "index_finger_dip"
      );
      leftMiddleFingerDip = hand.keypoints.find(
        (p) => p.name === "middle_finger_dip"
      );
      leftRingFingerDip = hand.keypoints.find(
        (p) => p.name === "ring_finger_dip"
      );

      // Fingers are straight when each tip is above (smaller y than) its DIP joint
      if (
        leftIndexTip.y < leftIndexFingerDip.y &&
        leftMiddleFingerTip.y < leftMiddleFingerDip.y &&
        leftRingFingerTip.y < leftRingFingerDip.y
      ) {
        palmLeft = true;
      } else {
        palmLeft = false;
      }
    } else {
      //---------------
      // DETECT PALM
      //---------------
      rightIndexTip = hand.keypoints.find(
        (p) => p.name === "index_finger_tip"
      );
      rightMiddleFingerTip = hand.keypoints.find(
        (p) => p.name === "middle_finger_tip"
      );
      rightRingFingerTip = hand.keypoints.find(
        (p) => p.name === "ring_finger_tip"
      );
      rightIndexFingerDip = hand.keypoints.find(
        (p) => p.name === "index_finger_dip"
      );
      rightMiddleFingerDip = hand.keypoints.find(
        (p) => p.name === "middle_finger_dip"
      );
      rightRingFingerDip = hand.keypoints.find(
        (p) => p.name === "ring_finger_dip"
      );

      if (
        rightIndexTip.y < rightIndexFingerDip.y &&
        rightMiddleFingerTip.y < rightMiddleFingerDip.y &&
        rightRingFingerTip.y < rightRingFingerDip.y
      ) {
        palmRight = true;
      } else {
        palmRight = false;
      }

      if (palmRight && palmLeft) {
        // Both palms detected: emit the distance between middle fingertips as the zoom value
        socket.emit("zoom", rightMiddleFingerTip.x - leftMiddleFingerTip.x);
      }
    }
  });
}

This code looks a bit messy but that's intended. The goal was to validate the hypothesis that this solution would work before spending some time improving it.

What I did in this sample was check that the y coordinate of the tips of my index, middle and ring fingers was smaller than the y coordinate of their DIP joints, because that means the fingers are straight and I'm doing some kind of "palm" gesture.
Once it is detected, I emit a "zoom" event and send the difference in x coordinates between my right middle finger and left middle finger to represent some kind of width.

Express server with socket.io

The server side uses express to serve my front-end files and socket.io to receive and emit messages.

Here's a code sample of the server listening for the zoom event and emitting it to other applications.

const express = require("express");
const app = express();
const http = require("http");
const server = http.createServer(app);
const { Server } = require("socket.io");
const io = new Server(server);

app.use("/", express.static("public"));

io.on("connection", (socket) => {
  console.log("a user connected");

  socket.on("zoom", (e) => {
    io.emit("zoom", e);
  });
});

server.listen(8080, () => {
  console.log("listening on *:8080");
});

Figma plugin

On the Figma side, there are two parts. A ui.html file is usually responsible for showing the UI of the plugin and a code.js file is responsible for the logic.
My HTML file starts the socket connection by connecting to the same port as the one used in my Express server and sends the events to my JavaScript file.
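
For completeness, a plugin also needs a manifest.json wiring these two files together; a minimal one looks roughly like this (the name and id below are placeholders, as Figma generates the real id):

{
  "name": "hand-gestures-plugin",
  "id": "0000000000000000000",
  "api": "1.0.0",
  "main": "code.js",
  "ui": "ui.html",
  "editorType": ["figma"]
}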

For example, here's a sample to implement the "Zoom" functionality:

In ui.html:

<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/4.4.1/socket.io.js"></script>
<script>
  var socket = io("ws://localhost:8080", { transports: ["websocket"] });
</script>

<script>
  // Zoom zoom
  socket.on("zoom", (msg) => {
    parent.postMessage({ pluginMessage: { type: "zoom", msg } }, "*");
  });
</script>

In code.js:

figma.showUI(__html__);
figma.ui.hide();

figma.ui.onmessage = (msg) => {
  // Messages sent from ui.html
  if (msg.type === "zoom") {
    const normalizedZoom = normalize(msg.msg, 1200, 0);
    figma.viewport.zoom = normalizedZoom;
  }
};
const normalize = (val, max, min) =>
  Math.max(0, Math.min(1, (val - min) / (max - min)));

According to the Figma docs, the zoom level needs to be a number between 0 and 1, so I am normalizing the coordinates I get from the hand detection app to be a value between 0 and 1.
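With the values above, a distance of 600 pixels between the two middle fingertips becomes 600 / 1200 = 0.5, i.e. a 50% zoom level, and anything above 1200 pixels is clamped to 1.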

So as I move my hands closer or further apart, I am zooming in or out on the design.

[GIF showing me moving my hands closer and further apart to zoom in and out of a UI design]

It's a pretty quick walkthrough but from there, any custom gesture from the frontend can be sent to Figma and used to trigger layers, create shapes, change colors, etc!
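
As an illustration, here's a small sketch of what a hypothetical "pinch" event (forwarded from ui.html the same way as "zoom") could do in code.js to add a shape to the canvas:

// Inside the same figma.ui.onmessage handler as the "zoom" case above
if (msg.type === "pinch") {
  // Create a 100x100 rectangle in the middle of the current viewport
  const rect = figma.createRectangle();
  rect.resize(100, 100);
  rect.x = figma.viewport.center.x;
  rect.y = figma.viewport.center.y;
  figma.currentPage.appendChild(rect);
}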

Having to run a separate app to be able to do this is not optimal, but I doubt Figma will ever enable access to the getUserMedia Web API in a plugin, so in the meantime, it was an interesting workaround to figure out!

Comments

  • Ben Halpern · Jan 20, 2022

    Whoa

  • Sherry Day · Jan 20, 2022

    This is so cool

  • Nick Taylor · Jan 21, 2022

    This is so so cool!

  • Vaibhav Khulbe · Jan 21, 2022

    Crazyy!

  • Giacomo Rebonato · Jan 21, 2022

    So cool!

  • charles · Jan 22, 2022

    So cool

  • lepinekong · Jan 24, 2022

    "access to the camera and microphone isn't allowed, for security reasons": yeah that sucks, even for the end user it overcomplicates things. Not sure Figma did it really for security reasons; since they sell audio chat, maybe they don't want a plugin that would do the same ;)

  • david050708 · Jan 26, 2022

    Super project, I love it

  • kamal ganwani · Jan 26, 2022

    the future is here

  • Devluc · Jan 27, 2022

    It's amazing to see you control Figma with hand gestures. Awesome project

  • Thomas Bnt · Mar 3, 2022

    Woaah 😵💪🏼

  • ogbee · Jul 14, 2022

    I wish we had this with the platform that I work with.
