Codethink has been collaborating with Silicon Highway and The Robot Studio to program a fully open-source, standalone robot hand. Our goal has been to produce a short, entertaining demo of the hand showcasing all its features. The hope is that this will help publicise the hand and encourage more people to join the community of robotics development. In the end we decided to create a “mimicking” demo, where the hand copies gestures that it sees through its camera. Here is the story of how we got there!
Background
The Robot Studio is a manufacturer focused on the development of general-purpose humanoid robots. In contrast to the high-precision robotics found in industry, The Robot Studio is keen to make their products as human-like as possible. This makes it easier to integrate them into human-friendly environments and automate tasks therein.
AI will play an important role in achieving this goal. The robots need to be able to respond and adapt to the changing environment around them, just as humans do, rather than having their tasks hard-coded in. As such, Silicon Highway has embedded NVIDIA hardware into the robots, allowing them to use GPU technology to implement AI techniques such as computer vision. Having supported the embedded markets for over 20 years, Silicon Highway brings to the project their understanding of how AI will evolve in the future.
The exciting aspect of making the Robot Nano Hand open-source is its potential to grow and energise the robotics community. The idea is that, by creating a design that anyone can download and print off themselves, the barrier to new developers entering the field of robotics will be greatly reduced.
As a first step towards achieving this goal, the studio has built a standalone robot hand. The hand consists of 3D-printed plastic with servos embedded in it. These servos are rotary actuators able to rotate through a range of roughly 270°. There are two servos per finger and thumb: one for curling it (by pulling on a string “tendon”) and one for wiggling it from side to side. There is also a servo in the wrist allowing the hand to swing back and forth. Finally, there is a camera fixed into the palm.
The whole hand is then mounted onto a base containing an NVIDIA Jetson Nano developer kit, which is a small computing board with a lightweight version of the Tegra X1 SoC. This SoC provides 4 ARM CPUs @ 1.43 GHz and 128 Maxwell-based GPU cores. The goal was to use this Nano to control the servos in the hand. The Nano’s small size means the hand retains its portability while the GPU allows us to implement, for example, computer vision using the camera in the palm.
It’s alive!
Having built the hand, the next task was to write the software to actually control it. Enter Codethink! Our commitment to open-source software meant we were well placed to help with the development of this project. We also have a long history of working with both Jetson hardware and NVIDIA CPUs, from building operating systems for the first devboards to implementing a multi-camera sports tracking system using a TX2 SoC. In fact, we are part of the NVIDIA Jetson Ecosystem.
With regard to programming the hand, the first challenge was to get the Nano talking to the servos. This required communicating with them over the RS485 serial protocol. Thankfully, the servos came with a board containing a CH341 chip, capable of converting the serial signal to USB. Hence, we could plug the servos into the board and then the board into one of the USB ports on the Nano, giving us a working channel of communication between the Nano and the hand.
Having set up this channel, we now needed to send information across it. The goal was to be able to instruct a given servo to rotate to a given position, or to query its current position. Any servo rotation corresponds to a finger curling or wiggling, so such functionality would give us full control of the hand and let us determine whatever pose it was currently in. As a starting point we used an open-source Arduino library (released under Apache License v2.0) written by the servo manufacturer, Feetech. The library provided all the functionality we required. For example, it contained a function that took as input a servo's numerical ID and a position (represented as a number between roughly 300 and 700), and sent the corresponding bits to the servo to rotate it to that position. This was close to what we needed, but the library was intended for use with an Arduino and depended on the Arduino SDK. That was of course not fit for our purposes, as we were using a Jetson Nano, not an Arduino!
Thankfully, the Arduino-dependent code was mostly confined to the part of the library responsible for opening a stream with the USB device (which appears as a device file of the form /dev/ttyUSB0). Using the C POSIX library (in particular <fcntl.h>) we were able to reproduce the stream without the Arduino dependency. With this modification we could now use the original servo library without an Arduino and so fully control the hand - hurray! See below for a video of the hand telling you you cannot do that thing you want to do.
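For illustration, below is a minimal sketch of what that POSIX-based setup can look like; the device path and the 1 Mbps baud rate are assumptions for the example rather than details taken from hamsa itself.

// Illustrative sketch only: open the USB-serial device with the C POSIX API,
// standing in for the stream the Arduino SDK would normally provide.
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>
#include <cstdio>

int open_servo_port(const char *device) // e.g. "/dev/ttyUSB0"
{
    int fd = open(device, O_RDWR | O_NOCTTY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    termios tty{};
    if (tcgetattr(fd, &tty) != 0) {
        perror("tcgetattr");
        close(fd);
        return -1;
    }

    cfmakeraw(&tty);             // raw mode: no echo or line buffering
    cfsetispeed(&tty, B1000000); // assumed baud rate for the servo bus
    cfsetospeed(&tty, B1000000);
    tty.c_cc[VMIN] = 0;          // let read() return after...
    tty.c_cc[VTIME] = 10;        // ...at most one second

    if (tcsetattr(fd, TCSANOW, &tty) != 0) {
        perror("tcsetattr");
        close(fd);
        return -1;
    }

    return fd; // read()/write() on this descriptor now talk to the servos
}

The file descriptor returned here can then be used by the rest of the servo library in place of the Arduino serial stream.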
Next we wanted to adjust the library to make it friendlier for use with the hand. For example, we changed it so that the servo position is specified as a percentage of its range rather than as an absolute value. We also refactored the code so that the servos' numerical IDs are stored internally. This allowed us to define functions with names that reference specific fingers, such as “curl_index” and “wiggle_pinky” - hopefully making the code more readable and intuitive. These functions take as input the position and how long it should take to reach that position (in milliseconds). For example, to rotate the servo used for curling the index finger 70% across its range in 500 milliseconds, one could call: curl_index(0.7, 500)
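As a rough sketch of that mapping (the exact limits and rounding hamsa uses may differ), the 0.0-1.0 fraction is simply scaled onto the absolute position range mentioned above:

// Hypothetical helper showing how a 0.0-1.0 fraction could be scaled onto
// the servo's absolute position range of roughly 300-700.
constexpr int kServoMin = 300;
constexpr int kServoMax = 700;

int position_from_fraction(double fraction)
{
    return kServoMin + static_cast<int>(fraction * (kServoMax - kServoMin));
}

// curl_index(0.7, 500) would then target roughly position 580,
// to be reached over 500 milliseconds.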
Having made these changes, we now just needed to give the library a name. We went with "hamsa", after the religious symbol of a hand with an eye on the palm.
Programming the demo
Having gained control of the hand, we now needed to build the actual demo. As stated earlier, the goal of the demo was to entice new robotics developers to have a go at building the hand. Ideally they would see the demo and think “Oh that’s cool, I wanna give it a go”. Hence it needed to be both entertaining and easy to install for those with limited programming experience.
Thanks to the Nano’s onboard GPU we had scope to implement some machine-learning-based computer vision with the camera. Conveniently, NVIDIA have already developed a series of image-recognition models for real-time body and hand pose estimation. They have been optimised specifically for use on Jetson platforms (using the TensorRT framework) and so provided us with a natural starting point for the demo.
The hand-pose estimation model (called trt_pose_hand) is capable of detecting from a camera feed whether a human hand is in one of five poses:
- Open palm 🖐️
- Index pointing ☝️
- Peace ✌️
- OK 👌
- Fist ✊
The robot hand is capable of performing all of these poses, so we had the idea of having it mimic them whenever the trt_pose_hand model detects them through the camera feed. The peace gesture is particularly useful as it lets us demonstrate the ‘wiggle’ servos. These servos allow us to separate the index and middle fingers so that the gesture actually looks like peace and not just upwards gun fingers.
trt_pose_hand is based on PyTorch and so uses a Python interface. Hence, the first step was to write Python bindings for the C++ hamsa library. This was relatively simple with pybind11. For example, to create bindings for the functions mentioned above, curl_index and wiggle_pinky, we used C++ code of the following form:
#include <pybind11/pybind11.h>

#include "Hand.h"

namespace py = pybind11;

// Expose the Hand class, and a couple of its finger functions, to Python
// as a module called "firmware".
PYBIND11_MODULE(firmware, m) {
    py::class_<Hand>(m, "Hand")
        .def(py::init<>())
        .def("wiggle_pinky", &Hand::wiggle_pinky)
        .def("curl_index", &Hand::curl_index);
}
Here firmware is the name of the Python module being created and Hand is the class to which the wiggle and curl functions belong. See src/binding/binder.cpp for the full set of bindings (which also include the functions for querying servo positions).
Next we refactored the demo Jupyter notebook from trt_pose_hand so that it could be imported as a module into our own code. With this and the bindings in place we could now create a program that waits for trt_pose_hand to detect a pose and then sends the appropriate signals to the robot hand so that it mimics said pose.
And thus the demo was built! However, we noticed quite quickly that the performance of the image recognition could be a bit temperamental. In particular, it often struggled to distinguish between ‘peace’ and ‘index pointing’. Cue a lot of scenes such as this:
With a bit of experimenting we found that lighting had the dominant effect on performance. If, for example, the human hand is backlit by a strong light source such as a window, trt_pose_hand can really struggle. In contrast, if the hand is frontlit by the same light source, the accuracy of the pose estimation is near flawless. See below for an example of the demo in such conditions:
Future work
The hamsa library is fully open source (under Apache License, Version 2.0) and can be found here. Anyone is free to download it and start experimenting with their own robot hand. The library is divided into two modules:
- the firmware required for controlling the hand
- the software required for running the demo.
With both of these you can install and run the mimicking demo on your own hand. There is also the potential to customise the demo so that, for example, it uses a model that can detect a wider range of gestures or so that the hand responds differently to the gestures that it does detect.
Having said that, you are not at all limited to working with our demo. The firmware can be run independently of the demo and gives full control of the robot hand, allowing you to do whatever you want with it!
Learn more about the Robot Nano Hand project here >>