Bring fingertip precision to your edge AI projects with real-time hand pose estimation on the powerful Luxonis OAK 4 (RVC4) camera. This setup tracks 2D hand landmarks from a live camera stream and visualizes them in your browser.
Whether you’re building gesture-based controls, sign language interpreters, or immersive AR experiences, this example shows how to get started with minimal setup and maximum results.
EXAMPLE
Here’s a short video of the Hand Pose Estimation system running live on the Luxonis OAK 4 (RVC4):
Real-time hand landmark tracking (21 points per hand)
Seamless web visualization in the browser
Easy to integrate into your apps or UI overlays
Low latency and all processing on-device
SETUP
1. Clone the example
This uses Git to clone the necessary files from the Luxonis examples repo:
git clone https://github.com/luxonis/oak-examples.git
cd oak-examples/neural-networks/pose-estimation/hand-pose
2. Create and activate a Python virtual environment
python3 -m venv venv
source venv/bin/activate
3. Install Python dependencies
Make sure you’re in the hand-pose directory, then install the requirements:
pip install -r requirements.txt
4. Connect to the OAK 4 camera
Make sure your OAK 4 PoE device is powered and connected to the same network as your computer.
If you haven’t already, install oakctl (the Luxonis command-line tool).
Connect to your camera:
oakctl list
oakctl connect <DEVICE_IP>
Run the app on device:
oakctl app run .
This starts the hand pose model on the camera and launches a web server on the device.
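If the browser view does not load, it helps to confirm that the viewer port is reachable from your computer. The snippet below is a minimal sketch using only the Python standard library. The function name viewer_reachable is made up for illustration, and the default port 8082 is the one used in this guide; adjust it if your setup differs.

```python
import socket


def viewer_reachable(host: str, port: int = 8082, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds.

    Hypothetical helper: it only checks that something is listening on the
    port, not that the hand pose app itself is healthy.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes it again right away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Call it with the address reported by oakctl list, for example viewer_reachable("192.168.1.50") where the IP is a placeholder. A False result usually points to a network problem or an app that has not started yet.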
USAGE
Once the app is running, open your browser and go to:
http://<DEVICE_IP>:8082
You’ll see the live camera feed with:
Detected hands outlined in real time
21 landmark points tracked and updated on every frame
Fast feedback for development and testing
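If you want to draw the landmarks in your own overlay instead of the built-in viewer, you will typically need them in pixel coordinates. The sketch below assumes each landmark arrives as a normalized (x, y) pair in the range 0 to 1 relative to the frame; that format is an assumption, so check what the example actually emits before relying on it. The function name landmarks_to_pixels is illustrative only.

```python
from typing import List, Sequence, Tuple


def landmarks_to_pixels(
    landmarks: Sequence[Tuple[float, float]],
    width: int,
    height: int,
) -> List[Tuple[int, int]]:
    """Scale normalized (x, y) landmarks to integer pixel coordinates.

    Assumes x and y are fractions of the frame width and height.
    Values slightly outside [0, 1] are clamped to the frame edges.
    """
    pixels = []
    for x, y in landmarks:
        px = min(max(int(round(x * width)), 0), width - 1)
        py = min(max(int(round(y * height)), 0), height - 1)
        pixels.append((px, py))
    return pixels
```

For a 640x480 frame, a landmark at (0.5, 0.5) maps to pixel (320, 240), which you can pass straight to whatever drawing routine your UI uses.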
WRAP-UP
With just a few commands and Luxonis DepthAI, you can build intuitive gesture-based interfaces and explore human-computer interaction with ease. All inference runs locally on the OAK 4, so no frames make a round trip to the cloud and you keep full control at the edge.
Let your applications respond to the wave of a hand, or even individual finger movements.
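As a starting point for reacting to finger movements, here is a minimal sketch of pinch detection from the 21 landmarks. It assumes the landmarks follow the MediaPipe hand ordering (0 = wrist, 4 = thumb tip, 8 = index fingertip, 9 = middle finger base); whether this example emits them in that order is an assumption you should verify. The function name is_pinching and the 0.3 threshold are illustrative choices, not values taken from the example.

```python
import math
from typing import Sequence, Tuple

# Landmark indices, assuming MediaPipe hand landmark ordering.
WRIST = 0
THUMB_TIP = 4
INDEX_TIP = 8
MIDDLE_MCP = 9


def is_pinching(
    landmarks: Sequence[Tuple[float, float]],
    threshold: float = 0.3,
) -> bool:
    """Return True when the thumb tip and index fingertip are close together.

    The tip distance is divided by the wrist-to-middle-finger-base distance,
    so the result does not depend on how far the hand is from the camera.
    """
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    hand_size = math.dist(landmarks[WRIST], landmarks[MIDDLE_MCP])
    if hand_size == 0:
        # Degenerate detection; treat as no gesture.
        return False
    tip_distance = math.dist(landmarks[THUMB_TIP], landmarks[INDEX_TIP])
    return tip_distance / hand_size < threshold
```

Normalizing by hand size is one simple design choice. In practice you would probably tune the threshold and smooth the result over a few frames to avoid flicker.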
TROUBLESHOOTING