Building a desktop robotics research setup
dfdx labs, independent R&D company
By Matthias Plappert
June 17, 2026 · 28 min read
The finished setup in action: setting up a chessboard via teleoperation. Visible are the different camera feeds, the human operator, and the sensed robot state; the operator can switch between cameras.
Robotics research has become cheap and accessible enough that small teams, and even individuals, can now do meaningful research on real hardware. There are two reasons for this.
First, capable robot hardware has become dramatically more affordable: the physical setup below uses an industrial-grade arm, two cameras, and a full teleoperation setup while staying below €5,000. (This figure excludes VAT and the cost of compute.)
Second, there is now a steady supply of publicly available foundation models that are suitable for robotics. Hugging Face’s LeRobot, for example, is built around the same idea of democratizing state-of-the-art robotics research.
I have some history with this. Between 2017 and 2020, I did robotic manipulation research at OpenAI, first on a humanoid hand and then on a tabletop. The tabletop setup I worked with around 2019/2020 was roughly an order of magnitude more expensive than the one described here. The comparison is not perfect, but the fact that this version is even in the same category of usefulness at this price point is the important change. Back then, this kind of work required a team of around 20 people. If my thesis is right, a single person at a desk should be able to get surprisingly far today.
So, to test this thesis, I’ve decided to just do it: I will spend the next several months doing independent research on robotic manipulation, and I will do it in the open. I don’t expect the main output to be papers or an open-source codebase. (I currently don’t plan to open-source the code described here. Maintaining an open-source project is real work, and I’d rather spend that time on research. This might change.) What I care about here is the research log itself: what works, what fails, and what I learn from running the system.
This note covers step one: building the full foundation for doing research. The first half is about the physical setup: an industrial-grade robot arm, two cameras, and teleoperation in a package small enough to live next to my desk. The second half is about the software stack I wrote from scratch to operate it. The video above shows the result in action.
This is an experiment and the plan might change. But I’m excited.
Requirements
From past experience, I know that robotics research should be done on actual hardware, so step one is building a setup that I can experiment on. Before buying anything, I wrote down a few requirements. They apply to the system as a whole, meaning both the physical setup and the software that operates it:
Less than €10,000
Small enough that I can put it on or next to my desk
Parts readily available (no enterprise sales)
Easy to use via Python
Unopinionated about the software stack (since I want to build my own)
The €10,000 limit was not derived from a detailed estimate. At the time, I mostly did not know what the final system would cost. The number was useful as a ceiling: high enough that I would not have to optimize every component for price, but low enough that the setup would still be affordable at my scale.
These five constraints explain most of the decisions in the rest of this post.
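As a rough illustration, the five requirements can be written down as a checklist that a candidate setup either passes or fails. This is only a sketch: the `Candidate` class, its field names, and `unmet_requirements` are hypothetical and not part of the software stack described in this post. Only the €10,000 ceiling comes from the list above.

```python
from dataclasses import dataclass

BUDGET_EUR = 10_000  # ceiling from the requirements list


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of a candidate setup (illustrative fields only)."""
    name: str
    cost_eur: float            # total cost of the setup
    fits_at_desk: bool         # small enough to sit on or next to a desk
    readily_available: bool    # parts can be bought without enterprise sales
    python_friendly: bool      # easy to use via Python
    unopinionated: bool        # does not force a particular software stack


def unmet_requirements(c: Candidate, budget_eur: float = BUDGET_EUR) -> list[str]:
    """Return the requirements the candidate fails, in the order listed above."""
    checks = [
        (c.cost_eur < budget_eur, "cost below budget"),
        (c.fits_at_desk, "fits on or next to the desk"),
        (c.readily_available, "parts readily available"),
        (c.python_friendly, "easy to use via Python"),
        (c.unopinionated, "unopinionated about the software stack"),
    ]
    return [label for ok, label in checks if not ok]
```

A candidate that satisfies everything returns an empty list. Anything else returns the names of the failed requirements, which makes it easy to see why an option was ruled out.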
Physical setup
I decided to build a setup for tabletop manipulation with a single arm. Tabletop manipulation is nice because it offers endless tasks of varying difficulty: for example, you can start with a basic single-object pick-and-place task but gradually move towards setting up a chessboard or assembling Lego, all within the same physical setup. (We had the same reasoning 6 years ago on the OpenAI Robotics team. After solving the Rubik’s cube, we moved towards a tabletop setup because it can support so many different tasks, and we were interested in general-purpose robotics.)
I opted for a single robot arm instead of a bimanual setup for simplicity, space, and cost reasons. This choice, however, imposes some real limitations on what types of tasks I can do: for example, folding a shirt with a single arm is probably impossible. But a single arm still leaves plenty of interesting tabletop tasks, and it forces a useful kind of constraint: the policy has to compensate for missing hardware with behavior. It can push an object against another object or the table edge to hold it in place, reposition something before grasping it, or use the environment as part of the manipulation strategy. For now, that is exactly the regime I want to study.
For vision, I use a wrist-mounted camera and a stationary camera. A constraint I have here is space: I cannot build a fully integrated “robot cage” lab setup, which means that the positions of the cameras, the lighting...