DAIMON Robotics Wants to Give Robot Hands a Sense of Touch
Revolutionizing Robot Hands: DAIMON Robotics' Vision-Based Tactile Sensing
Hong Kong-based DAIMON Robotics has released Daimon-Infinity, the world's largest omni-modal robotic dataset for physical AI. The dataset features high-resolution tactile sensing and spans a wide range of tasks, from folding laundry at home to manufacturing on factory assembly lines. The project is supported by partners across China and around the world, including Google DeepMind, Northwestern University, and the National University of Singapore.
The Dataset Initiative: A Game-Changer for Embodied AI
DAIMON Robotics has been around for almost two and a half years, committed to developing high-resolution, multimodal tactile sensing devices that perceive the interaction between a robot's hand (particularly its fingertips) and objects. Their devices have become quite robust and are used by a large segment of users, including academic and research institutes as well as leading humanoid robotics companies. As embodied AI continues to advance, the critical role of data has become clearer. Data scarcity remains a primary bottleneck in robot learning, particularly the lack of physical interaction data, which is essential for robots to operate effectively in the real world.
Building Large-Scale Robot Manipulation Datasets
DAIMON excels in high-quality, multimodal tactile data that captures deformation, slip and friction, material properties, and surface textures, enabling a comprehensive reconstruction of physical interactions. Building on their expertise in multimodal fusion, they have developed a robust data processing pipeline that integrates tactile feedback with vision, motion trajectories, and natural language, transforming raw inputs into training-ready datasets for machine learning models. Recognizing the industry-wide data gap, DAIMON views large-scale data collection not only as their unique competitive advantage, but also as a responsibility to the broader community.
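To make the idea of fusing modalities concrete, here is a minimal sketch of one common preprocessing step: aligning streams that were recorded at different rates by matching each camera frame to the nearest tactile and joint-state reading. This is not DAIMON's actual pipeline, which the article does not describe in detail. The `Sample` layout, the `align` function, and the use of integer millisecond timestamps are all assumptions made for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Sample:
    """One training-ready record (hypothetical layout, not DAIMON's format)."""
    t_ms: int          # camera timestamp in milliseconds
    camera_idx: int    # index into the camera stream
    tactile_idx: int   # index of the nearest tactile frame
    joint_idx: int     # index of the nearest joint-state reading
    instruction: str   # natural-language task description


def nearest_index(timestamps: Sequence[int], t: int) -> int:
    """Return the index of the timestamp closest to t.

    Assumes `timestamps` is non-empty and sorted in ascending order.
    """
    i = bisect_left(timestamps, t)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    # Pick whichever neighbour is closer; ties go to the earlier reading.
    return i if timestamps[i] - t < t - timestamps[i - 1] else i - 1


def align(camera_ts: Sequence[int], tactile_ts: Sequence[int],
          joint_ts: Sequence[int], instruction: str) -> List[Sample]:
    """Build one Sample per camera frame, using the camera as the reference clock."""
    return [
        Sample(t, k, nearest_index(tactile_ts, t),
               nearest_index(joint_ts, t), instruction)
        for k, t in enumerate(camera_ts)
    ]
```

Using the camera as the reference clock is only one possible choice; a pipeline built around a faster tactile stream might instead resample the other modalities onto the tactile timestamps.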
From VLA to VTLA: The Importance of Tactile Sensing
The mainstream paradigm in robotics is currently the Vision-Language-Action (VLA) model, but DAIMON's team has proposed a Vision-Tactile-Language-Action (VTLA) model. The added tactile modality is what enables robots to achieve robust manipulation capabilities and evolve into reliable partners for humans. Without tactile sensing, robots are severely limited: they struggle to locate objects in dark environments, and without slip detection they can easily drop fragile items like glass.
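As a rough illustration of why slip detection depends on touch, the sketch below applies the textbook Coulomb friction condition: a grasped object starts to slide once the shear (tangential) force at the contact approaches the friction coefficient times the normal force. This is a generic heuristic, not DAIMON's method. The function name `slip_risk`, the default friction coefficient, and the safety margin are assumed values chosen only for the example.

```python
import math


def slip_risk(normal_force: float, shear_x: float, shear_y: float,
              friction_coeff: float = 0.5, margin: float = 0.8) -> bool:
    """Flag likely slip when shear force nears the Coulomb friction limit.

    friction_coeff and margin are illustrative assumptions; real values
    depend on the materials in contact and must be measured or estimated.
    """
    if normal_force <= 0.0:
        # No contact pressure means there is no grip to resist sliding.
        return True
    shear = math.hypot(shear_x, shear_y)
    # Trigger before the theoretical limit so a controller has time to react.
    return shear >= margin * friction_coeff * normal_force
```

A gripper controller could respond to a `True` result by increasing its grip force slightly, which raises the friction limit. None of these quantities can be measured by a camera alone, which is the gap the VTLA proposal aims to close.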
Monochromatic Vision-Based Tactile Sensing: A Technical Breakthrough
DAIMON's team has spent many years deeply engaged in vision-based tactile sensing and has developed the world's first monochromatic vision-based tactile sensing technology. This technology closely mimics the sensing we have under our fingertip skin, which tells us what we are touching, what kind of material it is, how forces are distributed, and whether the object is moving into the right position as our brain controls our hands. The key features of their sensors are the density of distributed force measurement and the deformation they can capture over the area of a fingertip.
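To show what a distributed force measurement offers over a single force reading, the following sketch summarizes a grid of per-cell normal forces as a total force and a center of pressure. The grid representation and the `contact_summary` function are hypothetical; the article does not specify the output format of DAIMON's sensors.

```python
from typing import List, Tuple


def contact_summary(grid: List[List[float]]) -> Tuple[float, float, float]:
    """Return (total_force, row_centroid, col_centroid) for a force grid.

    `grid` is a hypothetical 2D array of normal forces, one value per
    sensing cell on the fingertip. Centroids are in cell-index units.
    """
    total = sum(sum(row) for row in grid)
    if total <= 0.0:
        raise ValueError("no contact force in grid")
    # Force-weighted average of the row and column indices.
    row_c = sum(i * f for i, row in enumerate(grid) for f in row) / total
    col_c = sum(j * f for row in grid for j, f in enumerate(row)) / total
    return total, row_c, col_c
```

Tracking how the center of pressure shifts between frames is one simple way a controller could tell whether an object is moving across the fingertip, something a single scalar force value cannot reveal.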
Business Model and Commercial Strategy
DAIMON's business strategy is best described as "3D": Devices, Data, and Deployment. They build devices for data collection, for their own ecosystem, and for deployment in their partners' potential application domains. This enables the collection of real-world, tactile-rich data and complete closed-loop validation, both of which are integral to the 3D business model. Most startups in this space are following a similar path, though some may eventually become more specialized or more tightly integrated with other companies.
Embodied Skills and the Convergence Moment
DAIMON's team has introduced the concept of "embodied skills" as essential for humanoid robots to move beyond having just an advanced AI "brain." This is prompted by the rapid evolution of models and hardware over the past two years, during which electrical, electronic, and mechatronic hardware technologies have advanced tremendously. Robots are now fully electric and no longer require hydraulics. Modern electronics provide tremendous bandwidth along with high torques. If we can build intelligence into these systems, we can create truly humanoid robots with the ability to operate in unstructured environments, make decisions, and take actions autonomously.
The Road to Real-World Deployment
The road toward large-scale deployment of generalist robots is still long, but we are starting to see signs of feasibility within specific domains. The situation is very similar to autonomous vehicles: we have yet to see full deployment of robo-taxis, but mobile robots and smaller vehicles are already widely deployed in the hospitality industry. Virtually every major hotel in China now has a delivery robot with no arms, just a vehicle that picks up items (e.g., food deliveries) from the hotel lobby. The delivery person simply loads the food and selects the room number. From there, it is up to the robot to navigate to the guest's room, including using the elevator, and deliver the food.
Conclusion
DAIMON Robotics' Daimon-Infinity dataset gives the field a comprehensive, high-resolution tactile sensing dataset that can be used to train and deploy robots in a wide range of tasks. The company's vision-based tactile sensing technology could enable robots to achieve robust manipulation capabilities and evolve into reliable partners for humans. As the field of robotics continues to evolve, we can expect to see more robots being deployed in real-world applications, from hotels and restaurants to convenience stores and hospitals.
Source: https://spectrum.ieee.org/daimon-robotics-physical-ai




