The ability of robots to handle deformable objects such as wires, clothing, and rubber bands has long been regarded as a key challenge in the automation of manufacturing and service industries. However, because such objects have no fixed shape and their motion is hard to predict, robots have struggled to recognize and manipulate them accurately. KAIST researchers have developed a robot technology that can precisely grasp the state of deformable objects and handle them skillfully even with incomplete visual information. This achievement is expected to contribute to intelligent automation across industrial and service fields, including cable and wire assembly, manufacturing involving soft components, and clothing organization and packaging.
KAIST (President Kwang Hyung Lee) announced on August 21 that a research team led by Professor Daehyung Park of the School of Computing has developed an artificial intelligence technology called “INR-DOM (Implicit Neural-Representation for Deformable Object Manipulation),” which enables robots to skillfully handle objects such as elastic bands, whose shapes change continuously and whose parts are visually difficult to tell apart.
Professor Park’s research team developed a technology that allows a robot to reconstruct the complete shape of a deformable object from partially observed three-dimensional information and to learn manipulation strategies based on that reconstruction. The team also introduced a new two-stage learning framework that combines reinforcement learning and contrastive learning so that robots can learn specific tasks efficiently. The trained controller achieved significantly higher task success rates than existing technologies in simulation, and in real-robot experiments it demonstrated a high level of manipulation capability, such as untangling intricately entangled rubber bands, greatly expanding the applicability of robots to deformable-object handling.
Deformable Object Manipulation (DOM) is one of the long-standing challenges in robotics. This is because deformable objects have infinite degrees of freedom, making their movements difficult to predict, and the phenomenon of self-occlusion, in which the object hides parts of itself, makes it difficult for robots to grasp their overall state.
To address these problems, methods for representing the states of deformable objects and reinforcement-learning-based control techniques have been widely studied. However, existing representation methods could not accurately capture the continuously deforming surfaces or complex three-dimensional structures of such objects, and because state representation was learned separately from reinforcement learning, it was difficult to construct a state-representation space well suited to manipulation.
To overcome these limitations, the research team utilized an “Implicit Neural Representation.” This technique takes the partial three-dimensional information (point cloud*) observed by the robot and reconstructs the object’s overall shape, including unseen parts, as a continuous surface (a signed distance function, SDF). This lets the robot imagine and understand the full shape of the object much as a human does.
*Point cloud: a way of representing the three-dimensional shape of an object as a “set of points” on its surface.
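For readers who want a concrete picture, the sketch below illustrates the general idea in Python/PyTorch. It is only a minimal illustration under assumed names and layer sizes, not the research team’s implementation: a small encoder summarizes a partial point cloud into a latent vector, and a decoder predicts a signed distance for any 3D query point, so that locations where the predicted distance is zero trace out the reconstructed surface, including parts the camera never saw.

```python
# Minimal illustrative sketch (not the authors' code): an implicit neural
# representation that maps a latent code summarizing a partial point cloud,
# plus a 3D query point, to a signed distance. Layer sizes are assumptions.
import torch
import torch.nn as nn

class PointCloudEncoder(nn.Module):
    """Encodes a partial point cloud (B x N x 3) into a fixed-size latent vector."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )

    def forward(self, points):               # points: (B, N, 3)
        feats = self.point_mlp(points)        # per-point features: (B, N, latent_dim)
        return feats.max(dim=1).values        # max-pool over points -> (B, latent_dim)

class SDFDecoder(nn.Module):
    """Predicts the signed distance of each query point, conditioned on the latent code."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(latent_dim + 3, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, latent, queries):       # latent: (B, D), queries: (B, M, 3)
        z = latent.unsqueeze(1).expand(-1, queries.shape[1], -1)
        return self.mlp(torch.cat([z, queries], dim=-1)).squeeze(-1)  # (B, M)

# Usage: signed distances near zero trace out the reconstructed surface.
encoder, decoder = PointCloudEncoder(), SDFDecoder()
partial_cloud = torch.randn(1, 1024, 3)       # one partially observed point cloud
query_points = torch.randn(1, 4096, 3)        # query locations in the workspace
sdf_values = decoder(encoder(partial_cloud), query_points)
```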
Furthermore, the research team introduced a two-stage learning framework. In the first, pre-training stage, a model is trained to reconstruct the complete shape from incomplete point-cloud data, yielding a state-representation module that is robust to occlusion and can faithfully represent the surfaces of stretching objects. In the second, fine-tuning stage, reinforcement learning and contrastive learning are used together to optimize the control policy and the state-representation module, so that the robot can clearly distinguish subtle differences between the current state and the goal state and efficiently find the actions required to complete the task.
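As a rough illustration of the second stage (again a hedged sketch under assumed names, not the authors’ training code), a contrastive term of the InfoNCE type can be added to the reinforcement-learning objective so that the embedding of the current object state is pulled toward the embedding of the matching goal state and pushed away from embeddings of unrelated states:

```python
# Illustrative sketch only (assumed names and weighting, not the authors' code):
# an InfoNCE-style contrastive loss that can be combined with an RL loss
# during fine-tuning of the policy and state-representation module.
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """Pull anchor embeddings toward their positives, push them away from negatives."""
    anchor = F.normalize(anchor, dim=-1)        # (B, D) current-state embeddings
    positive = F.normalize(positive, dim=-1)    # (B, D) matching goal-state embeddings
    negatives = F.normalize(negatives, dim=-1)  # (B, K, D) unrelated-state embeddings
    pos_logit = (anchor * positive).sum(-1, keepdim=True) / temperature       # (B, 1)
    neg_logits = torch.einsum("bd,bkd->bk", anchor, negatives) / temperature  # (B, K)
    logits = torch.cat([pos_logit, neg_logits], dim=1)
    labels = torch.zeros(anchor.shape[0], dtype=torch.long)  # positive sits at index 0
    return F.cross_entropy(logits, labels)

# During fine-tuning, the total objective could weight the two terms, e.g.
#   total_loss = rl_loss + contrastive_weight * contrastive_loss(z_t, z_goal, z_others)
# where z_t, z_goal, z_others come from the pre-trained representation module.
```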
When the INR-DOM technology was mounted on a robot and tested, it achieved far higher success rates than the best existing methods in three complex simulated tasks: inserting a rubber ring into a groove (sealing), installing an O-ring onto a part (installation), and untying tangled rubber bands (disentanglement). In the most challenging task, disentanglement, the success rate reached 75%, about 49 percentage points higher than the best existing method (ACID, 26%).
The research team also verified that INR-DOM works in real environments by combining it with sample-efficient robotic reinforcement learning and training directly in the real world.
As a result, the robot performed the insertion, installation, and disentanglement tasks in real environments with success rates of over 90%; in particular, in the visually challenging bidirectional disentanglement task, it achieved a success rate 25% higher than existing image-based reinforcement-learning methods, showing that robust manipulation is possible despite visual ambiguity.
Minseok Song, a master’s student and first author of the study, said, “This research has shown that robots can understand the overall shape of deformable objects even from incomplete information and perform complex manipulation based on that understanding.” He added, “It will contribute greatly to advancing robot technologies that perform sophisticated tasks in cooperation with humans, or in their place, in fields such as manufacturing, logistics, and medicine.”
This study, with KAIST School of Computing master’s student Minseok Song as first author, was presented at the top international robotics conference, Robotics: Science and Systems (RSS) 2025, held June 21–25 at USC in Los Angeles.
※ Paper title: “Implicit Neural-Representation Learning for Elastic Deformable-Object Manipulations”
※ Paper links: https://www.roboticsproceedings.org/ (proceedings version to be released); preprint currently at https://arxiv.org/abs/2505.00500
This research was supported by the Ministry of Science and ICT through the Institute of Information & Communications Technology Planning & Evaluation (IITP)’s projects “Core Software Technology Development for Complex-Intelligence Autonomous Agents” (RS-2024-00336738; Development of Mission Execution Procedure Generation Technology for Autonomous Agents’ Complex Task Autonomy), “Core Technology Development for Human-Centered Artificial Intelligence” (RS-2022-II220311; Goal-Oriented Reinforcement Learning Technology for Multi-Contact Robot Manipulation of Everyday Objects), “Core Computing Technology” (RS-2024-00509279; Global AI Frontier Lab), as well as support from Samsung Electronics. More details can be found at https://inr-dom.github.io.