Automation Research Team


Human-robot collaboration and co-evolution

Human-robot collaboration system based on a digital twin

Japan’s population distribution is changing drastically. The working-age population (ages 15-64) is aging into the 65+ group, while the young population (ages 0-14) is shrinking. This threatens Japan’s ability to maintain its labor productivity. The problem is not unique to Japan; many other countries will face the same crisis. To tackle it, we propose reconciling an “easier to work” environment with productivity through human-robot mutual aid based on a human digital twin. We collaborate with the Digital Human Research Team (DHRT), whose platform “DHaibaworks” can obtain and display a worker’s workload in real time. We developed a digital twin system that bridges DHaibaworks and ROS, and applied it to an industrial parts-picking task in special collaboration with Toyota Motor Corporation. The robot understands the human’s workload and takes over the picking tasks that would impose a high workload on the human. As a result, productivity increased by 10-15% and human workload decreased by 10%. This is an important approach for aging countries, and it also helps bridge the gap between industrial productivity and human diversity.
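
The idea of handing high-workload picks to the robot can be illustrated with a minimal sketch. It is not the allocation logic of the actual system: the `PickTask` type, the normalized workload scores, and the threshold of 0.6 are all hypothetical values chosen for illustration, standing in for the workload estimates that a digital human platform would supply.

```python
from dataclasses import dataclass


@dataclass
class PickTask:
    part_id: str
    human_workload: float  # hypothetical normalized workload estimate in [0, 1]


def allocate(tasks, workload_threshold=0.6):
    """Send tasks whose estimated human workload exceeds the threshold to the robot."""
    robot, human = [], []
    for task in tasks:
        if task.human_workload > workload_threshold:
            robot.append(task.part_id)
        else:
            human.append(task.part_id)
    return {"robot": robot, "human": human}


# Hypothetical tasks: reaching low or overhead is assumed to be more demanding.
tasks = [
    PickTask("low_shelf_bolt", 0.85),
    PickTask("bench_washer", 0.20),
    PickTask("overhead_bracket", 0.70),
]
assignment = allocate(tasks)
print(assignment)
```

In this toy setting the robot takes the two demanding picks and the human keeps the easy one; a real system would update the workload estimates continuously rather than use fixed scores.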

[1] Maruyama, T.; Ueshiba, T.; Tada, M.; Toda, H.; Endo, Y.; Domae, Y.; Nakabo, Y.; Mori, T.; Suita, K. Digital Twin-Driven Human Robot Collaboration Using a Digital Human. Sensors 2021, 21, 8266.

Hand Activity Detection for Human-Machine Collaboration

In recent years, it has become increasingly difficult to secure young people and skilled workers, yet many technical challenges still prevent most manufacturing processes from being fully automated. This research therefore focuses on assembly tasks performed jointly by humans and robots, and aims to construct a human-robot collaborative system in which the robot provides appropriate assistance according to the worker's situation. For this purpose, the robot needs to know what the human worker is doing. We present a novel software architecture that recognizes assembly actions from fine-grained hand motions. Unlike previous works that require people to wear ad-hoc devices or visual markers on the body, our approach lets users move without additional burdens. The developed modules can: (i) reconstruct the 3D motions of body and hand keypoints using multi-camera systems; (ii) recognize the objects manipulated by humans; and (iii) analyze the relationship between the human motions and the manipulated objects. We implement different solutions based on OpenPose and MediaPipe for body and hand keypoint detection, and discuss their suitability for real-time data processing. We also propose a novel method that uses Long Short-Term Memory (LSTM) deep neural networks to analyze the relationship between the detected human motions and the manipulated objects. Experimental validation shows that the proposed approach outperforms previous works based on Hidden Markov Models (HMMs).
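
Step (i), recovering a 3D keypoint from several camera views, can be sketched with standard linear (DLT) triangulation. This is a generic textbook technique shown under stated assumptions, not necessarily the method used in the cited papers: the camera intrinsics `K`, the 0.5 m baseline, and the `wrist` position are hypothetical, and the 2D detections are synthesized by projection instead of coming from OpenPose or MediaPipe.

```python
import numpy as np


def triangulate(projections, pixels):
    """Linear (DLT) triangulation of one keypoint seen by two or more calibrated cameras.

    projections: list of 3x4 camera projection matrices
    pixels: list of (u, v) detections of the same keypoint, one per camera
    """
    rows = []
    for P, (u, v) in zip(projections, pixels):
        # Each view contributes two linear constraints on the homogeneous 3D point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]


def project(P, X):
    """Project a 3D point to pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]


# Two hypothetical cameras sharing intrinsics K, the second shifted 0.5 m along x.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

wrist = np.array([0.1, -0.05, 1.5])  # hypothetical ground-truth keypoint, in meters
estimate = triangulate([P1, P2], [project(P1, wrist), project(P2, wrist)])
print(estimate)
```

With noise-free detections the estimate matches the true point up to numerical precision; with real detections, adding more cameras to the two lists gives a least-squares estimate that is more robust to noise in any single view.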

[1] K. Fukuda, I. G. Ramirez-Alpizar, N. Yamanobe, D. Petit, K. Nagata, and K. Harada. Recognition of Assembly Tasks Based on the Actions Associated to the Manipulated Objects. In 2019 IEEE/SICE International Symposium on System Integration (SII), pp. 193–198, Paris, France, 2019.
[2] K. Fukuda, N. Yamanobe, I. G. Ramirez-Alpizar, and K. Harada. Assembly Motion Recognition Framework Using Only Images. In 2020 IEEE/SICE International Symposium on System Integration (SII), pp. 1242–1247, Honolulu, Hawaii, USA, 2020.
[3] E. Coronado, K. Fukuda, I. G. Ramirez-Alpizar, N. Yamanobe, G. Venture, and K. Harada. Assembly Action Understanding from Fine-Grained Hand Motions, a Multi-camera and Deep Learning Approach. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 2605–2611, Prague, Czech Republic, 2021.

Experience Augmentation

Hard-to-obtain Cross-Modal Inference and its Practical Application

Based on experience and current sensations such as vision, humans can make inferences about states that are difficult to measure directly and handle objects skillfully. For example, when we see an object, we can predict its stiffness. When we see a scene with objects piled up, we can roughly predict the forces acting on each object. If a robot can make such cross-modal inferences, it can perform more advanced tasks. The key idea for giving robots such capabilities is the use of simulation. Simulation allows us to generate information in other modalities that is not usually available in the real world. We call this methodology of augmenting machine intelligence based on experiences that are not available in reality “experience augmentation”. Our research team aims to train models that perform cross-modal inference using this experience augmentation and then generalize the models so that they apply to real-world environments, leading to advanced object manipulation. As specific results, by estimating the stiffness of objects in a scene from vision, we realized picking operations with small object deformations and picking operations that actively utilize deformations. We are also working on picking operations that avoid applying large forces to surrounding objects by estimating the forces acting on piled objects.
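
As a toy illustration of how a vision-based stiffness estimate could feed grasp selection, the sketch below picks the stiffest pixel on the target object, which corresponds to the "small deformation" case. It is a simplification under stated assumptions and not the pix2stiffness pipeline itself: the function `select_grasp_pixel`, the 3x3 stiffness map, and the object mask are hypothetical, and the map is assumed to have already been predicted by a model trained in simulation.

```python
import numpy as np


def select_grasp_pixel(stiffness_map, object_mask):
    """Return the (row, col) of the stiffest pixel inside the target object mask."""
    # Pixels outside the target object are excluded from the search.
    masked = np.where(object_mask, stiffness_map, -np.inf)
    return np.unravel_index(np.argmax(masked), stiffness_map.shape)


# Hypothetical per-pixel stiffness prediction (higher means stiffer).
stiffness = np.array([
    [0.2, 0.9, 0.4],
    [0.1, 0.7, 0.8],
    [0.3, 0.5, 0.6],
])
# Hypothetical segmentation of the object to be picked.
mask = np.array([
    [False, False, False],
    [True, True, True],
    [True, True, False],
])

row, col = select_grasp_pixel(stiffness, mask)
print(row, col)
```

The globally stiffest pixel (value 0.9) lies outside the mask and is ignored. A strategy that actively exploits deformation would instead search for soft regions, for example by replacing `argmax` with `argmin` and masking with `+np.inf`.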

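[1] Koshi Makihara, Yukiyasu Domae, Ixchel G. Ramirez-Alpizar, and Toshio Ueshiba. ''Grasp position detection for deformable objects using pix2stiffness'' (in Japanese). SSII2021 (online), June 2021.
[2] Koshi Makihara, Yukiyasu Domae, Hirokatsu Kataoka, Ixchel G. Ramirez-Alpizar, and Kensuke Harada. ''Grasp position detection based on estimating object softness from appearance'' (in Japanese). SICE SI2021 (online), Dec. 2021.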
[3] Koshi Makihara, Yukiyasu Domae, Ixchel G. Ramirez-Alpizar, Toshio Ueshiba and Kensuke Harada. ''Grasp pose detection for deformable daily items by pix2stiffness estimation.'' Advanced Robotics, 36:12, 600-610, 2022.
[4] Ryo Hanai, Yukiyasu Domae, Ixchel G. Ramirez-Alpizar, Bruno Leme and Tetsuya Ogata. ''Force Map: Learning to Predict Contact Force Distribution from Vision.'' arXiv preprint arXiv:2304.05803, 2023.

Robotic automation of difficult tasks

Robot learning of difficult industrial tasks using only simulation data, without real experimental data

The cost of training is too high for robot learning to be applied in real industrial settings. Our team tackles this problem in collaboration with Prof. Harada's laboratory (Osaka University). Our approach is based on physics simulation and zero-shot transfer from simulation to the real world. As an example task, we show industrial picking of tangled objects (see attached video). We train a deep learning model to answer the question “How are objects tangled when the robot picks objects from bins?” Training is done entirely in simulation. As a result, the robot can pick several industrial parts with complex shapes without any real-world training data. We also solve another difficult industrial task, peg-in-hole insertion, without real-world training data. Our research results show significant cost savings in teaching robots.

[1] R. Matsumura, Y. Domae, W. Wan and K. Harada, ''Learning Based Robotic Bin-picking for Potentially Tangled Objects,'' 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, 2019, pp. 7990-7997, doi: 10.1109/IROS40897.2019.8968295.
[2] C. C. Beltran-Hernandez, D. Petit, I. G. Ramirez-Alpizar, T. Nishi, S. Kikuchi, T. Matsubara, and K. Harada. ''Learning Force Control for Contact-rich Manipulation Tasks with Rigid Position-controlled Robots.'' IEEE Robotics and Automation Letters (with IROS option), 5(4):5709–5716, Oct. 2020.
[3] C. C. Beltran-Hernandez, D. Petit, I. G. Ramirez-Alpizar, and K. Harada. ''Variable Compliance Control for Robotic Peg-in-Hole Assembly: A Deep Reinforcement Learning Approach.'' Applied Sciences, 10(19):6923, Oct. 2020.
[4] C. C. Beltran-Hernandez, D. Petit, I. G. Ramirez-Alpizar, and K. Harada. ''Accelerating Robot Learning of Contact-Rich Manipulations: A Curriculum Learning Study.'' arXiv preprint arXiv:2204.12844, 2022.