Note: This report is published in English. If the Korean and English tables of contents differ, the English version takes precedence. Please refer to the English table of contents for an accurate review.
KSA 25.09.26
Current progress in humanoid robotics is centered on optimizing vision-language-action (VLA) models, integrating multimodal data, and enhancing instruction comprehension as well as the ability to interpret human intent. Training relies heavily on world models, human video data, and VR-based remote training, with increasing emphasis on first-person perspectives to strengthen perception. While the ultimate goal is to achieve general-purpose humanoids, development remains constrained by significant challenges, leading Western and Chinese companies to pursue divergent technological pathways.
Key Highlights:
- Humanoid robotics focuses on optimizing vision-language-action (VLA) models and enhancing multimodal data integration.
- Improving instruction comprehension and human intent interpretation is a core development area.
- Training relies heavily on world models, human video data, and VR-based remote training, with growing emphasis on first-person perspectives.
- The ultimate goal is to achieve general-purpose humanoids, but major technical challenges persist.
- Western and Chinese companies are pursuing different technological pathways in response to these challenges.
Table of Contents
1. Vision Models as the Core of Robotic Perception
- Figure 1: Humanoid Robot Model Operation Framework
- Figure 2: Training Data for Humanoid Robots
- Table 1: Comparison of First-Person and Third-Person View Algorithms
- Figure 3: Apple HAT Model Overview
- Table 2: Summary of First-Person Datasets
2. Strategic Moves by Humanoid Robot Model Developers
- Figure 4: ViLLA Architecture
3. TRI's View