Recently, many domestic companies have achieved technological breakthroughs in "large model + robot".
Industry observers believe that as the technology advances and application scenarios expand, demand for multimodal large models and robots will keep growing, opening up broad market space for companies. Cooperation with other sectors, such as healthcare and manufacturing, is also expected to bring new opportunities by enabling a wider range of application scenarios and greater business value.
Multimodal robots achieve technological breakthroughs.
As of December 13, many robot concept stocks, including Buke shares, Efort, and Green Harmonic, had risen more than 4%. On the news front, Tesla released the Optimus-Gen 2 (second-generation Optimus) humanoid robot, which is equipped with Tesla-designed actuators and sensors, walks 30% faster, and has improved balance and whole-body control.
"Multimodal" AI refers to large models that can process multiple forms of content, such as text, audio, and images. As multimodal large models iterate rapidly, international manufacturers have kept a close eye on their applications in robotics and have explored core tasks such as robot planning, control, and navigation.
He Li, general manager of Zhishan Investment, told the reporter: "Multimodal large models integrate vision, speech, and sensor data processing technology, which greatly enriches the cognitive and decision-making capabilities of robots. Applying this technology to robots is expected to enable significant advances in complex interaction, natural language understanding, and environmental adaptation, and to unlock their potential as highly autonomous assistants or laborers."
Some domestic enterprises have already taken the lead in this field. On the evening of December 12, Obi Zhongguang released its large model robotic arm 1.0 product, which takes voice prompts as input and uses the comprehension and visual perception capabilities of several large models to generate spatial semantic information, so that the robotic arm can understand instructions and carry them out. In a demonstration disclosed at the same time, the robotic arm successfully completed a series of voice commands, including "put the green square in the yellow box" and "please return to the original state".
Xiao Zhenzhong, co-founder and CTO of Obi Zhongguang, told the reporter: "The company hopes to bring the large model robotic arm into real-world scenarios through engineering research. This includes improving the arm's ability to automatically route around complex obstacles to complete a first class of instructions, solving the generalization problem of 'large model + robotic arm', and ultimately deploying it in general-purpose scenarios."
According to incomplete statistics, listed companies including Thunderda and Yijiahe have recently disclosed progress in the research and development of robots based on multimodal large models.
It will still take time for large-scale commercial adoption.
China's robot industry already has a certain industrial foundation. Multimodal robots with smarter "brains" and more flexible limbs are becoming a new track on which multiple parties are competing for the industries of the future.
He Li believes that domestic companies have been actively investing in the research, development, and production of key technical links, especially sensors, precision mechanical components, actuators, innovative materials, and lightweight structural parts, and that the sector is showing vigorous momentum.
The harmonic reducer is a core component of industrial robots. Green Harmonic has disclosed that it completed research and development of harmonic reducer technology for industrial robots and achieved large-scale production early on, taking the lead in replacing imported products in this field, which has greatly reduced procurement costs and lead times for domestic robot companies. Its new-generation Y series harmonic reducer has doubled its stiffness index compared with other existing products through innovation in mathematical modeling and optimization of bearing design and processing technology.
However, some industry insiders believe that "multimodal + robot" is still in the development stage and that many challenges remain before it can be commercialized.
"First, the technology is still immature and there are technical bottlenecks. For example, the interpretability, stability, and security of the models need further improvement. Second, the R&D and production costs of large models and robots are high, and so are the maintenance and operating costs, requiring a lot of human and material resources," Guo Tao said.
Xiao Zhenzhong agreed. He told the reporter: "Combining large language models (LLMs) with visual sensing will allow all kinds of robots and robotic arms to be deployed in more scenarios, such as industrial manufacturing, flexible logistics, and commercial services. At present, there is still a gap between large models and real-world data, and large models also consume a great deal of computing power. It will take three to five years for applications to be gradually implemented, and it may take longer for the business to mature."
But the company firmly believes that this is the right direction and that it has a bright future. Xiao Zhenzhong said that Obi Zhongguang is building a robot and AI vision platform that combines multimodal vision models and intelligent algorithms with robot vision sensors to form a complete product solution for autonomous positioning, navigation, and obstacle avoidance, as it prepares for the era of intelligent robots.