Researchers from the Toyota Research Institute, Columbia University, and MIT have trained robots to pick up new tasks quickly by combining a technique called imitation learning with generative AI. Their work extends generative AI beyond text, images, and video into the realm of robot movements.
Companies such as Covariant have also used generative AI in their robotics projects. Covariant’s RFM-1 model is multimodal: it accepts prompts as text, images, video, robot instructions, or measurements. This enables robots to understand instructions and generate corresponding images or videos.
One key factor in improving robot skills is the availability of data. While large AI models like GPT-4 benefit from vast amounts of internet data, robots need data collected specifically for them. Initiatives like the Open X-Embodiment Collaboration by Google DeepMind aim to address this by pooling data from many different robots so that each can learn from the others.
The results so far suggest that more data leads to more capable robots. Researchers developed the RT-X model, which can be run locally or accessed via the web and was pretrained on internet data to build a foundational visual understanding. When tested on a range of different robots, RT-X learned skills 50% more successfully than the systems each individual lab had developed on its own.
Deeper Learning
Generative AI can turn cherished memories into realistic photos of moments that were never captured on camera.
Take Maria, for example, who grew up in Barcelona during the 1940s. Her childhood memories of her father are vivid, including visits to a neighbor’s apartment whose balcony overlooked the prison where he was held for political reasons. No photo exists of Maria on that balcony, but generative AI can now produce a reconstruction of the scene based on her memory.