Precision home robots learn with real-to-sim-to-real

Rachel Gordon | MIT CSAIL • July 31, 2024

At the top of many automation wish lists is a particularly time-consuming task: chores. 

The moonshot of many roboticists is cooking up the proper hardware and software combination so that a machine can learn “generalist” policies (the rules and strategies that guide robot behavior) that work everywhere, under all conditions. Realistically, though, if you have a home robot, you probably don’t care much about it working for your neighbors. With that in mind, researchers at the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) set out to find a way to easily train robust robot policies for very specific environments.

“We aim for robots to perform exceptionally well under disturbances, distractions, varying lighting conditions, and changes in object poses, all within a single environment,” says Marcel Torne Villasevil, MIT CSAIL research assistant in the Improbable AI lab and lead author on a recent paper about the work. “We propose a method to create digital twins on the fly using the latest advances in computer vision. With just their phones, anyone can capture a digital replica of the real world, and the robots can train in a simulated environment much faster than the real world, thanks to GPU parallelization. Our approach eliminates the need for extensive reward engineering by leveraging a few real-world demonstrations to jump-start the training process.”

Taking your robot home

The team’s system, dubbed RialTo, is of course a little more complicated than a simple wave of a phone and (boom!) home bot at your service. It begins by using your device to scan the target environment with tools like NeRFStudio, ARCode, or Polycam. Once the scene is reconstructed, users can upload it to RialTo’s interface to make detailed adjustments, add the joints needed for articulated parts like drawers and cabinet doors, and more.
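The article describes this hand-off only at a high level, but a minimal sketch in Python helps fix the idea of turning a scan into an editable, articulated scene. Everything here (the mesh paths, the SceneObject and Joint classes, and the prismatic "drawer_slide" joint) is an illustrative assumption, not RialTo’s actual data model or interface.

from dataclasses import dataclass, field

@dataclass
class Joint:
    """A hypothetical articulation added during the editing step."""
    name: str
    joint_type: str   # e.g., "revolute" for a cabinet door, "prismatic" for a drawer
    limits: tuple     # motion range in radians or meters

@dataclass
class SceneObject:
    name: str
    mesh_path: str    # mesh reconstructed from the phone scan
    joints: list = field(default_factory=list)

def build_sim_scene(scan_dir: str) -> list:
    """Assemble an editable scene from a phone scan (paths are placeholders)."""
    counter = SceneObject("counter", f"{scan_dir}/counter.obj")
    drawer = SceneObject("drawer", f"{scan_dir}/drawer.obj")
    # The user marks the drawer as articulated so the simulator can open and close it.
    drawer.joints.append(Joint("drawer_slide", "prismatic", (0.0, 0.35)))
    return [counter, drawer]

if __name__ == "__main__":
    for obj in build_sim_scene("scans/kitchen"):
        print(obj.name, [j.joint_type for j in obj.joints])

The point of the editing pass is exactly this kind of annotation: the raw scan supplies geometry, and the user supplies the articulation the simulator needs.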

The refined scene is exported and brought into the simulator. Here, the aim is to develop a policy based on real-world actions and observations, such as one for grabbing a cup on a counter. These real-world demonstrations are replicated in the simulation, providing some valuable data for reinforcement learning. “This helps in creating a strong policy that works well in both the simulation and the real world. An enhanced algorithm using reinforcement learning helps guide this process, to ensure the policy is effective when applied outside of the simulator,” says Torne.
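As a rough illustration of the jump-start idea, the toy loop below seeds a replay buffer with demonstration transitions and then keeps mixing in simulated rollouts while updating a one-parameter policy. The environment, reward, and update rule are stand-ins chosen only so the script runs; none of this is RialTo’s actual algorithm, which relies on GPU-parallel simulation and a proper reinforcement-learning update.

import random

def collect_demo_transitions(num_demos: int = 5) -> list:
    """Pretend these (state, action, reward) tuples came from replayed human demos."""
    return [(random.random(), random.random(), 1.0) for _ in range(num_demos * 20)]

def simulate_rollout(policy_param: float, horizon: int = 20) -> list:
    """One simulated episode; in practice many of these run in parallel on a GPU."""
    return [(random.random(), policy_param * random.random(), random.random())
            for _ in range(horizon)]

def update_policy(policy_param: float, batch: list, lr: float = 0.01) -> float:
    """Placeholder update: nudge the parameter toward actions that earned reward."""
    grad = sum(r * (a - policy_param) for _, a, r in batch) / len(batch)
    return policy_param + lr * grad

replay_buffer = collect_demo_transitions()   # demonstrations seed the buffer
policy_param = 0.0
for step in range(1000):
    replay_buffer += simulate_rollout(policy_param)
    batch = random.sample(replay_buffer, 32)
    policy_param = update_policy(policy_param, batch)
print(f"final policy parameter: {policy_param:.3f}")

Seeding the buffer with demonstrations is the sense in which a few real-world examples reduce the need for reward engineering, as described above: the demos show the learner what success looks like before exploration takes over.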

Testing showed that RialTo created strong policies for a variety of tasks, whether in controlled lab settings or more unpredictable real-world environments, with a 67 percent improvement over imitation learning using the same number of demonstrations. The tasks involved opening a toaster, placing a book on a shelf, putting a plate on a rack, placing a mug on a shelf, opening a drawer, and opening a cabinet. For each task, the researchers tested the system’s performance under three increasing levels of difficulty: randomizing object poses, adding visual distractors, and applying physical disturbances during task execution. When paired with real-world data, the system outperformed traditional imitation-learning methods, especially in situations with lots of visual distractions or physical disruptions.
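The three difficulty levels can be pictured as an evaluation harness like the sketch below. The environment hooks (randomize_object_poses, add_visual_distractors, apply_disturbance) and the step interface are hypothetical names, and treating the levels as cumulative is an assumption made only for illustration.

def evaluate(policy, env, episodes: int = 20,
             randomize_poses: bool = False,
             distractors: bool = False,
             disturbances: bool = False) -> float:
    """Success rate of a policy under a chosen set of perturbations (hypothetical env API)."""
    successes = 0
    for _ in range(episodes):
        obs = env.reset()
        if randomize_poses:
            obs = env.randomize_object_poses()
        if distractors:
            obs = env.add_visual_distractors()
        done = False
        success = False
        while not done:
            if disturbances:
                env.apply_disturbance()   # e.g., bump the object mid-task
            obs, done, success = env.step(policy(obs))
        successes += int(success)
    return successes / episodes

# The three levels from the experiments, treated here as cumulative:
levels = [
    dict(randomize_poses=True),
    dict(randomize_poses=True, distractors=True),
    dict(randomize_poses=True, distractors=True, disturbances=True),
]
# Usage (with a real policy and environment): [evaluate(policy, env, **lvl) for lvl in levels]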

“These experiments show that if we care about being very robust to one particular environment, the best idea is to leverage digital twins instead of trying to obtain robustness with large-scale data collection in diverse environments,” says Pulkit Agrawal, director of Improbable AI Lab, MIT electrical engineering and computer science (EECS) associate professor, MIT CSAIL principal investigator, and senior author on the work.

As for limitations, RialTo currently takes three days to fully train a policy. To speed this up, the team points to improving the underlying algorithms and using foundation models. Training in simulation also has its limits: effortless sim-to-real transfer remains difficult, as does simulating deformable objects or liquids.

The next level

So what’s next for RialTo’s journey? Building on previous efforts, the scientists are working on preserving robustness against various disturbances while improving the model’s adaptability to new environments. “Our next endeavor is to refine this approach using pre-trained models, accelerating the learning process, minimizing human input, and achieving broader generalization capabilities,” says Torne.

“We’re incredibly enthusiastic about our 'on-the-fly' robot programming concept, where robots can autonomously scan their environment and learn how to solve specific tasks in simulation. While our current method has limitations — such as requiring a few initial demonstrations by a human and significant compute time for training these policies (up to three days) — we see it as a significant step towards achieving 'on-the-fly' robot learning and deployment,” says Torne. “This approach moves us closer to a future where robots won’t need a preexisting policy that covers every scenario. Instead, they can rapidly learn new tasks without extensive real-world interaction. In my view, this advancement could expedite the practical application of robotics far sooner than relying solely on a universal, all-encompassing policy.”

“To deploy robots in the real world, researchers have traditionally relied on methods such as imitation learning from expert data, which can be expensive, or reinforcement learning, which can be unsafe,” says Zoey Chen, a computer science PhD student at the University of Washington who wasn’t involved in the paper. “RialTo directly addresses both the safety constraints of real-world RL [reinforcement learning], and efficient data constraints for data-driven learning methods, with its novel real-to-sim-to-real pipeline. This novel pipeline not only ensures safe and robust training in simulation before real-world deployment, but also significantly improves the efficiency of data collection. RialTo has the potential to significantly scale up robot learning and allows robots to adapt to complex real-world scenarios much more effectively.”

"Simulation has shown impressive capabilities on real robots by providing inexpensive, possibly infinite data for policy learning,” adds Marius Memmel, a computer science PhD student at the University of Washington who wasn’t involved in the work. “However, these methods are limited to a few specific scenarios, and constructing the corresponding simulations is expensive and laborious. RialTo provides an easy-to-use tool to reconstruct real-world environments in minutes instead of hours. Furthermore, it makes extensive use of collected demonstrations during policy learning, minimizing the burden on the operator and reducing the sim2real gap. RialTo demonstrates robustness to object poses and disturbances, showing incredible real-world performance without requiring extensive simulator construction and data collection.”

Torne wrote this paper alongside senior authors Abhishek Gupta, assistant professor at the University of Washington, and Agrawal. Four other CSAIL members are also credited: EECS PhD student Anthony Simeonov SM ’22, research assistant Zechu Li, undergraduate student April Chan, and Tao Chen PhD ’24. Improbable AI Lab and WEIRD Lab members also contributed valuable feedback and support in developing this project. 

This work was supported, in part, by the Sony Research Award, the U.S. government, and Hyundai Motor Co., with assistance from the WEIRD (Washington Embodied Intelligence and Robotics Development) Lab. The researchers presented their work at the Robotics: Science and Systems (RSS) conference earlier this month.
