Robotic hand solves Rubik’s Cube by learning how to learn about it

Can you solve a Rubik’s Cube? How about with one hand?

That’s what artificial intelligence (AI) research company OpenAI has taught a robot to do: using neural networks but leaving it up to the system to figure out how to overcome hurdles, it’s taught a human-like, robotic hand to solve the puzzle single-handedly.

This isn’t the first time that a robot has solved the Rubik’s Cube. In June 2019, an MIT robot – fast as greased pistons, but not at all human-hand-like – did it in the record-shattering time of 0.38 seconds. (Compare that with the human record, which is held by Yusheng Du, who solved it in 3.47 seconds in 2018.)

The company said on Tuesday that it’s been trying to train a human-like, robotic hand to solve the puzzle since May 2017. It chose the Rubik’s Cube because solving it is a complex manipulation task that lays the groundwork for general-purpose robots to do all manner of other tasks.

OpenAI solved the Rubik’s Cube, in simulation, in July 2017. But as of July 2018, it had only managed to get the real-world robot to manipulate a block. Now, it’s reached its initial goal of teaching the robot to solve the puzzle – at least, some of the time.

Challenges

Solving a Rubik’s Cube can be tough even for humans. It requires a great deal of dexterity and can take years to master. OpenAI’s robotic hand is still perfecting its technique and is now solving the cube 60% of the time, at best, when it’s been scrambled with only 15 rotations. When scrambling the cube for maximum difficulty, with 26 face rotations, that drops to 20%.

The researchers didn’t tell the hand how to move in order to get to a solved cube. They did modify the cube slightly so it could tell which way up it was held: specifically, they cut out a small piece of each center cubelet’s colorful sticker so as to break what they called its “rotational symmetry.”

OpenAI says the biggest challenge was to create environments in simulation that were diverse enough to capture the physics of the real world, such as friction on the fingers, how easily the cube’s faces turn, and the weight of the cube.

Techniques in robotics haven’t been able to scale to that complexity that we see in a robotic hand. Humans have evolved to be able to manipulate and operate our hands. So there’s a huge amount of learning that’s happened to get to this place as a species, and the robot has to learn all of this from scratch.

Instead of trying to write every single one of an infinite number of dedicated algorithms to operate the hand in an environment that throws up unpredictable hurdles, OpenAI took a different approach. The team created thousands of simulated environments and trained the system to do the task in all of them. But given that you can’t possibly simulate every single complication that might arise when you’re solving tasks in the real, physical world, OpenAI created a new AI training method, called Automatic Domain Randomization (ADR), that endlessly generates progressively more difficult environments in simulation.

This frees us from having an accurate model of the real world, and enables the transfer of neural networks learned in simulation to be applied to the real world.

Every time the hand got good at the task outside of the simulation, they threw in more disruptions so that it would eventually learn to be robust at tasks in the real world. Disruptions like, say, putting a rubber glove on the hand. Or nudging it with another hand. Or poking it with a stuffed giraffe. As training progressed, they randomized all the parameters, such as the mass of the cube, the friction of the robot fingers, and the visual surface materials of the hand.
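To make the idea of progressively harder environments concrete, here is a minimal Python sketch of the general pattern, not OpenAI’s actual implementation. It assumes each simulated parameter starts at a single nominal value and that its sampling range is widened whenever the measured success rate clears a threshold. The names (ParamRange, adr_step, sample_environment), the parameter values, the step size and the 0.8 threshold are all hypothetical, and widening every range at once is a simplification chosen for brevity.

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    """Sampling interval for one simulated physical parameter (hypothetical)."""
    low: float
    high: float

    def sample(self, rng: random.Random) -> float:
        # Draw one value uniformly from the current interval.
        return rng.uniform(self.low, self.high)

    def widen(self, step: float) -> None:
        # Grow the interval on both sides; clamp at zero so that
        # quantities such as mass and friction stay non-negative.
        self.low = max(0.0, self.low - step)
        self.high += step


def adr_step(ranges, success_rate, step=0.05, threshold=0.8):
    """Widen every range if the policy is doing well; return True if widened."""
    if success_rate < threshold:
        return False
    for param_range in ranges.values():
        param_range.widen(step)
    return True


def sample_environment(ranges, rng):
    """Draw one randomized environment from the current ranges."""
    return {name: r.sample(rng) for name, r in ranges.items()}


# Start with no randomization at all: every range is a single point.
# The nominal values below are made up for illustration.
rng = random.Random(0)
ranges = {
    "cube_mass_kg": ParamRange(0.09, 0.09),
    "finger_friction": ParamRange(1.0, 1.0),
}

# Made-up success rates standing in for real training measurements.
for success_rate in (0.5, 0.85, 0.9):
    adr_step(ranges, success_rate)

env = sample_environment(ranges, rng)
print(ranges)
print(env)
```

In this toy run the first success rate is below the threshold, so nothing changes. The next two each widen the ranges, so later environments are drawn from a broader and therefore harder distribution than the one the training started with.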

OpenAI researchers found that when trained with ADR, the system turned out to be “surprisingly robust” to having its task messed with, successfully dealing with situations that they’d never trained it to handle.

The robot can successfully perform most flips and face rotations under all tested perturbations, though not at peak performance.

What’s next?

In speaking with the BBC, Prof. Ken Goldberg, from UC Berkeley, said that OpenAI’s results shouldn’t be overstated, despite what he called its impressive act of “showmanship”.

The average human isn’t particularly good at solving Rubik’s cubes. So when they see a robot doing it, they say, ‘Well, this is better than a human’. But that’s a little deceptive, because games are not reality.

When it comes to robots taking away jobs from people whose innate dexterity enables them to perform complex tasks, Goldberg said that we can relax. That’s likely a few decades off, he said.

We’re far from being able to replace kitchen workers who chop up vegetables, or even pick up and you know, do dishwashing. All those are very complex tasks.

To read more about its work, check out OpenAI’s paper. The BBC notes that the paper wasn’t peer-reviewed, though experts the publication spoke with didn’t dispute its details.