Home Helper: Startup’s Robot Can Tidy Up a Messy House

Preferred Networks discussed its home cleaning robots at the GPU Technology Conference.
by Scott Martin

The robot rolls up to a towel, drops an arm to grasp it and then scoots along to release it in a laundry bin. It zips up to pens scattered across the floor, grabs them and then places them into a box.

Take a break, Roomba. Preferred Networks has been developing this home cleaning robot since early last year, and it’s moving us closer to a reality of robots as home helpers.

The company’s goal is to create intuitive and interactive personal robots capable of multiple tasks. It aims to launch them for consumers in Japan by 2023 and would like to be in the U.S. after that.

Tokyo-based Preferred Networks — Japan’s largest startup by valuation — this week discussed its home cleaning robot aimed at consumers at the GPU Technology Conference.

And it’s got skills.

The Preferred Networks home robot can take cleaning commands, understand hand gestures and recognize more than 300 objects.

It can map locations of objects in a room using object detection from convolutional neural networks the company developed. Plus, it can place items back where they belong and even tell forgetful humans where objects are located.

“We’re focusing on personal robots, allowing the robots to work in a home environment or maybe an office environment or maybe some restaurants or bars,” said Jun Hatori, a software engineer at Preferred Networks.

The robots are built on Toyota’s HSR robotics platform and run on a computer with an NVIDIA GeForce GTX 1080 Ti GPU.


Beefy Vision Brains

Its robot packs powerful object detection. The developers used the Open Images Detection Dataset Version 4, which included 1.7 million annotated images with 12 million bounding boxes.

Its base convolutional neural network model was trained using 512 NVIDIA V100 GPUs and won second prize at the Google AI Open Images Challenge in the object detection track in 2018.

But it still has training to do.

For that, they use the same 512-GPU NVIDIA cluster used in the Google competition, whose nodes are interconnected by Mellanox technology. For object detection, they use the ImageNet dataset. They also collected domain-specific data for the robot’s room setting and the objects, and used it to fine-tune the base network.

“We only support 300 objects so far, and that’s not enough. The system needs to be able to recognize almost everything at home,” said Hatori.

Chatty Plus Helpful

The Preferred Networks robot can speak a reply to many commands. It can connect a human command to objects mapped out in a room as well. For example, the system can take a spoken question like “Where is the striped shirt?” and tell the user where the shirt is located in the room.

Developers have encoded the spoken commands with an LSTM network and mapped them to real-world objects in the mapped room.
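To make the idea of connecting a question to a mapped object concrete, here is a minimal sketch. It is not the company’s method: it swaps the LSTM encoder for plain word overlap, and the names MappedObject, find_object, answer and ROOM, along with the sample room contents, are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class MappedObject:
    label: str     # name of a detected object, e.g. "striped shirt"
    location: str  # human-readable place in the room (made-up sample data)


def find_object(question, object_map):
    """Return the mapped object whose label shares the most words with the question.

    Assumption: word overlap stands in for the learned LSTM encoding.
    """
    words = set(re.findall(r"[a-z]+", question.lower()))
    best, best_score = None, 0
    for obj in object_map:
        score = len(words & set(obj.label.lower().split()))
        if score > best_score:
            best, best_score = obj, score
    return best


def answer(question, object_map):
    """Build a spoken-style reply, or admit the object is not in the map."""
    obj = find_object(question, object_map)
    if obj is None:
        return "I could not find that object."
    return f"The {obj.label} is {obj.location}."


# Hypothetical room map, as an object detector might have produced it.
ROOM = [
    MappedObject("striped shirt", "on the sofa"),
    MappedObject("blue pen", "under the desk"),
    MappedObject("towel", "next to the laundry bin"),
]

print(answer("Where is the striped shirt?", ROOM))
```

A learned encoder would handle paraphrases and unseen wording that simple word matching misses, which is presumably why the developers chose one.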

The system combines language interpretation with gesture. Users can point to a location in the room and give a command like “put that there.” Then the system can incorporate the user’s gesture with the spoken command.
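One way to picture how a pointing gesture could be resolved is to treat the arm as a ray and see where it meets the floor. The sketch below is an illustration under that assumption, not a description of the robot’s actual pipeline. The function names, the coordinate convention (meters, floor at z = 0) and the sample objects are all hypothetical.

```python
import math


def pointed_floor_spot(hand, direction):
    """Intersect a pointing ray with the floor plane z = 0.

    hand is the (x, y, z) position of the hand and direction is the
    (dx, dy, dz) pointing vector. Returns (x, y) on the floor, or None
    if the ray points level or upward and never reaches the floor.
    """
    hx, hy, hz = hand
    dx, dy, dz = direction
    if dz >= 0:
        return None
    t = -hz / dz  # ray parameter at which the height reaches zero
    return (hx + t * dx, hy + t * dy)


def nearest_object(spot, objects):
    """Pick the mapped object closest to the pointed spot ("that")."""
    return min(objects, key=lambda name: math.dist(spot, objects[name]))


# Hypothetical floor positions of mapped objects.
objects = {"pen": (1.2, 0.1), "towel": (3.0, 2.0)}

spot = pointed_floor_spot(hand=(0.0, 0.0, 1.0), direction=(1.0, 0.0, -1.0))
print(spot, nearest_object(spot, objects))
```

For a command like “put that there,” the same ray test could be run twice: once to pick the nearest object for “that” and once to get the drop-off spot for “there.”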

It’s just the start.

“We’d like to do more than the simple tidying up, probably much more — some other kinds of home chores as well,” Hatori said.