Computer scientists at The University of Texas at Austin have enabled an artificial intelligence agent to do something that previously only humans could do.
The research was carried out by a team led by Professor Kristen Grauman, Ph.D. candidate Santhosh Ramakrishnan, and former Ph.D. candidate Dinesh Jayaraman.
Most AI agents are trained for very specific tasks, such as recognizing an object or estimating its volume. The agent developed by Grauman and Ramakrishnan, by contrast, gathers general visual information that can be used for a wide range of tasks.
“We want an agent that’s generally equipped to enter environments and be ready for new perception tasks as they arise,” Grauman said. “It behaves in a way that’s versatile and able to succeed at different tasks because it has learned useful patterns about the visual world.”
The scientists used deep learning, a type of machine learning inspired by the brain’s neural networks, to train their agent on thousands of 360-degree images of different environments.
“Just as you bring in prior information about the regularities that exist in previously experienced environments, like all the grocery stores you have ever been to, this agent searches in a nonexhaustive way,” Grauman said. “It learns to make intelligent guesses about where to gather visual information to succeed in perception tasks.”
One of the main challenges the scientists set for themselves was to design an agent that can work under tight time constraints. This would be critical in a search-and-rescue application. For example, in a burning building, a robot would be called upon to quickly locate people, flames and hazardous materials and relay that information to firefighters.
For now, the new agent operates like a person standing in one spot, able to point a camera in any direction but not to move to a new position. Equivalently, the agent could gaze upon an object it is holding and decide how to turn the object to inspect another side of it. Next, the researchers are developing the system further to work on a fully mobile robot.
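The article does not detail how the agent chooses where to look, so the following is only an illustrative sketch of the general idea: an agent with a fixed glimpse budget that sequentially picks camera angles on a 360-degree panorama to see as much of the scene as possible. A simple greedy coverage rule stands in for the learned policy; the function name and the column-based "field of view" are assumptions for illustration, not the team's actual method.

```python
import numpy as np

def select_glimpses(panorama, budget=4, fov=3):
    """Greedy stand-in for a learned look-around policy.

    At each step, pick the viewing angle whose field of view reveals
    the most not-yet-seen columns of the panorama. `panorama` is an
    (H, W) array; a glimpse at `angle` reveals `fov` columns centred
    on that angle, with wrap-around at the image edges.
    """
    H, W = panorama.shape
    seen = np.zeros(W, dtype=bool)
    glimpses = []
    for _ in range(budget):
        best_angle, best_gain = 0, -1
        for angle in range(W):
            cols = [(angle + d) % W for d in range(-(fov // 2), fov // 2 + 1)]
            gain = sum(not seen[c] for c in cols)
            if gain > best_gain:
                best_angle, best_gain = angle, gain
        glimpses.append(best_angle)
        for d in range(-(fov // 2), fov // 2 + 1):
            seen[(best_angle + d) % W] = True
    # Return the chosen angles and the fraction of the scene observed.
    return glimpses, seen.mean()
```

In the actual system this selection rule is replaced by a deep network trained with reinforcement learning, which learns from experience which glimpses tend to be informative rather than relying on a hand-coded coverage heuristic.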
Using supercomputers at UT Austin’s Texas Advanced Computing Center and in the Department of Computer Science, the researchers trained their agent in about a day with an artificial intelligence approach called reinforcement learning. Under Ramakrishnan’s leadership, the team developed a method for speeding up the training: building a second agent, called a sidekick, to assist the primary agent.
“Using extra information that’s present purely during training helps the (primary) agent learn faster,” Ramakrishnan said.