Following a run of large models that outperform existing systems (OpenAI's DALL-E 2, Google's PaLM and LaMDA 2, and DeepMind's own Chinchilla and Flamingo), DeepMind has presented another large AI model: Gato.
However, DeepMind's Gato is different: the model cannot write text better, describe images better, play Atari games better, control a robot arm better, or orient itself in three-dimensional space better than other AI systems. But Gato can do all of these things.
A lead researcher at Google's DeepMind AI division says artificial intelligence is close to reaching human-level intelligence. Dr. Nando de Freitas said that the decades-long quest to realize artificial general intelligence (AGI) is coming to an end, now that DeepMind has an AI model capable of a wide range of complex tasks, from stacking blocks to writing poetry.
De Freitas described Gato as a "generalist agent" that only needs to be scaled up to create an AI comparable to human intelligence.
The Next Web had previously published an opinion piece arguing that "humans will never achieve AGI". DeepMind's research director responded that, in his view, AGI is inevitable.
"Now it's all about scale! The game is over! It's all about making these models bigger, safer, more compute-efficient, faster at sampling, with smarter memory, more modalities, innovative data, online/offline... Solving these challenges is the key to AGI," he wrote in a tweet.
DeepMind trained the Transformer-based model on images, text, proprioception, joint torques, button presses, and other discrete and continuous observations and actions, so that it acquires a broad range of skills. During training, all of this data is serialized into a single sequence of tokens and processed by the Transformer network, much as in a large language model.
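To make the idea of a single token sequence concrete, here is a minimal sketch of how text tokens and continuous values such as joint torques could share one integer vocabulary. The vocabulary size, the number of bins, the mu-law-style compression and its constants, and the helper names `tokenize_continuous` and `build_sequence` are illustrative assumptions, not details taken from this article.

```python
import math

TEXT_VOCAB_SIZE = 32_000  # assumption: ids 0..31999 are reserved for text
NUM_BINS = 1_024          # assumption: continuous values fall into 1024 bins


def mu_law(x, mu=100.0, m=256.0):
    """Compress a real value so large magnitudes occupy less of the range."""
    return math.copysign(math.log(abs(x) * mu + 1.0) / math.log(m * mu + 1.0), x)


def tokenize_continuous(values):
    """Map each real value to an integer token placed after the text vocabulary."""
    tokens = []
    for v in values:
        squashed = max(-1.0, min(1.0, mu_law(v)))  # clip to [-1, 1]
        bin_index = min(NUM_BINS - 1, int((squashed + 1.0) / 2.0 * NUM_BINS))
        tokens.append(TEXT_VOCAB_SIZE + bin_index)
    return tokens


def build_sequence(text_token_ids, observation, action):
    """Flatten text, observation and action into one token sequence."""
    return (
        list(text_token_ids)
        + tokenize_continuous(observation)
        + tokenize_continuous(action)
    )


# Hypothetical example: two text tokens, one proprioception value, one torque.
sequence = build_sequence([5, 17], observation=[0.0], action=[1e9])
print(sequence)
```

Once everything is a list of integers, the same network can be trained on it regardless of whether a token originally stood for a word, a sensor reading, or a motor command.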
The team then tested Gato on 604 different tasks. On more than 450 of them, the model reached more than 50% of the expert score in the benchmark. That is still far behind specialized AI models, which can reach expert level.
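As a quick check of what those figures mean, the share of tasks above the threshold can be worked out directly. The variable names are illustrative, and because the article says "more than 450", the result is a lower bound.

```python
tasks_total = 604            # tasks Gato was tested on
tasks_above_threshold = 450  # lower bound: "more than 450" in the article

share = tasks_above_threshold / tasks_total
print(f"At least {share:.1%} of tasks cleared the threshold")
```

In other words, Gato clears the threshold on roughly three quarters of the tasks.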
Gato has only 1.18 billion parameters. Compared with the 175-billion-parameter GPT-3, the huge 540-billion-parameter PaLM, or the 70-billion-parameter Chinchilla, it is very small.
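A short calculation using only the parameter counts quoted above puts the size gap in perspective. The dictionary and variable names are illustrative.

```python
# Parameter counts in billions, as quoted in the article.
gato_params = 1.18
other_models = {"GPT-3": 175.0, "PaLM": 540.0, "Chinchilla": 70.0}

ratios = {name: params / gato_params for name, params in other_models.items()}

for name, ratio in ratios.items():
    print(f"{name} is roughly {ratio:.0f} times larger than Gato")
```

Even Chinchilla, the smallest of the three, has close to 60 times as many parameters as Gato.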
According to the team, Gato's small size is mainly due to the response time required by the Sawyer robot arm used in the experiments: with current hardware and the current architecture, a larger model would be too slow to perform the robot tasks.
However, the team said these limitations can be easily addressed with new hardware and architectures. A larger Gato model could be trained on more data and might perform the various tasks better.
Ultimately, the team said, this could lead to a generalist AI model that replaces specialized models, as the history of AI research suggests. The team cites AI researcher Richard Sutton's "bitter lesson", pointing out: "Historically, general models that are better at using computation have also tended to eventually overtake more specialized, domain-specific methods."
DeepMind also shows that Gato's performance improves as the number of parameters grows. In addition to the large model, the team trained two smaller models with 79 million and 364 million parameters. Average performance increased linearly with the number of parameters, at least on the benchmarks tested.
When machine learning researcher Alex Dimakis asked how far Gato is from passing a real Turing test (a measure of computer intelligence that requires a human to be unable to distinguish a machine from another person), de Freitas replied: "Far still."
In response to further questions from AI researchers on Twitter, Dr. de Freitas said that safety is of the utmost importance when developing AGI. "This is probably the biggest challenge we face," he wrote. "Everyone should think about it. The lack of sufficient diversity also worries me very much."
DeepMind describes Gato as follows:
"The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens."