Chen Runsheng, Academician of the Chinese Academy of Sciences: Large Language Models Show "Emergence" and "Epiphany" Phenomena

  Guangming Online news: Since the beginning of this year, large language models such as ChatGPT and ERNIE Bot have surged in popularity, and the question of whether artificial intelligence (AI) can surpass human beings has sparked heated discussion. On May 29, the "2023 China Computing Power Development Seminar", organized by the China Intelligent Computing Industry Alliance, was held at the Institute of Computing Technology, Chinese Academy of Sciences. The theme of the seminar was the opportunities and challenges for computing power in the era of ChatGPT. It brought together many authoritative experts and scholars from the industry for in-depth exchanges on technology, ecosystem, and other dimensions, to assess development trends in the computing power industry and to propose strategies for building today's AI infrastructure and computing power services.

  At the seminar, Chen Runsheng, an academician of the Chinese Academy of Sciences, said that human beings cannot stop the development of artificial intelligence, and that this is the nature of scientific progress. At the same time, he pointed out that the phenomena of "emergence" and "epiphany" (also called "grokking") in large language models deserve careful consideration.

  What is "emergence"? A complex system is made up of many tiny individual units that come together and interact with each other. When their number is large enough, the system shows special behavior that cannot be explained by the individual units alone. This is called "emergence". Chen Runsheng explained it vividly: "I gave it (the large model) a lot of training data, and as a result, its answers contain things that are not in the training data. This phenomenon is called emergence." Experiments with large models show that emergence appears when the training data is large enough (for example, more than 100 billion) and does not occur at small scale.

  It should be noted that emergence is currently controversial in the scientific community. For example, a professor at Stanford University argues that it is a problem of metrics, that is, a matter of how the measurement is made and which basic coordinate system is used.

  "The emergence that appears in natural language processing as the overall amount of computation grows rapidly is a new problem, and it is worth considering," Chen Runsheng said.

  What is an "epiphany"? Chen Runsheng explained, "When training a neural network, it does not understand the first time, the second time, the fourth time, or the fifth time. It is just like a child learning something: the child does not understand after being taught once or twice, and then suddenly gets it on the (N+1)th time."

  He believes this mirrors how the human brain learns, "coming to understand at a certain moment". "Conventional computers cannot have epiphanies, but large models can," he said.

  Not long ago, Claude, one of ChatGPT's main competitors, expanded its context window to 100,000 tokens, equivalent to about 75,000 words, far exceeding the 8,192-token context window of GPT-4. This means that users can upload up to 500 pages of documents to Claude, which can read and digest the information in less than one minute and answer users' questions based on the uploaded material.

  Claude was launched by Anthropic, a company founded by former OpenAI employees. Since the end of 2022, Google has invested nearly $400 million in the company.

  In this regard, Chen Runsheng believes that large models are currently learning much faster than expected. "These two companies (OpenAI and Anthropic) are chasing each other. Maybe after a while GPT-5 will be stronger than Claude, and development will be so fast that people will not be able to keep up in the future."

  "What is more troubling is that these large-model companies are considering letting their models control third-party equipment," Chen Runsheng said. "Being able to control third-party equipment is worrying. If they end up controlling things related to security and national defense, it will be terrible."

  Chen Runsheng acknowledged that the structure of the human neural network is far more complex than that of today's large models, and that the development of artificial intelligence still has a long way to go. "The current (artificial intelligence) neural network needs revolutionary changes in its spatial structure. Maybe at that point (AI) can really surpass human intelligence." (Reporter Zhan Zhao)