Prof. Roni Katzir

"ChatGPT isn't even close to being intelligent"

ChatGPT's language model opened the door to a new world of computer uses in a field that until recently seemed impenetrable. But according to Prof. Roni Katzir, head of the Lab for Computational Linguistics at Tel Aviv University, the technology is still "far from human cognition", and while it will have practical applications, he warns of the danger of it being used by those who want to do harm

In 1773, the French engineer Jacques de Vaucanson introduced the "Digesting Duck". The machine was made of gold-plated copper and included over four hundred parts designed to reproduce every bump and bone. Its movements were modeled on anatomical studies and meant to imitate those of natural ducks, and the machine, so it was described, walked like a duck, quacked like a duck, stretched its neck, flapped its wings and ate corn from the operator's palm like a duck. Finally, the highlight: it defecated like a duck. Vaucanson explained that all the processes were "copied from nature" and that the food was digested as it is in "real animals", by means of a small chemical laboratory that he placed in the heart of the machine. He would later use his invention to argue to researchers in the field of physiology that digestion is actually a mechanical process.
The duck became a sensation and a testament to a talented engineer's ability to reproduce a mechanism at the very base of the life process. The philosopher Voltaire, captivated by the way the duck blurred the line between natural and synthetic life, said at the time that without the duck there would be "nothing to remind us of the glory of France", and called Vaucanson "the rival of Prometheus", the Titan who gave man fire. Years later, the mechanism of the fraud was revealed: the corn never continued down the neck but remained at the base of the mouth tube, while breadcrumbs dyed green were released from a separate container.
Prof. Roni Katzir (Photo: Avigail Uzi)
Exactly 250 years have passed since the digesting duck's debut, and engineers continue to invent machines and, along the way, as is their habit, to be confused by the machines they have built, poetically attributing to them what they cannot do: the imitation of life. This time on the operating table are language, consciousness and human intelligence, which are being reconceptualized by way of large language models. The most famous of them, ChatGPT, has grabbed the headlines and the public imagination in recent months and, according to some, become evidence of a machine that knows language and possesses an intelligence fundamentally similar to a human's. "Given the breadth and depth of GPT-4's capabilities, we believe that it could reasonably be viewed as an early (yet still incomplete) version of an artificial general intelligence (AGI) system," a team of researchers from Microsoft, the largest investor in OpenAI, wrote about GPT-4. "[The models are] the most powerful technology humanity has yet developed," OpenAI CEO Sam Altman said (apparently forgetting antibiotics, electricity, the light bulb, the Internet and so on) of the technology that he believes will ultimately represent "the collective power, and creativity, and will of humanity."
"ChatGPT isn’t even close to being intelligent," Prof. Roni Katzir says emphatically in an interview with Calcalist. "We humans recognize patterns and generalize very well - in a sense we are born scientists - but ChatGPT and the other current models do not understand and are not good at finding generalizations. By the way, it is not that there is any reason to think that machines cannot become intelligent at some point in the future, but the current models are really not going in that direction. They reflect successful progress at the practical level, but nothing more."
Katzir, a theoretical and computational linguist who heads the Laboratory for Computational Linguistics at Tel Aviv University, finds himself these days, alongside a handful of other linguists, fighting a rearguard battle. As engineers build models like ChatGPT, Meta's Galactica or Google's Bard, he and his colleagues are forced to fend off mounting claims that what we thought about language, intelligence or how language is acquired is wrong. "Even after these models have read the entire Internet, huge amounts of information that are hard to imagine," Katzir explains, "they still fail to grasp simple aspects of syntax that children master after a very short time."
The models, called large language models (LLMs), are statistical tools for predicting words in a sequence. That is, after training on large data sets, the models make sophisticated guesses about which word is likely to follow which (and which sentence to follow which). Today's models are particularly large and have been trained on data from diverse sources such as Wikipedia, collections of books and raw web page scans. This intensive training allows the model, given a certain input text, to assign probabilities to the word that will come next. But although the model knows how to build convincing combinations, its output is not grounded in the world and does not intentionally communicate any idea about it. Still, at least according to OpenAI and others, these models show a glimmer of understanding or reasoning, a resemblance to human intelligence. The chatbots are already credited with "creativity", which would apparently redefine human cognition, and at Google a programmer was fired after declaring that the language model developed at the company was "conscious" and trying to hire a lawyer to represent its interests. As mentioned, Microsoft's researchers wrote that "GPT-4 achieved a form of general intelligence [as] evidenced by its core mental abilities (such as reasoning, creativity, and deduction)."
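To make the idea of word prediction from statistics concrete, here is a minimal toy sketch (my illustration, not anything from Katzir's lab or OpenAI): a bigram counter that predicts the next word purely from how often words followed one another in its training text. Real LLMs do this at vast scale with neural networks rather than raw counts, but the underlying task is the same.

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequently observed next word, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

# A deliberately tiny "training set"
corpus = [
    "the duck ate corn",
    "the duck quacked loudly",
    "a duck ate bread",
]
model = train_bigram(corpus)
print(predict_next(model, "duck"))  # prints "ate" (seen twice vs "quacked" once)
```

The point of the toy is Katzir's: nothing here understands ducks or corn; the program only tallies which strings tend to follow which.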
However, the idea that we are moving toward "artificial general intelligence" (AGI) is controversial, especially among linguists, who point out that grammatical fluency, even at a high level, is still a long way from a machine that can think. "The models are very good at training on large corpora," Katzir continues. "Each new generation of models gets more data to train on than the previous one, and more artificial neurons, and they are very good at collecting statistics on observations of words. They don't really understand what these sequences say and don't know what they are talking about, but they are very good at gathering this information, building statistics of what texts look like, and then throwing all kinds of sequences back at us in a way that seems very convincing, because it closely resembles things they have seen many times in many places. But the model doesn't understand anything, and it doesn't even try to understand. It's an engineering tool with a practical purpose, like auto-completion. We humans, on the other hand, organize the information we receive in a way that allows us to formulate interesting and new things, often in ways that really don't match the existing statistics, and we usually know what we are talking about."
And if they learn more? Will they be able to master the language like we can?
"We humans come prepared for this task in advance, innately. We have a certain neural basis that allows us to organize the information, and we have learning methods that allow us to learn well and to generalize in systematic ways from little information. This is how children come to master language within the first few years of life. These large engineering models are something very different. They train on the equivalent of thousands of years of life and still do not come close to the level of knowledge and understanding that children have. For example, every child who grew up in a Hebrew-speaking environment knows that the sentence 'The boy Dina saw yesterday and Yossi will meet tomorrow is Danny' is a good sentence in Hebrew, while 'The boy Dina saw him yesterday and Yossi will meet tomorrow is Danny' is a bad one. But when we tested a whole collection of current models, some of them trained on corpora orders of magnitude larger than what children hear, the models preferred the second, worse sentence.
"It's not that the models can't learn the distinction between the two sentences given enough information. To the best of our knowledge they can. But the fact that they don't reach this distinction given the amount of information children receive shows that the models are very different from us. This is just one example, of course; linguistic research provides many more of the same kind. For all their engineering sophistication, the models are simply too far from human cognition. They come to the learning task with a representational system different from ours and an inference ability that is both more limited and different from ours, and the results reflect this, even after all the intense training they go through. Their abilities may be good enough for an engineering tool, but one shouldn't get confused and think that this is something that models us."
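The minimal-pair comparison Katzir describes, checking whether a model assigns higher probability to the grammatical sentence than to a minimally different bad one, can be sketched in miniature. Everything below is an illustrative toy, not the lab's actual evaluation setup: a smoothed bigram model stands in for an LLM, and invented English word-order pairs stand in for the Hebrew examples.

```python
import math
from collections import Counter, defaultdict

def train(corpus):
    """Build bigram counts and a vocabulary from a list of sentences."""
    bigrams = defaultdict(Counter)
    vocab = set()
    for sentence in corpus:
        words = ["<s>"] + sentence.split()  # <s> marks sentence start
        vocab.update(words)
        for prev, nxt in zip(words, words[1:]):
            bigrams[prev][nxt] += 1
    return bigrams, vocab

def log_prob(bigrams, vocab, sentence):
    """Sum of add-one-smoothed log P(word | previous word)."""
    words = ["<s>"] + sentence.split()
    total = 0.0
    for prev, nxt in zip(words, words[1:]):
        count = bigrams[prev][nxt]
        denom = sum(bigrams[prev].values()) + len(vocab)
        total += math.log((count + 1) / denom)
    return total

corpus = ["the boy saw the girl", "the girl saw the boy"]
bigrams, vocab = train(corpus)
good = "the boy saw the girl"       # grammatical word order
bad = "boy the saw girl the"        # scrambled, ungrammatical order
print(log_prob(bigrams, vocab, good) > log_prob(bigrams, vocab, bad))  # True
```

The toy "prefers" the good sentence only because its word pairs were seen before; Katzir's point is that on the subtle syntactic contrasts children command, real models trained on far more data than a child hears can still prefer the bad member of the pair.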
We have seen that bots have started to be used to write books and some fear that AI is becoming so powerful that it poses an existential threat to the human race.
"I think that books written with ChatGPT will simply be bad books, because the model doesn't really understand and isn't really original. Occasionally it might write seminar papers that at a superficial glance look serious, but that's fine; that's not what worries me. It will only force instructors to find better ways to assess students' knowledge and understanding. My concerns are different. One problem that is talked about a lot is that these models are environmentally damaging: their carbon footprint is very large. Also, because they are trained on existing texts, they reproduce stereotypes and other problematic views. Personally, I find it particularly worrying that a tool of this kind makes it possible to spread false information on an indescribable scale. Anyone who wants to can use it to flood social networks, Wikipedia and journals with false or biased information and with biased or simply confusing arguments. This can impair people's ability to understand reality and make informed decisions, and the result will damage society and harm the ability of democracies to exist.
"The danger right now is not that the model itself is intelligent and will want to do harm. That may happen in the future, but the existing models are not intelligent and do not want anything. The danger is that the model will be a useful tool for people who want to do harm. Such a tool requires regulation. Just as there is regulation governing which viruses may be engineered in laboratories, regulation is needed for these tools as well."

According to Katzir, the blurring of what these models do with human intelligence, whether intentional or not, reflects a conceptual confusion. To clarify the point, Katzir brings us back for a moment to Vaucanson's world. "Science and engineering are two different things. In most fields this doesn't confuse us. For example, we can do science and study how birds fly, and we can do engineering and build flying machines. Most of us know these are two different things. But when it comes to language, many people get confused, even though the distinction is the same. Linguistics, as a cognitive science, studies the mechanism humans have in their heads that is responsible for the human linguistic ability. That's like studying birds. Engineering builds models like ChatGPT. That's like building flying machines. It can be very useful to build airplanes, but that is something different, a completely separate project from understanding how language works in humans.
"Nobody thinks that because they built a better plane they have solved the question of how birds fly. Engineering is not an attempt to understand how the world works. It's an attempt to build a useful tool. In the case of the models like ChatGPT it's a tool that can help us write emails and lines of code faster. People admire ChatGPT because it's a significant improvement over previous engineering tools we've known. The texts it completes really look very good. I think there's also some pleasure in imagining that this thing is intelligent, though that's a mistake of course. Anyway, just as an airplane doesn’t explain birds to us, ChatGPT does not explain to us how human linguistic ability works."