My Favourite Photo of Noam Chomsky and Where He Stands in the Big AI Debate on the Road to AGI
Noam Chomsky at a Deutsche Welle conference in Germany, where he spoke about "Reanimating Democracy". Image © Fazila Farouk

My favourite photograph of Noam Chomsky is actually one that I took of him. It's just over a decade since the photo was taken, and how I managed to get it is quite a funny story. Chomsky was the keynote speaker at a conference in Germany, which I was lucky enough to attend. He spoke to what can best be described as a packed auditorium filled with the excited energy of an eager audience. Before getting onto the podium, Chomsky was guided to a seat in the front row, where a large crowd gathered. This made it difficult for me to get a clear view when I approached for a photo. But I noticed an opening between people's legs and, in an impulsive moment, got onto all fours, crawled through and managed to get in front of the crowd. It was an incredible moment for me. I could tell that Chomsky was surprised to see me pop up like a meerkat out of nowhere. But he instantly gave me a smile, and, seizing the opportunity, I snapped my photo and crawled back the way I came in.
I have always regretted not stretching out my arm for a handshake. But such was the experience: in that adrenalin-fuelled moment, I wasn't really thinking beyond my photo-snatching mission. It was only once I got back to my seat that I looked at my phone and realized I'd managed to get a half-decent image of this great public intellectual and scholar activist.
Years later, my memory of his speech on "reanimating democracy" has faded, but not the sharpness of his critique. In this respect, Chomsky has long been a lodestar for dissident thought, blending scientific rigour with unflinching moral clarity.
So, in early 2024, when I didn't hear any commentary from him about the horrific genocide unfolding in Gaza, I wondered why. By June 2024, the reason for his uncharacteristic silence became clear when media reports surfaced that he had suffered a debilitating stroke that robbed him of his speech. I received the news with sadness. The loss of his critical voice leaves a gaping hole in public discourse.
The Power of LLMs, AGI and AI Superintelligence
I'm a digital economy researcher who studies the political economy of artificial intelligence (AI) with a particular focus on the intersection of innovation and inequality. Chomsky, who made a name for himself as a scholar studying linguistics, has been on my mind ever since the introduction of ChatGPT, large language models (LLMs), and the associated hype about the imminent emergence of artificial general intelligence (AGI). An important aspect of the road to AGI is the debate on whether or not current AI systems are capable of reasoning, which this essay explores using Chomsky as an entry point.
First, however, a primer: Generative AI (GenAI) creates text, images, or code using machine learning. LLMs, a subset of GenAI, specialize in text generation. They’re powered by artificial neural networks—computational systems loosely inspired by the human brain—that detect patterns in vast datasets.
The public was introduced to LLMs when OpenAI launched ChatGPT in November 2022. Since then, we've witnessed the introduction of competing models, including Claude from Anthropic, Gemini from Google, and the latest major disruptor in the field, DeepSeek, from the Chinese hedge fund firm High-Flyer.
LLMs have proven to be a breakthrough innovation that helps people think on their feet. As a result, we now find ourselves living in a world where humans have access to handy AI assistants doing the heavy lifting on complex knowledge-based tasks, not just introducing efficiencies into our lives but also making us look smarter. The caveat, of course, is that we've also entered an era where the efficiencies introduced by AI assistants are reducing the amount of human labour required to complete knowledge-based tasks, leading to renewed concerns about the threat of AI cannibalizing an increasing array of jobs that sustain human livelihoods.
The latest wave of automation anxiety is heightened not just by the intelligent responses of LLMs but also by their seeming sentience, resulting in urgent questions about how intelligent they actually are and whether or not they have the ability to become aware. As a result, questions about AI's intelligence are often conflated with concerns about AI achieving consciousness. This issue is further entangled with anxiety about the existential threat AI poses to humanity.
At the heart of this concern is an inescapable comparison between human and artificial intelligence. And, in this regard, there's already much talk about the next big AI development being artificial general intelligence. AGI is described as an AI system that will have the general cognitive abilities of human intelligence. Some researchers believe that AGI could be developed as early as the 2030s, if not sooner.
What's interesting about AGI is that there isn't much debate about whether or not it will be achieved. Rather, developments in AI are moving at such a pace that there seems to be broad consensus that an AGI breakthrough is likely in the near future. For example, revealing that one of the objectives of his company is the development of AGI, the CEO of DeepSeek, Liang Wenfeng, is confident that the technology will be achieved in our lifetime.
Meanwhile, the emergence of AGI is linked to speculation about it evolving into an AI superintelligence (ASI). ASI refers to an even more advanced intelligence that surpasses the cognitive abilities of the best human minds in virtually every field, including creativity, problem-solving and decision-making. As noted, its emergence is currently merely in the realm of speculation. But some AI researchers are indeed pondering whether we've finally sown the seeds of our own destruction by producing a technology that could potentially outsmart us…and decide that it doesn't need us anymore. Surmising that ASI will not need human supervision to emerge, science writer Anil Ananthaswamy contends, "The revolution will be self-supervised".
While we haven't quite entered the era of AGI and hopefully may never have to face off against ASI, these concerns have led to renewed interest in cognitive science and the philosophy of consciousness. This is where Chomsky enters the picture as a leading scholar in linguistics who's made a significant contribution to the field of cognitive science. His views have primarily been sought as a linguist who studies the biological basis of language, with his work attracting renewed attention because the technology underpinning LLMs is inspired by the human brain.
Chomsky's Critique of LLMs
In November 2022, coinciding with the launch of ChatGPT, Chomsky participated in a panel discussion at Web Summit, an important annual technology conference. In a discussion about the intelligence of ChatGPT, Chomsky argued that LLMs do not understand language. In fact, he was somewhat scathing of ChatGPT, arguing that it had the intelligence of a two-year-old. Chomsky reiterated his views on LLMs in a co-authored New York Times op-ed that was published in March 2023.
One needs to understand the technology under the hood to make sense of Chomsky's critique of LLMs. ChatGPT is built on a machine learning model that simply predicts the next word, drawing on patterns it has memorized and learned to copy during its training phase. In this regard, Gary Marcus, a psychologist and cognitive scientist who shared the Web Summit platform with Chomsky and holds similar views, argued that ChatGPT is "autocomplete on steroids". Chomsky is also known to have referred to ChatGPT as "high-tech plagiarism".
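To make the "autocomplete on steroids" point concrete, here is a minimal sketch of next-word prediction, assuming a toy bigram counter rather than anything resembling a real transformer-based LLM; the corpus and function names are invented purely for illustration.

```python
from collections import Counter, defaultdict

# A toy corpus standing in for the web-scale text an LLM is trained on.
corpus = (
    "the cat sat on the mat . "
    "the dog sat on the rug . "
    "the cat chased the dog ."
).split()

# Count how often each word follows each other word (a simple bigram model).
counts = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    counts[current][following] += 1

def predict_next(word):
    """Return the most frequent continuation observed during 'training'."""
    followers = counts.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(predict_next("the"))  # 'cat' -- purely a statistical continuation
print(predict_next("sat"))  # 'on'
```

Real LLMs replace the raw counts with billions of learned parameters, but the underlying training objective, predicting the next token from previously observed patterns, is the same.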
Another concern Chomsky highlighted at the summit is that LLMs such as ChatGPT are simply too strong, which points to their overconfidence problem: these AI systems can be very convincing even when the information they provide is unreliable. In this respect, LLMs are excellent at generating fluent, coherent text, but they are unreliable as standalone truth-tellers. As one Microsoft data scientist put it, LLMs learn "only to generate text, not factually true text".
Overall, Chomsky's critique of LLMs is that they have a shallow understanding of language compared to humans, which hinders their ability to reason. The thrust of his argument is that while these models can generate text and mimic language patterns, they lack the cognitive models and abstract reasoning that humans possess. At the heart of his critique is the distinction between syntactic and semantic language, i.e., the structure of language versus its meaning.
Chomsky's critique is grounded in the design of LLMs themselves. LLMs do indeed process language in a way that highlights the distinction between structure (syntax) and meaning (semantics). They focus on structure rather than true meaning: they are trained to predict the next word in a sequence by analysing massive datasets of human text, a task at which they excel.
In fact, LLMs are excellent at recognizing and reproducing patterns in the structure and grammar of language, sometimes even finding patterns that humans miss. In this regard, referring to them as "correlation machines," Ananthaswamy, author of a fascinating book on the mathematics behind machine learning models, argues, "Machine learning models are essentially very, very, very sophisticated pattern matching machines." His view on machine learning models, while positively framed, is not dissimilar to Marcus' critique that LLMs are autocomplete on steroids. Overall, however, this pattern matching thesis supports Chomsky's assertion that LLMs are not engaging in abstract reasoning.
Mapping Chomsky's Theory of Universal Grammar onto His Critique of LLMs
To make sense of Chomsky's views, we must draw on his theory of universal grammar, a major contribution to cognitive science that was codified more than half a century ago. In line with the biolinguistics tradition, Chomsky believes that humans are born with the innate ability to learn language because our brains are hardwired for it. Building on this core biolinguistics theory, Chomsky's contribution to the field is that all humans also share a common basic structure for language (a syntactic structure) that is innate. In other words, beyond just an innate ability to learn language, Chomsky argues that all humans also share the same internal language blueprint.
According to this view, humans are born with a set of principles governing how words and sentences are organised—what Chomsky refers to as a "universal grammar". This means that even if the ordering of words is different, all languages have ways to combine subjects, verbs and objects. So, whether one speaks Mandarin, English or anything else, the fundamental way that our brains process and structure language is the same.
It is this instinctive capacity for language that engenders meaning for humans, because semantics develops as a secondary layer on top of this universal grammar: syntax organises language, while meaning fits into the framework provided by the syntactic structure. In this regard, Chomsky's syntax-centric approach emphasizes that meaning emerges from structured relationships between words.
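As a rough illustration of syntax without semantics, consider the following sketch. The toy grammar and vocabulary are my own illustrative choices, nodding to Chomsky's famous "colorless green ideas sleep furiously": every sentence it produces is structurally well-formed, yet none of it means anything, which is precisely the gap structure alone cannot close.

```python
import random

# A toy grammar in which every sentence follows the same structural template,
# echoing the idea of a shared syntactic blueprint. Rules are illustrative only.
grammar = {
    "S":   [["NP", "VP"]],
    "NP":  [["Adj", "Adj", "N"]],
    "VP":  [["V", "Adv"]],
    "Adj": [["colorless"], ["green"], ["furious"]],
    "N":   [["ideas"], ["dreams"]],
    "V":   [["sleep"], ["argue"]],
    "Adv": [["furiously"], ["quietly"]],
}

def generate(symbol="S"):
    """Expand a symbol into words by recursively applying grammar rules."""
    if symbol not in grammar:          # terminal word
        return [symbol]
    expansion = random.choice(grammar[symbol])
    return [word for part in expansion for word in generate(part)]

# Every output is syntactically well-formed, yet none of it "means" anything.
print(" ".join(generate()))  # e.g. "colorless green ideas sleep furiously"
```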
However, humans also possess another quality that machine learning models lack. This is a uniquely human internal model of the world that facilitates the construction of meaning. Thus, the nub of Chomsky's argument is that LLMs fail to understand language properly, because they have merely figured out the statistical probabilities of the structure of language without the innate capacity for language that humans possess, which engenders meaning. From this vantage point, LLMs simply do not understand the meaning of the text that they generate.
The Human Mind as a Computer: Symbolism vs Connectionism
Chomsky's main critique of LLMs is that they make no significant contribution to cognitive science because LLMs tell us nothing about how the human mind actually works. In this regard, the idea that the brain performs computations similar to AI is a widely debated topic in neuroscience, cognitive science and AI research that goes back decades.
However, the computational theory of mind—the idea that the mind functions like a computer—does resonate with Chomskian linguistics and his theory of a syntactic structure in the brain performing some kind of algorithmic computation. What these computations are remains an unsolved mystery, and in this regard the central debate in cognitive science and AI is whether human intelligence and reasoning align more with symbolic AI (rule-based systems) or connectionism (neural networks). In other words, is the mind a symbolic processor or a neural network?
Symbolic AI relies on explicit rules, symbols and logic to represent knowledge (for example, if X then Y). It is mainly influenced by mathematics, logic and psychology, and is associated with institutions like MIT and Stanford University and with researchers like Chomsky, Marcus and Steven Pinker, the well-known cognitive psychologist who is an advocate of the computational theory of mind and whose work is influenced by Chomskian linguistics.
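A minimal sketch of this "if X then Y" style of reasoning might look as follows; the facts, the rules and the forward_chain helper are hypothetical, invented here purely to show how explicit rules derive new knowledge.

```python
# Explicit facts and "if X then Y" rules, in the spirit of classical symbolic AI.
facts = {"socrates_is_human"}
rules = [
    ({"socrates_is_human"}, "socrates_is_mortal"),   # all humans are mortal
    ({"socrates_is_mortal"}, "socrates_will_die"),
]

def forward_chain(facts, rules):
    """Keep applying rules whose conditions hold until nothing new follows."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived

print(forward_chain(facts, rules))
# {'socrates_is_human', 'socrates_is_mortal', 'socrates_will_die'}
```

Every inference step here can be traced back to an explicit rule, which is exactly the transparency symbolic AI is prized for.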
Connectionist systems learn from data, forming distributed representations through statistical patterns. In connectionism, there are no explicit rules programmed, and knowledge emerges from training. Connectionism is considered a "black box" because its decisions emerge from complex interactions between millions of artificial neurons across layered networks. This makes it difficult to trace how or why specific inputs lead to outputs, prompting critics like Chomsky to argue that LLMs make no significant contribution to cognitive science. In this respect, Chomsky insists that language requires innate, biologically endowed structures (universal grammar), while dismissing connectionism as pre-scientific.
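By contrast, a connectionist system is given no rules at all. The sketch below, assuming a single artificial neuron trained with the classic perceptron rule, learns the logical AND relationship purely by adjusting its weights in response to errors; the data and learning rate are illustrative choices.

```python
# Training data for logical AND: the relationship is never stated as a rule;
# the neuron has to discover it from examples alone.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

w = [0.0, 0.0]   # two weights, one per input
b = 0.0          # bias
lr = 0.1         # learning rate

def predict(x):
    """Fire (output 1) if the weighted sum of the inputs exceeds the threshold."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

# Perceptron learning rule: nudge the weights whenever a prediction is wrong.
for _ in range(50):
    for x, target in data:
        error = target - predict(x)
        w[0] += lr * error * x[0]
        w[1] += lr * error * x[1]
        b += lr * error

print([predict(x) for x, _ in data])  # [0, 0, 0, 1] -- learned, not programmed
```

The learned "knowledge" lives only in the numeric values of the weights, which is also why scaling this idea up to billions of weights produces the black-box opacity described above.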
Refuting Chomsky's Views on LLMs
The world's leading connectionist, Geoffrey Hinton, winner of the 2024 Nobel Prize in Physics for his pioneering work in AI, specifically in artificial neural networks, advances an argument that refutes Chomsky's critique, viz., that LLMs do not understand language and make no significant contribution to cognitive science.
Hinton, also known as the godfather of AI, argues that as AI has evolved over the years, it has become increasingly evident that there are parallels between the brain and artificial neural networks.
For Hinton, whether or not artificial neural networks and language models can understand what they generate depends on "where you set the bar for what constitutes a semantic understanding of language". In this respect, he completely disagrees with the Chomskian view on LLMs, arguing that these AI systems do understand what they are saying. In a February 2024 lecture about whether digital intelligence will replace biological intelligence, Hinton unequivocally stated, "I'm making the very strong claim these things really do understand".
Hinton specifically addresses the "just pattern matching" criticism by clarifying how neural networks work. He explains that modern neural networks convert words into rich feature representations, which capture complex aspects of meaning. According to Hinton, it is the interactions between features that allow for genuine reasoning and understanding. In his lecture, he demonstrates how understanding comes from converting words into features that interact with each other in complex ways to generate meaningful responses.
Features are learned representations of meaning that neural networks discover. LLMs use "millions of features and billions of interactions between features". However, Hinton drew on a simple 1985 model to explain how features interact to generate meaning. In this old model, words were converted into vectors of just six features. Despite its limited scope, this model learned useful features such as different nationalities (English vs Italian) and different generations (parent and child), which were not pre-programmed, but emerged from the learning process.
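A rough sketch of what this means can be given with hand-written feature vectors. To be clear, in the 1985-style model and in modern LLMs these features are learned rather than authored by hand, and the names and feature dimensions below are my own illustrative stand-ins for the nationality and generation features mentioned above.

```python
import numpy as np

# Feature dimensions (illustrative): [English, Italian, parent, child]
people = {
    "Christopher": np.array([1, 0, 1, 0]),   # English parent
    "Colin":       np.array([1, 0, 0, 1]),   # English child
    "Roberto":     np.array([0, 1, 1, 0]),   # Italian parent
    "Angelo":      np.array([0, 1, 0, 1]),   # Italian child
}

def most_similar(vector, exclude=()):
    """Return the person whose feature vector best matches the query vector."""
    candidates = {k: v for k, v in people.items() if k not in exclude}
    return max(candidates, key=lambda k: float(vector @ candidates[k]))

# Relational query: "the Italian counterpart of Colin".
# Swapping the nationality features while keeping the generation features
# lets the features interact to answer a question never stored explicitly.
query = people["Colin"] + (people["Roberto"] - people["Christopher"])
print(most_similar(query, exclude={"Colin"}))  # Angelo
```

The answer "Angelo" is not retrieved from memory; it falls out of the interactions between features, which is the kind of generalization Hinton treats as evidence of understanding.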
For Hinton, the ability to generate plausible new combinations is evidence of understanding in LLMs. From this vantage point, interactions between features allow the model to generate new knowledge rather than just retrieve stored patterns. Thus, for Hinton, LLMs are not just storing sequences, but learning generalizable knowledge because features aren't just statistical patterns, but a rich representational system that can support genuine understanding and reasoning. This directly challenges both Chomsky and Marcus' views by showing that neural networks aren't just doing superficial pattern matching.
Also, in contrast to Chomsky, Hinton views the mechanism that LLMs use for understanding as similar to human understanding. He argues that the way LLMs understand is neither different from human understanding nor a lesser form of processing. In fact, he contends that LLMs' way of understanding—through features and their interactions—is currently our best model of how understanding works in the human mind. The basis of his argument is that the distinction between "real" understanding and "just pattern matching" is false, because both human and machine understanding involve feature-based processing.
However, to be fair to Chomsky, whose critique specifically targets higher-order abstract reasoning, the connectionist model of thinking is in fact associated with lower-order, largely subconscious processing in the human brain, such as perception and motor control.
Symbolism vs Connectionism and the Road to AGI
We need to bear in mind that Chomsky's analysis is two years old and was made at a time when we'd only just been introduced to ChatGPT, then powered by OpenAI's GPT-3.5. Since then, OpenAI and its competitors have released significantly more advanced versions, including reasoning models. Of course, the fact that we're witnessing the release of reasoning models does cast doubt on Chomsky's assertion that LLMs lack the ability to reason. However, while his ideas continue to inform the symbolic AI approach, and while his criticisms of connectionism have shaped many of the arguments in favour of symbolic reasoning with respect to AI technologies, Chomsky is not and has not been directly involved in the formal debate between symbolism and connectionism in AI. His primary focus has always been linguistics and cognitive science. In this respect, his views are rooted in classical symbolic AI, so-called "good old-fashioned AI" (GOFAI), which was more focused on unravelling the mysteries of the mind.
Meanwhile, the debate between symbolism and connectionism, though historically focused on the human mind, has entered a new domain in recent years: it has been transposed onto the big debate of our time about technological advances in AI. In this regard, the debate now centres on which approach is best suited to achieving true artificial intelligence, or AGI. To this end, there's been a longstanding public debate between Hinton (connectionism) and Marcus (symbolic AI).
However, Pedro Domingos, emeritus professor of computer science and engineering at the University of Washington, contends that the binary standoff between symbolism and connectionism is unproductive. He posits that AGI will not emerge from a single AI paradigm but through the synthesis of multiple approaches, particularly the integration of symbolic AI with connectionism and probabilistic methods. In this regard, he argues that AGI progress hinges on paradigm collaboration, not dominance. Indeed, he argues that LLMs are increasingly augmented with symbolic tools to enhance reliability. For example, Domingos notes that while companies like OpenAI may not explicitly call their techniques "symbolic AI", they are incorporating symbolic reasoning methods into their LLMs to improve performance on specific tasks.
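What this kind of augmentation might look like in miniature is sketched below, assuming a mock "model" and a hand-rolled routing function; neither is how OpenAI or any other lab actually implements tool use. The idea is simply that fluent-but-unreliable generation is paired with an exact symbolic calculator for arithmetic.

```python
import ast
import operator
import re

def mock_llm(prompt):
    """Stand-in for a language model: fluent, but not guaranteed to be factual."""
    return "A fluent, plausible-sounding answer (not guaranteed to be exact)."

# The symbolic tool: a tiny, exact arithmetic evaluator.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def calculator(expression):
    def ev(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return ev(ast.parse(expression, mode="eval").body)

def hybrid_answer(prompt):
    """Route arithmetic to the symbolic tool; fall back to the model otherwise."""
    match = re.search(r"[\d+\-*/. ()]+\d", prompt)
    if "calculate" in prompt.lower() and match:
        return f"Exact answer: {calculator(match.group().strip())}"
    return mock_llm(prompt)

print(hybrid_answer("Please calculate 123 * 456"))   # Exact answer: 56088
print(hybrid_answer("Who is Noam Chomsky?"))         # delegated to the model
```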
In this regard, Domingos envisions AGI not as a sudden breakthrough but as an evolutionary synthesis of existing paradigms. By combining symbolic AI's reasoning, neural networks' adaptability, and probabilistic methods' handling of uncertainty, AGI will gradually achieve human-like versatility. Current systems like AlphaGo, GPT-4, and advanced robotics exemplify early steps toward this synthesis, but true AGI will require deeper, more seamless integration—a challenge Domingos believes the field is steadily addressing.
Thus, the future of AGI will likely involve hybrid systems that combine the strengths of both approaches, allowing for systems that can perceive, reason, and adapt in ways that are more similar to human intelligence. This synergy between symbolic and connectionist approaches is seen as a promising direction for building AGI, as it will enable machines to perform a wider range of tasks with greater flexibility, reasoning ability and adaptability.
While the path to AGI may emerge through this synthesis of competing paradigms rather than through either approach alone, we find ourselves at a fascinating crossroads—one that was intellectually paved decades ago by pioneers like Chomsky. His theories of language and cognition didn't just revolutionize linguistics; they established foundational frameworks that continue to shape how we understand both human and artificial intelligence. As we venture further into this new technological frontier, Chomsky's intellectual legacy reminds us that the most profound questions about intelligence—whether human or artificial—are ultimately questions about the nature of the mind itself.