Sam Altman, the charismatic head of OpenAI, reportedly drew the ire of his board for withholding from it that an algorithm behind ChatGPT could do math. The rumor has not been confirmed, but it is an opportunity to remember that, to this day, all AIs stumble over math.
What if it all came down to math? Since November 22, the discipline has been at the heart of the psychodrama surrounding the on-again, off-again firing of Sam Altman, CEO of OpenAI, the company that developed the ChatGPT software.
The ousted boss was reinstated on Tuesday, November 21, after a standoff between staff and the board. The board had to back down, having dismissed "Mr. ChatGPT" the week before without giving a clear explanation.
The mysterious Q*
In fact, Sam Altman’s original firing was motivated in part by a letter from concerned employees, Reuters reported on Wednesday, November 22. They alerted board members to the prowess of an internal algorithm that had crossed a threshold they described as “disturbing”, within a secret project called Q*. What threshold? The model had reportedly managed to… solve basic math problems.
The existence of this letter, also mentioned by “The Information”, could not be officially confirmed, and Reuters admits it has not seen the famous document. Other media such as “The Verge” argue that these rumors of a math-doped ChatGPT carry little weight, and that the board says it never received a letter warning of the risks of the Q* project.
While the truth of the report remains unclear, it highlights an often-overlooked reality: ChatGPT, and AI in general, have so far been terrible at math. Worse, according to researchers at Stanford University and the University of California, Berkeley, the latest versions of OpenAI’s famous conversational bot struggle with certain mathematical problems more than earlier versions did.
This may seem surprising. Artificial intelligence, those calculating beasts, should in principle make short work of math. Yet several ChatGPT users have been struck by the tool’s limits since its launch in November 2022. “I asked it: ‘If five machines produce five objects in five minutes, how long will it take 100 machines to produce 100 objects?’ And it told me 100 minutes. That’s not true!” laments a netizen on an artificial intelligence forum. The answer to this problem, elementary for a mathematician, is five minutes: each machine makes one object every five minutes, so 100 machines turn out 100 objects in those same five minutes.
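For the record, the reasoning boils down to a per-machine production rate. Here is a minimal Python sketch of that calculation (the variable names are ours, purely for illustration):

```python
# Each machine produces 1 object in 5 minutes, i.e. 1/5 object per minute.
machines = 100
objects_needed = 100
rate_per_machine = 1 / 5  # objects per minute, per machine

# Time = work / total rate: 100 / (100 * 1/5) = 5 minutes.
time_minutes = objects_needed / (machines * rate_per_machine)
print(time_minutes)  # 5.0
```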
“Mathematics and computing are not exactly the same thing,” notes Vincent Corruble, a researcher at LIP 6 (Computer Science Laboratory in Paris 6). Just because a model like ChatGPT impresses with its ability to chat with ordinary people doesn’t mean it can also “understand” math like a human. “For example, it is easy for us to recognize a triangle in an image, while it is much more difficult for a machine,” explains Nicolas Sabouret, professor of computer science and specialist in artificial intelligence at Paris-Saclay University.
Mathematics is not (only) calculation
The main reason GPT (which stands for “Generative Pre-trained Transformer”) is allergic to math is that it wasn’t programmed to do it. “It was designed to generate language by choosing which word comes next in a given sentence,” explains Tom Lenaerts, professor at the Free University of Brussels and president of the Benelux Association for Artificial Intelligence.
In other words, ChatGPT and other large language models (LLMs) use their computing power to compute probabilities and decide which words to pick, based on their huge training corpus (in ChatGPT’s case, a vast crawl of the web), in order to construct a seemingly meaningful sentence.
This statistical approach also allows these bots to do mathematics the way Molière’s Monsieur Jourdain spoke prose: without knowing it. They have no problem answering that “2 + 2” equals 4. But that is only because, by delving into their Ali Baba’s cave of data and writings collected online, these algorithms discover that the most likely continuation of the sequence “2 + 2 =” is the character “4”.
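A toy sketch of this next-token logic, in Python. The probabilities below are invented for the example and do not come from any real model; they simply show that choosing “4” is a statistical bet, not arithmetic:

```python
# Toy next-token prediction: the "model" knows nothing about arithmetic,
# only which continuation was most frequent in its (here, made-up) data.
next_token_probs = {
    "2 + 2 =": {"4": 0.92, "5": 0.03, "22": 0.02, "four": 0.03},
}

def complete(prompt: str) -> str:
    """Return the statistically most likely continuation of the prompt."""
    candidates = next_token_probs[prompt]
    return max(candidates, key=candidates.get)

print(complete("2 + 2 ="))  # "4" -- statistics, not understanding
```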
This does not mean that these AIs “understand” the logic of this addition. “Mathematics is about learning to think, and artificial intelligence is just calculation. It is very difficult to reproduce logical reasoning with just calculation. At best, these machines will manage to imitate it,” explains Nicolas Sabouret.
And this is not a new problem. Logic Theorist, one of the very first artificial intelligence programs, developed in the 1950s, “was programmed to think like a mathematician,” points out Vincent Corruble. This model, which did not try to generate text but rather to reason like a human, “quickly showed its limits,” the LIP 6 expert adds.
Over time, mathematics has become one of the last bastions still resisting AI. Now that “artificial intelligence has succeeded in replacing humans in games such as chess and Go, mastery of mathematics seems to be one of the main hurdles left to overcome,” believes Vincent Corruble.
But “without significant new developments in the way large language models are designed, they won’t be able to do math because they don’t think,” says Tom Lenaerts.
The specter of artificial general intelligence
Hence the doubts expressed by the experts interviewed by France 24 about the rumors of a Q* capable of mathematical reasoning. “A language model is a machine that works on the basis of probability and does not take into account the notions of ‘true’ or ‘false’, which are central to mathematics,” explains Nicolas Sabouret.
If a large language model has indeed managed to cross this threshold, “it would be quite a theoretical breakthrough,” Andrew Rogoyski, an artificial intelligence expert at the University of Surrey, told “The Guardian”.
Reuters and “The Information”, which both report the letter’s existence, say its authors found Q*’s abilities “disturbing”. Indeed, the capacity to do math would bring the algorithm closer to “artificial general intelligence,” Reuters writes.
This concept of super-AI refers to “artificial intelligence that would be able to solve any problem,” explains Nicolas Sabouret. It also implies “comparing the machine with a human being,” adds Vincent Corruble. In other words, every time these algorithms encroach on what has so far been humanity’s exclusive territory, the specter of AGI looms a little larger.
In this sense, “knowing how to manipulate mathematical reasoning would really be a step towards artificial general intelligence,” affirms Tom Lenaerts. Enough to give some people cold sweats, especially within OpenAI. Sam Altman’s company defines this “superintelligence” as “an autonomous system with the ability to outperform humans in almost all economically useful tasks.”
But for Nicolas Sabouret, we are still a long way from that. Even if AI could count, “that would not give it more autonomy or make it more dangerous,” and it would therefore be no better placed to replace humans.