GPT-4 is here: lots more monkeys, still no Shakespeare

Image by Tawng | Bigstockphoto

OpenAI has launched GPT-4 upon the world. But while the new model records steady improvements in performance, OpenAI refuses to disclose how many parameters it has or how much compute it took to create, raising the possibility that this game is becoming so expensive to play that it will never be commercially viable. And we are still no closer to general AI than we were with GPT-3.

GPT-4 is the latest version of OpenAI’s generative foundation model. It differs from GPT-3 in that it is larger (possibly 100 trillion parameters) and that humans intervened in its training.

100 trillion (if correct) is a staggering increase in size – it’s 571x the size of GPT-3, in fact – and frankly, I am surprised that OpenAI was able to find that much data in existence. It also means GPT-4 will be vastly more expensive to build and to run, requiring many more processors to execute it, more memory to store it and far more electricity to power it.
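For context, GPT-3 was published with 175 billion parameters, so the arithmetic behind that multiple (assuming the 100 trillion rumour is right) is simply 100 trillion ÷ 175 billion ≈ 571.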

OpenAI is refusing to disclose anything regarding the architecture, size, hardware, training compute, dataset construction or training method. It says it is doing this for competitive reasons (see here, section 2, first paragraph), and there is an argument for that, given the degree to which Microsoft has panicked Google. In any case, I suspect the resources consumed to create, train and run GPT-4 were vastly greater than for GPT-3.

GPT-4 is demonstrably better than GPT-3 at taking standardized tests like the GRE, the SAT and the legal bar exam, going from scoring in the 10th percentile to the 90th on the bar exam. But the system still has a tendency to hallucinate.

GPT-4 has the same limitations as GPT-3

In fact, GPT-4 has all of the same limitations that are inherent in GPT-3. Put simply, when it comes to making GPT more aware and giving it a more causal understanding of the world, there has been no progress at all.

This matters because the lack of causal understanding is the central limitation of all systems based on deep learning: these systems reason by statistical correlation, not by causal understanding. This is what leads to the errors, irrationality, craziness and hallucinations that many people have reported. Unless something fundamentally changes in how these models are built, these problems will persist.

GPT-4 is also unusual in that it had human intervention during its creation. This was done in an attempt to pre-empt bad actors who try to entice the system into saying socially unacceptable things.

The problem here is that GPT-4 has now had bias injected into it, as views on what is socially acceptable vary widely. So, ironically, the very bias that OpenAI claims to be trying to eradicate is now part of the system by design.

GPT-4 has also been trained with vision and can describe photographs, as well as what is odd or strange about them. This represents a sort of generalization: language and vision are usually handled by two different types of neural network, but here OpenAI claims to have managed both with just one.

Whether GPT-4 can perform as well as generative AIs that were designed specifically for generating or recognizing images remains to be seen. Midjourney, for example, which was built for images from the ground up, is much better than DALL-E, which is based on the large language model GPT-3.

We are no closer to general AI

The net result is that I think GPT-4 is a triumph of effort over finesse, and it reinforces my view that OpenAI’s philosophical approach to AI remains that, with infinite data and infinite compute power, general AI will magically appear.

I see this as an instance of the infinite monkey theorem, which states that if you have enough monkeys, typewriters and time, they will eventually produce the works of William Shakespeare. Unfortunately, despite a massive increase in the number of monkeys, there is no sign of any of the famous plays.

This also raises the likelihood that we are fast reaching the point of diminishing returns in deep learning, where more and more effort is required to produce smaller and smaller improvements.

The improvement of GPT-4 over GPT-3 is smaller than that of GPT-3 over GPT-2, yet it probably took a much greater increase in resources to produce. Furthermore, there is nothing in GPT-4 that leads me to think that general AI is any closer, leaving the industry in need of other techniques to solve some of the harder problems of AI, like autonomous driving.

Here, I continue to think that a combination of rules-based software and small, task-specific neural networks, each carrying out one simple job, is how these problems will be solved in practice. There has been some progress on this front, but the approach remains some way from practical, commercial application.
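To make the shape of that idea concrete, here is a minimal sketch, entirely illustrative and not drawn from any real driving stack, of how deterministic rules might sit on top of small task-specific networks. All of the names, feature sizes and classes below are assumptions invented for the example.

# A minimal sketch of the hybrid approach: small, task-specific neural networks
# handle narrow perception tasks, while plain rules-based code makes the final
# decision. Everything here is illustrative, not a production system.

import torch
import torch.nn as nn


class SmallClassifier(nn.Module):
    """A deliberately tiny network for one narrow task (e.g. 'is there a pedestrian?')."""

    def __init__(self, n_features: int, n_classes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 32),
            nn.ReLU(),
            nn.Linear(32, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


# One small model per sub-task, each trained (in practice) on its own narrow dataset.
pedestrian_detector = SmallClassifier(n_features=128, n_classes=2)   # pedestrian / none
traffic_light_reader = SmallClassifier(n_features=128, n_classes=3)  # red / amber / green


def decide(features: torch.Tensor) -> str:
    """Rules-based layer: deterministic, auditable logic sits on top of the nets."""
    with torch.no_grad():
        pedestrian = pedestrian_detector(features).argmax().item() == 1
        light = ["red", "amber", "green"][traffic_light_reader(features).argmax().item()]

    # The rules, not a neural network, own the safety-critical decision.
    if pedestrian or light == "red":
        return "brake"
    if light == "amber":
        return "slow"
    return "proceed"


if __name__ == "__main__":
    # Dummy feature vector standing in for the output of an upstream perception stage.
    print(decide(torch.randn(128)))

The point of the arrangement is that the safety-critical decision lives in auditable rules, while each network does only one narrow, testable job.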

In the meantime, OpenAI and others will continue to fuel the hype and expectations around general AI until the point at which those expectations are not met. This will result in disappointment, disillusionment, falling investment and lower valuations, just as it has on three separate occasions in the last 70 years. In short, the fourth AI winter.
