The Five A's of AI - Chapter 3
Intelligence Explosion: From AlphaGo to ChatGPT in Just Seven Years
How AI Went From Beating Games to Transforming Business Overnight
By Owen Tribe, author of "The Five A's of AI" and strategic technology adviser with 20+ years delivering technology solutions across a range of industries
Chapter Highlights
- AlphaGo's 2016 victory proved machines could master intuition-based tasks
- Deep learning breakthrough in 2012 enabled pattern recognition at scale
- ChatGPT reached 100 million users in 2 months - fastest adoption ever
- DeepSeek achieved GPT-4 performance at 95% less cost in 2024
- Understanding acceleration helps predict and prepare for change

Chapter 1 - The Dream of Thinking Machines (1830s-1970s)
Chapter 2 - Digital Revolution (1980s-2010)
Chapter 3 - Intelligence Explosion
Chapter 4 - AI Paralysis
Chapter 5 - The Five A's Framework
Chapter 6 - Automation Intelligence
Chapter 7 - Augmented Intelligence
Chapter 8 - Algorithmic Intelligence
Chapter 9 - Agentic Intelligence
Chapter 10 - Artificial Intelligence
Chapter 11 - Governance Across the Five A's
Chapter 12 - Strategic Implementation
Chapter 13 - Use Cases Across Industries
Chapter 14 - The Future of AI
Understanding the Intelligence Explosion
What Is The Intelligence Explosion?
The Intelligence Explosion describes the sudden, accelerating advancement in AI capabilities from 2012 onwards, transforming AI from research curiosity to business necessity in under a decade.
The Explosion Pattern
The acceleration shows clear phases:
- 2012-2016: Research breakthrough - deep learning works
- 2016-2019: Proof points - superhuman performance demonstrated
- 2019-2022: Commercialisation - business applications emerge
- 2022-2024: Democratisation - AI for everyone
- 2024+: Commoditisation - AI becomes infrastructure
Whilst You Watched
- AlphaGo - defeated world champions
- GPT evolved - from curiosity to capability
- Costs plummeted - by orders of magnitude
- Access democratised - from labs to laptops
- Industries transformed - from automation to intelligence
The Research: Explosion Dynamics
1. The AlphaGo Moment
March 2016: AI defeated Lee Sedol at Go, achieving a feat experts did not expect to see before 2030.
Translation: AlphaGo proved machines could handle tasks requiring intuition, creativity, and strategic thinking—not just calculation. This shattered assumptions about AI limitations.
2. The Acceleration Metrics
AI capability improvement rates:
| Metric | 2012 | 2016 | 2020 | 2024 | Improvement |
|---|---|---|---|---|---|
| Energy Efficiency | Baseline | 2x | 20x | 200x | 200x better |
| Model Size (parameters) | 10M | 1B | 175B | 1.7T | 170,000x larger |
| Inference Speed | 1x | 10x | 100x | 1,000x | 1,000x faster |
| Training Cost (GPT-level) | N/A | $100M | $10M | $500K | 200x cheaper |
| Language Understanding | 60% | 75% | 90% | 98% | 63% gain |
| Image Recognition Error | 25% | 5% | 1% | 0.1% | 250x lower |
3. The ChatGPT Phenomenon
November 2022 changed everything:
- 2 months - 100 million users (fastest ever)
- $0 - Cost to access advanced AI
- 175 billion - Parameters in GPT-3
- 90% - Businesses exploring AI within 12 months
- 67% - Executives paralysed by options
Chapter 3
How Big Data and Deep Learning Triggered the AI Revolution
The Four Seasons Hotel in Seoul thrummed with nervous energy on 9 March 2016. In a conference room transformed into a makeshift television studio, two figures sat opposite each other across a Go board. On one side, Lee Sedol, the South Korean master who had dominated the game for a decade, winner of 18 world titles. On the other, Aja Huang, a DeepMind team member placing stones on behalf of an artificial intelligence called AlphaGo.
The match was scheduled to run for five games between 9 and 15 March 2016, with each game starting at 13:00 KST (04:00 GMT). Over 200 million people worldwide would eventually watch as these games unfolded, witnessing something that experts had believed was still a decade away. The atmosphere was electric, charged with the sense that history was being made. Korean television networks had cleared their schedules. Commentators spoke in hushed tones, as if witnessing a sacred ritual.
Go had long stood as the Everest of artificial intelligence challenges. The game is a googol times more complex than chess, with an astonishing 10 to the power of 170 possible board configurations. That's more than the number of atoms in the known universe. Unlike chess, where brute force calculation had conquered human champions in 1997, Go required something more profound: intuition, pattern recognition, the ability to evaluate positions through feeling rather than calculation. In Asia, Go wasn't merely a game. It was philosophy made tangible, a mirror of the human mind itself.
Lee had initially predicted he would defeat AlphaGo in a "landslide". His confidence wasn't misplaced. Prior to 2015, the best Go programs had only managed to reach amateur dan level, struggling against even moderately skilled human players. Just five months earlier, in October 2015, AlphaGo had played European champion Fan Hui in a match behind closed doors at DeepMind's London offices. The AI had won 5-0, but Fan Hui was ranked 2-dan professional, respectable but far from elite. Lee Sedol played at 9-dan, the game's highest rank. The gap between them was like comparing a talented club footballer to Lionel Messi.
As the first game began, Lee appeared relaxed, even jovial. He chatted with officials, smiled for photographers, approached the board with the casual confidence of someone who had played thousands of games. But as the hours wore on, something shifted. AlphaGo wasn't playing like a computer. Its moves had an almost human quality: creative, unexpected, sometimes seemingly illogical until their purpose became clear moves later. Lee's expression grew increasingly serious. His famous confidence began to waver. By the time Lee resigned the first game after 186 moves, the atmosphere in the room had changed. This wasn't going to be the walkover everyone expected.
Then came game two, and with it, Move 37.
The placement of a single black stone on the fifth line, far from the edges where conventional Go wisdom suggested stones should be played in the opening. AlphaGo itself calculated that a human professional would have chosen the move only once in ten thousand games. In the commentary room, Michael Redmond, one of the few Western 9-dan professionals, fell silent. Fan Hui, now serving as an advisor to the DeepMind team, looked stunned. Even AlphaGo's creators hadn't anticipated this. The DeepMind team watching from London held their breath. Was this brilliance or a catastrophic bug?
The move violated centuries of accumulated human knowledge about Go. Professional players watching around the world thought it must be a mistake. Some assumed it was a bug in the program. Online forums erupted with debate. In Go schools across Korea and Japan, masters paused their lessons to discuss this unprecedented move. But as the game progressed, the genius of Move 37 revealed itself. It wasn't just unconventional; it was transformative, creating influence across the board in ways that human professionals had never conceived. The stone seemed to radiate power, controlling vast swathes of the board through its unconventional positioning.
This pivotal and creative move helped AlphaGo win the game and upended centuries of traditional wisdom. In that moment, something fundamental shifted. This wasn't a machine following human programmed strategies. This was genuine creativity, perhaps even understanding, emerging from silicon and code. Lee Sedol, visibly shaken, left the playing room during the game, something almost unheard of at this level of play. When he returned, his demeanour had changed. The unthinkable was happening.
But what exactly was happening inside AlphaGo's silicon brain? To understand this moment and why it matters, we need to grasp what neural networks actually are. Modern neural networks operate as sophisticated pattern matching systems built from mathematical functions. Unlike biological neurons, which are complex electrochemical factories with substantial internal structure, artificial neural networks consist of deterministic calculations performed by binary switches. Each artificial 'neuron' is simply a mathematical function that processes numerical inputs and produces numerical outputs according to predetermined rules.
Instead of biological neurons, we have mathematical functions and instead of synapses, we have numerical weights that determine how strongly one artificial neuron influences another. When you show a neural network an image, say, a photo of a cat, it's converted into numbers representing pixel values. These numbers flow through the network, each layer transforming them, until the final layer produces an output: "cat" with 97% confidence.
The training process involves statistical optimisation. Initially, the network's parameters are set randomly. When the system incorrectly classifies an image, mathematical algorithms adjust these parameters to reduce similar errors. This process resembles statistical curve fitting more than biological learning. Show it millions of images, adjust the weights millions of times, and gradually the network learns to recognise patterns. First, simple edges and corners. Then shapes. Then textures. Finally, complex concepts like "cat" or "dog" or "car."
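For readers who want to see the mechanics laid bare, here is a deliberately tiny sketch in Python. Every number is made up and the network is absurdly small, nothing like the systems described in this chapter, but the two essential ingredients are visible: a forward pass that turns numbers into a confidence score, and a training loop that nudges the weights to reduce the error.

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny two-layer network: 4 inputs -> 3 hidden 'neurons' -> 1 output.
# W1 and W2 hold the numerical weights; training does nothing but adjust them.
W1, b1 = rng.normal(size=(4, 3)) * 0.5, np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)) * 0.5, np.zeros(1)

def forward(x):
    h = np.maximum(0.0, x @ W1 + b1)             # ReLU: each 'neuron' is just a function
    y = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))     # sigmoid: a confidence between 0 and 1
    return h, y

x = np.array([0.2, -0.4, 0.7, 0.1])   # made-up inputs standing in for pixel values
target = 1.0                          # made-up label: "this image is a cat"

for _ in range(200):
    h, y = forward(x)
    err = y - target                  # gradient of the loss at the output
    dz1 = (W2 @ err) * (h > 0)        # how much each hidden neuron contributed to the error
    W2 -= 0.1 * np.outer(h, err);  b2 -= 0.1 * err     # nudge weights to reduce the error
    W1 -= 0.1 * np.outer(x, dz1);  b1 -= 0.1 * dz1

print(float(forward(x)[1][0]))        # the confidence climbs toward 1.0 as the weights adjust
```

Run the loop and the printed confidence drifts steadily towards the target: statistical curve fitting, not comprehension, exactly as described above.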
The network identifies statistical patterns in training data through mathematical optimisation. Whilst it appears to 'discover' features like edges or textures, it is actually finding mathematical correlations that minimise prediction errors. These patterns reflect the statistical structure of training data rather than genuine understanding.
In the middle layers of a trained network, researchers find artificial neurons that have spontaneously specialised. One fires strongly for curved edges, another for fur textures, another for eye-like patterns. The network has developed its own internal representation of the visual world.
This is fundamentally different from traditional programming. When early AI researchers tried to create computer vision systems, they would painstakingly program rules: "If you see two circles above a triangle, it might be a face." But the real world is too complex, too variable for explicit rules. A face can be any colour, any angle, partially hidden, oddly lit. Neural networks bypass this problem by learning their own rules from examples.
Lee Sedol would win one game, game four, with his own moment of brilliance. In a move that would become legendary, Lee played Move 78, known as "God's Touch": a move every bit as unlikely and inventive as the stone AlphaGo had placed two games earlier, and one that AlphaGo's own calculations rated as a one-in-ten-thousand choice. For a brief moment, it seemed humanity might fight back. The Korean audience erupted in celebration. Lee smiled for the first time in days. But it was a lone victory in a lost war. The final score stood at 4-1 to AlphaGo. The machine was awarded an honorary 9-dan professional ranking, the first time a computer Go player had received the highest possible certification.
Demis Hassabis, watching from London, later reflected on the significance: "This is a lighthouse project, our first major investment in terms of people and resources into a fundamental, very important, real-world scientific problem." If AI could master Go, with its requirements for intuition, creativity, and long-term strategic thinking, what else might be possible? The implications rippled far beyond the Go community, sending shockwaves through Silicon Valley, government corridors, and research laboratories worldwide.
AlphaGo's victory didn't emerge from a vacuum. It represented the convergence of three technological streams that had been building for decades: massive computational power, vast quantities of data, and breakthrough algorithmic techniques. Understanding this convergence is crucial to grasping why AI exploded when it did, and why the revolution continues to accelerate.
The story begins not in 2016, but in 2012, with a seemingly obscure academic competition that would reshape the technology industry. The ImageNet Large Scale Visual Recognition Challenge had run since 2010, challenging researchers to build systems that could correctly identify objects in photographs. The scale was staggering. Contestants had to classify images into 1,000 different categories, from specific dog breeds to types of vehicles to household objects.
The challenge represented everything that made computer vision difficult: the infinite variety of angles, lighting conditions, partial occlusions, and contexts in which objects might appear.
Year after year, progress had been incremental, almost painfully slow. The best systems in 2010 and 2011 achieved error rates hovering around 26%. Machine learning researchers were beginning to wonder if they'd hit a fundamental limit. Perhaps computer vision was simply too hard for current approaches. Graduate students would spend months tweaking algorithms for percentage point improvements. Professors would dedicate entire careers to marginally better edge detection or feature extraction. The field felt stuck.
Then, on 30 September 2012, everything changed. A team called SuperVision submitted their entry to the ImageNet challenge. The team consisted of Alex Krizhevsky, Ilya Sutskever, and their supervisor, Geoffrey Hinton, a name that would become legendary in AI circles. When the results were announced, the machine learning community was stunned. Their system, which they called AlexNet, achieved a top-5 error rate of 15.3%, more than 10.8 percentage points better than the runner-up.
This wasn't a marginal improvement. This was a quantum leap, the kind of breakthrough that happens perhaps once in a generation. To put it in perspective, it was as if someone had suddenly run a three-minute mile, shattering not just the record but everyone's conception of what was possible. At every conference that followed, the machine learning community spoke of nothing else. How had they done it?
The architecture that achieved this breakthrough was, by today's standards, relatively modest. The neural network had 60 million parameters and 650,000 neurons, consisting of five convolutional layers, some followed by max-pooling layers, and two globally connected layers with a final 1000-way softmax. But it incorporated several crucial innovations that would become standard in deep learning. It used the non-saturating ReLU activation function, which trained better than traditional tanh and sigmoid functions. This seemingly simple change, using a rectified linear unit instead of traditional activation functions, dramatically improved training efficiency and addressed the vanishing gradient problem that had plagued deep networks.
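The practical difference is easy to demonstrate numerically. The snippet below uses illustrative values only, not AlexNet's actual code; it compares the gradient, the learning signal, that each activation function passes backwards during training.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)              # peaks at 0.25 and collapses for large |z|

def relu_grad(z):
    return 1.0 if z > 0 else 0.0      # a full-strength gradient for any positive input

for z in [0.5, 2.0, 5.0, 10.0]:
    print(f"z={z:>4}: sigmoid gradient={sigmoid_grad(z):.5f}   relu gradient={relu_grad(z):.1f}")

# Stack many layers and the gradients multiply. Ten saturating layers can shrink
# the learning signal to almost nothing; ReLU lets it pass through intact.
print("ten sigmoid-like layers:", 0.25 ** 10)
print("ten relu layers:        ", 1.0 ** 10)
```

The shrinking product in the final lines is the vanishing gradient problem in miniature, and why swapping the activation function mattered so much for deep networks.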
But perhaps the most crucial innovation was where and how it was trained. AlexNet was trained on two Nvidia GTX 580 GPUs in Krizhevsky's bedroom at his parents' house. That detail, training in a bedroom on gaming GPUs, illustrates something profound about the democratisation of AI that was beginning. You didn't need a supercomputer or a corporate research lab. You needed insight, determination, and access to the right tools. The image of Krizhevsky optimising neural networks while his parents watched television downstairs would become part of AI folklore.
To understand why GPUs were so transformative, imagine you're tasked with painting a massive mural. You could use a single, highly skilled artist (like a CPU, a central processing unit) who paints each detail sequentially with perfect precision. Or you could employ hundreds of art students (like a GPU, a graphics processing unit) who might be less sophisticated individually but can each paint a small section simultaneously. For certain tasks, like filling in large areas of colour, the army of students will finish far faster than the master artist.
CPUs are designed for sequential processing. They're brilliant at complex tasks that must be done in order. But neural networks require millions of simple calculations: multiply this number by that weight, add them up, apply a function. These calculations can largely be done in parallel. GPUs, originally designed to calculate how light bounces off millions of polygons in video games, turned out to be perfect for this. A single GPU might have thousands of simple processing cores compared to a CPU's handful of complex cores.
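A toy comparison makes the point, though it runs on an ordinary CPU and uses a vectorised library rather than a real GPU, so treat it as an illustration of the principle rather than a benchmark. Both versions compute exactly the same layer of multiply-adds: one expresses them as a sequential loop, the other as a single operation that parallel hardware can spread across many cores.

```python
import time
import numpy as np

rng = np.random.default_rng(1)
inputs = rng.normal(size=1000)             # one input vector
weights = rng.normal(size=(1000, 1000))    # one layer's weights: a million multiply-adds

# The master-artist view: every multiply-add performed one after another.
start = time.perf_counter()
sequential = [sum(inputs[i] * weights[i, j] for i in range(1000)) for j in range(1000)]
loop_seconds = time.perf_counter() - start

# The army-of-students view: the same arithmetic expressed as a single matrix
# operation, which vectorised libraries (and, at scale, GPUs) spread across many cores.
start = time.perf_counter()
vectorised = inputs @ weights
vec_seconds = time.perf_counter() - start

print(np.allclose(sequential, vectorised))          # identical answers
print(f"sequential: {loop_seconds:.2f}s   vectorised: {vec_seconds:.4f}s")
```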
The graphics processing units that powered AlexNet represented a fundamental shift in how we compute. During the 2000s, as GPU hardware improved to satisfy gamers' insatiable appetite for better graphics, some researchers began to realise these chips could be repurposed. The improvements were staggering. A deep convolutional neural network trained on GPU could be 60 times faster than an equivalent CPU implementation. Calculations that would have taken months now took days. Experiments that were previously impossible became routine.
Nvidia, led by CEO Jensen Huang, had made a prescient bet on making GPUs more programmable through their CUDA platform, released in 2007. Huang, who had co-founded Nvidia in 1993 with a vision of accelerated computing, saw beyond gaming. He imagined a world where parallel processing would transform scientific computing, though even he couldn't have predicted quite how transformative it would prove. Both ImageNet and CUDA were, like neural networks themselves, fairly niche developments waiting for the right circumstances to shine. Their convergence would prove transformative.
The impact on Nvidia was immediate and profound. The company's stock price began a climb that would see it more than quadruple over the next five years. By June 2024, Nvidia's market capitalisation would exceed $3.3 trillion, briefly making it the world's most valuable company. Gaming GPUs, once dismissed by serious computer scientists as toys, had become the engines of the AI revolution.
But powerful hardware and clever algorithms weren't enough. Neural networks are hungry beasts, requiring vast amounts of data to learn effectively. This is where the second crucial development comes in: the explosive growth of digital data in the 2000s and 2010s. The internet had become a vast repository of human knowledge and behaviour, but it was largely unstructured, chaotic, impossible for traditional algorithms to parse meaningfully.
Enter Fei-Fei Li, a Stanford professor with a vision that seemed almost quixotic when she started. Li understood that computer vision needed data at unprecedented scale. Not hundreds or thousands of labelled images, but millions. She embarked on creating ImageNet, ultimately building a dataset containing over 14 million labelled images across 22,000 categories. The project started in 2009, when most researchers thought she was wasting her time. Why would you need millions of images? Surely a few thousand carefully curated examples would suffice?
Li's insight was that algorithms weren't the bottleneck. Data was. She often spoke of her inspiration coming from her young daughter learning to see. A child doesn't learn what a cat is from a few examples but from thousands of encounters with cats in different contexts, positions, and lighting conditions. Machine learning, she reasoned, needed the same richness of experience. The project was initially met with scepticism and struggled to secure funding, yet ImageNet later became the foundation of the deep learning revolution.
The creation of ImageNet itself tells a story about the changing nature of work in the digital age. The images were labelled using Amazon Mechanical Turk, a platform where human workers perform small tasks for tiny payments. Thousands of workers around the world, many in developing countries, carefully labelled millions of photographs. They would look at an image, identify objects within it, and type labels. A few cents per image. This human labour, often invisible in discussions of AI breakthroughs, provided the foundation for machine intelligence. Behind every intelligent machine were thousands of humans performing repetitive tasks.
But ImageNet was just one example of a broader phenomenon. By 2012, humanity was generating data at an unprecedented rate. Every minute, Facebook users shared 684,000 pieces of content. Google processed over 2 million search queries. YouTube users uploaded 48 hours of video. Smartphones recorded location data, shopping habits, communication patterns. The internet of things promised to extend this data collection to every device, every interaction, every moment of human life. We were drowning in data, but traditional computing methods could barely wade in the shallows.
Social media platforms had become particularly rich sources of data. Every like, share, and comment provided insight into human preferences and behaviour. Every uploaded photo added to the visual corpus of human experience. Every status update revealed something about human language and emotion. This wasn't just data. It was human experience digitised at unprecedented scale.
The third element of the perfect storm was algorithmic, specifically, the resurrection and vindication of neural networks, rebranded as "deep learning." This rebranding wasn't mere marketing; it signalled a fundamental shift in approach. Where previous neural networks had been shallow, with perhaps two or three layers, deep learning involved networks with dozens or even hundreds of layers, capable of learning increasingly abstract representations.
To understand why depth matters, think about how you recognise a face. You don't consciously catalog features: "nose 2 inches long, eyes 1 inch apart." Instead, your brain builds up understanding in layers. First, it detects edges and contrasts. Then it combines these into simple shapes. Then into features like eyes and noses. Finally, it integrates everything into the concept "face" and even "Mum's face" or "stranger's face." Deep neural networks mirror this hierarchical processing. Each layer learns increasingly complex features by building on what previous layers discovered.
Neural networks had endured a long, bitter winter. After initial enthusiasm in the 1980s and early 1990s, they'd fallen from favour. The reasons were practical: they were difficult to train, required too much data and computation, and often didn't perform as well as simpler methods. The 1969 book "Perceptrons" by Minsky and Papert had mathematically proven the limitations of simple neural networks, seeming to doom the entire approach. Funding dried up. Researchers moved to other fields. Neural network papers were rejected from conferences. It was career suicide for young researchers to work on them.
But a small group of researchers kept faith. Geoffrey Hinton in Toronto, who had been fascinated by neural networks since the 1970s, continued working despite the scepticism. Yann LeCun, who would later lead AI research at Facebook, persevered with convolutional neural networks. Yoshua Bengio in Montreal explored the mathematical foundations that would make deep learning possible. They were often mocked at conferences, their work dismissed as outdated. But they believed that neural networks, if made deep enough and trained with enough data, could achieve breakthrough performance.
These researchers weren't just stubborn; they had specific insights that suggested neural networks' time would come. Hinton had developed new training methods like backpropagation that made deeper networks feasible. LeCun had shown that convolutional architectures could efficiently process visual information. Bengio was developing the theoretical understanding of why depth mattered. They were building the intellectual infrastructure for a revolution, even as their colleagues dismissed them as relics of a failed approach.
In one interview, Ilya Sutskever, then a graduate student working with Hinton, described the moment he realised what was possible. Looking at the trends in computational power and data availability, he had an absolute revelation: the combination of massive amounts of data and powerful compute would lead, he was sure, to unprecedented breakthroughs in AI. While others saw incremental progress, Sutskever saw an exponential curve about to go vertical. He was right.
The ImageNet moment proved them spectacularly right. It wasn't just the scale of the improvement that mattered, but what it represented. Deep learning wasn't just another technique; it was a fundamentally different approach to AI. Instead of hand crafting features and rules, you could let the network learn representations directly from data. Instead of telling the computer what to look for, you could let it discover what mattered. This was a philosophical shift as much as a technical one.
The aftermath of AlexNet was a gold rush. Every major tech company scrambled to hire deep learning experts. Graduate students who had struggled to find positions suddenly had multiple offers with astronomical salaries. Conferences that had rejected neural network papers now had entire tracks dedicated to deep learning. The field transformed almost overnight from backwater to centre stage.
While the ImageNet breakthrough was unfolding in North America, a different but complementary story was developing in London. DeepMind, founded in 2010 by Demis Hassabis, Shane Legg, and Mustafa Suleyman, represented a uniquely British approach to artificial intelligence. Ambitious in vision but grounded in neuroscience and careful engineering. Where American AI companies often focused on immediate applications, DeepMind took a longer view. Their mission was audacious: to "solve intelligence" and then use intelligence "to solve everything else."
Hassabis brought a unique perspective to AI research. Born to a Greek Cypriot father and a Chinese Singaporean mother in North London, he had been a chess prodigy, reaching master standard at 13. "I started playing when I was 4, after watching my father and an uncle," he recalls. "But they're not good chess players, so I was beating them within a couple of weeks." This early experience of mastery shaped his understanding of intelligence. He subsequently became lead programmer at the legendary UK video game developer Bullfrog Productions, creating complex artificial worlds while still a teenager.
But Hassabis wasn't content with simulating intelligence; he wanted to understand it. He completed a degree in Computer Science at Cambridge with first-class honours, then surprised everyone by leaving the tech industry to pursue a PhD in cognitive neuroscience at University College London.
He wanted to understand how the human brain achieved general intelligence, believing this understanding was crucial for creating artificial intelligence. His thesis on memory and imagination would later influence DeepMind's approach to AI.
The company initially focused on training algorithms to master video games, which might seem trivial but represented something profound. In December 2013, they announced that they had trained an algorithm called a Deep Q-Network (DQN) to play Atari games at superhuman levels. The same algorithm could learn to play dozens of different games without being explicitly programmed for any of them. It learned purely through trial and error, like a child discovering how the world works.
The demonstration that caught Google's attention was revealing. A team of Google executives flew to London in January 2014 in a private jet for a secret meeting. Hassabis showed them a prototype AI that had learned to play the classic game Breakout. At first, the AI played poorly, missing the ball, scoring few points. But as it learned, something remarkable happened. It discovered, without being told, that the optimal strategy was to dig a tunnel through one side of the bricks and bounce the ball above them. This wasn't programmed; the AI had discovered strategy through experimentation.
Google acquired DeepMind for around $500 million (the exact figure was never confirmed, with reports ranging from $400 million to $650 million). The acquisition was significant for several reasons. It validated London and the UK as a major AI hub. It provided DeepMind with the resources, both computational and financial, to tackle increasingly ambitious challenges. And it set up the conditions for AlphaGo.
The road to AlphaGo began almost immediately after the Google acquisition. Go was chosen carefully. It was a game with simple rules but enormous complexity, respected in Asia as the ultimate test of strategic thinking. Cracking Go would demonstrate AI's ability to handle intuitive, pattern based reasoning, not just brute force calculation. It would show that AI could master tasks requiring what we might call wisdom, not just intelligence.
The DeepMind team spent two years developing AlphaGo in relative secrecy. The system combined several techniques in novel ways. Deep neural networks evaluated board positions, learning to recognise patterns from millions of games.
Monte Carlo tree search explored possible moves, but guided by the neural networks rather than random sampling. Most innovatively, the system improved through reinforcement learning, playing millions of games against itself, discovering strategies no human had ever played.
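The interplay between the two networks and the search is easier to see in miniature. The sketch below is emphatically not DeepMind's system: the 'networks' are random stand-in functions, the board is a toy, and the full Monte Carlo tree search is collapsed to a single level. But it shows the shape of the idea, with the policy network steering which moves get explored and the value network scoring what is found.

```python
import numpy as np

rng = np.random.default_rng(2)

def policy_net(state, moves):
    """Stand-in for the policy network: a prior probability for each legal move."""
    logits = rng.normal(size=len(moves))
    return np.exp(logits) / np.exp(logits).sum()

def value_net(state):
    """Stand-in for the value network: an estimated outcome between -1 and 1."""
    return float(np.tanh(rng.normal()))

def apply_move(state, move):
    return state + (move,)                 # toy 'board': simply the list of moves so far

def choose_move(state, moves, simulations=200):
    """A one-level, greatly simplified cousin of network-guided tree search."""
    priors = policy_net(state, moves)
    visits = np.zeros(len(moves))
    values = np.zeros(len(moves))
    for _ in range(simulations):
        # Balance exploiting moves that have scored well against exploring moves the
        # policy network rates as promising but which have few visits so far.
        ucb = values / (1 + visits) + 1.5 * priors * np.sqrt(visits.sum() + 1) / (1 + visits)
        m = int(np.argmax(ucb))
        values[m] += value_net(apply_move(state, moves[m]))   # score the position, don't play it out
        visits[m] += 1
    return moves[int(np.argmax(visits))]   # play the most-visited move

print(choose_move(state=(), moves=["corner", "side", "shoulder hit", "centre"]))
```

In the real system the stand-in functions are deep networks trained on millions of positions and the search explores whole trees of continuations, but the division of labour is the same: one network proposes, another evaluates, and the search arbitrates.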
The first test came in October 2015, when AlphaGo played Fan Hui, the European champion, at DeepMind's offices in London. The building in King's Cross, once a rundown industrial area now transformed into a tech hub, provided a fitting backdrop for this collision of ancient game and cutting edge technology. Fan Hui arrived confident but curious. He left shaken. AlphaGo won all five games. More disturbing for Fan Hui was how it won. Not through calculation but through what seemed like genuine understanding.
The Lee Sedol match was orchestrated with the precision of a major sporting event. DeepMind knew they had one shot at maximum impact. They chose Lee Sedol carefully: respected enough that victory would be meaningful, charismatic enough to draw attention, confident enough to accept the challenge. The venue in Seoul placed the match at the heart of Go culture. The timing allowed for maximum media coverage. Every detail was considered.
The aftermath of Move 37 rippled through the AI community like an earthquake. Researchers who had been working on incremental improvements suddenly saw new possibilities. The move demonstrated that AI could be genuinely creative, not just following human patterns but discovering new ones. Go masters around the world began studying AlphaGo's games, learning from the machine. The ancient game would never be the same.
But a crucial question lingered: was AlphaGo truly intelligent, or was it an extraordinarily sophisticated pattern matching machine? The answer cuts to the heart of what we mean by intelligence. AlphaGo didn't "understand" Go in the way humans do. It had no concept of honour, beauty, or the spiritual dimensions that Go masters speak of. It couldn't explain its strategies in words or apply its insights to other domains. Ask AlphaGo to play chess, and it would be helpless.
What AlphaGo had done was process millions of Go positions, learning statistical patterns about which moves tend to lead to victory. Through self-play, it discovered strategies that worked, without understanding why they worked. Move 37 wasn't a flash of insight but the output of vast computation finding patterns humans had missed. It was brilliant mimicry of intelligence rather than intelligence itself.
If AlphaGo represented AI's coming of age, the release of ChatGPT on 30 November 2022 was its introduction to the masses. OpenAI, which had been founded in 2015 with a mission to ensure artificial general intelligence benefits all of humanity, had been developing increasingly powerful language models. GPT-3, released in 2020, had impressed researchers with its ability to generate coherent text. But it was ChatGPT, a fine tuned version designed for conversation, that captured the world's imagination.
To understand what made ChatGPT revolutionary, we need to understand transformers, the architecture behind modern language models. Imagine you're trying to understand a sentence like "The cat sat on the mat because it was tired." To grasp the meaning, you need to understand that "it" refers to the cat, not the mat. You need to track relationships between words across the sentence. Traditional neural networks processed words sequentially, like reading with tunnel vision. By the time they reached "tired," they'd half forgotten about "cat."
Transformers, introduced in a 2017 paper titled "Attention Is All You Need," solved this through a mechanism called attention. Instead of processing words in sequence, transformers can attend to all words simultaneously, understanding how each word relates to every other word. It's like the difference between reading a sentence word by word through a straw versus seeing the whole sentence at once and understanding how all parts connect.
This attention mechanism allows transformers to capture long range dependencies in text. They can understand that in "The trophy didn't fit in the suitcase because it was too big," the "it" refers to the trophy, while in "The trophy didn't fit in the suitcase because it was too small," the "it" refers to the suitcase. This seemingly simple capability, understanding context and reference, is fundamental to language comprehension.
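A numerical sketch makes the mechanism less mysterious. The vectors below are random stand-ins rather than anything a trained model would produce, but the arithmetic (queries, keys, values, and a softmax over their dot products) is the heart of the attention mechanism.

```python
import numpy as np

rng = np.random.default_rng(3)
words = ["the", "trophy", "did", "not", "fit", "because", "it", "was", "big"]

d = 8                                      # toy embedding size
X = rng.normal(size=(len(words), d))       # one random stand-in vector per word
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

Q, K, V = X @ Wq, X @ Wk, X @ Wv           # queries, keys and values for every word at once

scores = Q @ K.T / np.sqrt(d)              # how strongly does each word relate to every other?
weights = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)   # softmax over each row
attended = weights @ V                     # each word becomes a weighted blend of all the words

# The row for "it" is its attention pattern: every other word is visible at once,
# so nothing has to be remembered through a narrow sequential window.
for word, w in zip(words, weights[words.index("it")]):
    print(f"{word:>8}: {w:.2f}")
```

In a trained model those weights are learned, so "it" ends up attending strongly to "trophy" or "suitcase" as the sentence demands; here the point is simply that all pairs of words are compared in one step.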
The timing of ChatGPT's release seemed almost accidental. Sam Altman, OpenAI's CEO, later reflected: "A year ago tonight, we were probably just sitting around the office putting the finishing touches on ChatGPT before the next morning's launch." There was no massive marketing campaign, no staged media event. Just a blog post and a link to try the system. But word spread like wildfire.
What made ChatGPT different was its accessibility and versatility. Unlike previous AI systems that excelled at specific tasks, ChatGPT could engage in open ended conversation on virtually any topic. It could write poetry, debug code, explain complex concepts, and engage in creative storytelling. It felt, for the first time, like talking to an intelligent entity rather than using a tool. Within days, social media was flooded with screenshots of conversations. Some profound, some hilarious, all demonstrating capabilities that seemed almost magical.
But here's the crucial point: large language models operate through next-token prediction based on statistical patterns learned from vast text corpora. When presented with input text, these systems calculate the probability distribution of possible next words and select accordingly. Recent research from Apple found that these systems rely on sophisticated pattern matching rather than logical reasoning, with performance dropping by up to 65% when irrelevant information is introduced.
It's autocomplete on steroids. When it writes a poem about robots, it's not feeling creative or inspired. It's calculating that after "Roses are red, violets are," the word "blue" has high probability.
This predictive nature explains both ChatGPT's capabilities and limitations. It can produce text that seems intelligent, creative, even wise, because it has learned patterns from billions of examples of human writing. But it has no understanding of what it's saying. It can write a recipe for chocolate cake without knowing what chocolate tastes like, explain quantum physics without grasping what an atom is, or offer relationship advice without ever having felt love or loneliness.
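Stripped to its bones, the idea can be shown with a toy model that looks back only one word instead of thousands and counts a dozen words instead of learning from billions. The corpus and the probabilities below are invented for illustration, but the logic, predicting the next word from observed frequencies, is the same.

```python
from collections import Counter, defaultdict

# A toy 'corpus' standing in for billions of sentences of training text.
corpus = (
    "roses are red violets are blue "
    "the sky is blue the sea is blue the rose is red"
).split()

# Count which word follows which: the statistical pattern a language model learns,
# here reduced to a single word of context instead of thousands of tokens.
follows = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follows[current][nxt] += 1

def predict_next(word):
    counts = follows[word]
    total = sum(counts.values())
    return {w: round(c / total, 2) for w, c in counts.most_common()}

print(predict_next("are"))   # {'red': 0.5, 'blue': 0.5} -- pick a likely continuation
print(predict_next("is"))    # 'blue' wins purely because it appeared most often
```

A large language model replaces the word counts with billions of learned weights and the one-word context with thousands of tokens, but the output is still a probability distribution over what comes next, nothing more.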
The numbers tell a staggering story of adoption. In November 2022, 152.7 million people visited chat.openai.com. By December, this had almost doubled. By February 2023, visitors exceeded 1 billion worldwide. By August 2024, OpenAI reported 200 million weekly active users. This was faster adoption than any technology in history. Faster than the internet, faster than smartphones, faster than social media.
The response from the tech industry was immediate and panicked. Google management called a "code red," recognising that ChatGPT could be an existential threat to their search business. Microsoft moved aggressively, investing $10 billion in OpenAI and integrating ChatGPT technology across its products. Every major tech company scrambled to develop or acquire similar capabilities. The AI arms race had begun in earnest.
The investment explosion that followed dwarfed previous AI funding cycles. Venture capital investments in generative AI exceeded $2 billion in 2022, but this was just the beginning. Projections suggested the generative AI market could grow to $1.3 trillion over the next decade from just $40 billion in 2022. In the UK alone, AI companies raised £2.4 billion in equity investment in 2022, with predictions of £3.4 billion by the end of 2023.
British AI companies emerged as significant players. Wayve burst onto the scene in February 2024 with a massive £822 million funding round from backers including Nvidia and SoftBank, achieving a post money valuation of £2.22 billion. The company's approach to autonomous driving, using end to end deep learning rather than hand coded rules, exemplified the new paradigm of letting AI learn rather than programming it explicitly.
The technical progress continued at breakneck pace. Large language models grew exponentially in size and capability. GPT-3's 175 billion parameters seemed massive until GPT-4 arrived, rumoured to have over a trillion parameters. But what are parameters? Think of them as the dials and knobs that get adjusted during training. A simple thermostat might have one parameter: the temperature setting. A neural network has millions or billions of these adjustable values, each contributing to the network's behaviour. More parameters generally mean more capacity to learn complex patterns, though not always more intelligence.
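The arithmetic behind those headline figures is straightforward. The sketch below uses made-up layer sizes purely for illustration; it simply counts the adjustable values in fully connected layers to show how quickly the 'dials and knobs' multiply into billions.

```python
def dense_layer_params(inputs, outputs):
    """One fully connected layer: a weight per input-output pair, plus a bias per output."""
    return inputs * outputs + outputs

# A toy three-layer network (made-up sizes):
small = dense_layer_params(784, 128) + dense_layer_params(128, 64) + dense_layer_params(64, 10)
print(f"{small:,}")     # roughly 109,000 adjustable 'dials'

# Widen the layers into the tens of thousands and stack many of them, and the same
# arithmetic reaches billions -- roughly how headline model sizes are quoted.
wide = dense_layer_params(12_288, 49_152) + dense_layer_params(49_152, 12_288)
print(f"{wide:,} parameters in just one transformer-style feed-forward block")
```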
These models showed emergent capabilities. Abilities that appeared suddenly as they scaled up, without being explicitly programmed. They could translate between languages they'd never been specifically trained on, solve mathematical problems despite not being calculators, and show reasoning abilities that surprised even their creators. But again, this was sophisticated pattern matching, not understanding. A model might correctly answer that "2 + 2 = 4" not because it understands arithmetic, but because this pattern appears countless times in its training data.
DeepMind, not to be outdone, continued pushing boundaries in directions beyond games and language. Their AlphaFold system, announced in 2020, solved a 50 year old grand challenge in biology: predicting protein structures from amino acid sequences.
By July 2022, they had released predictions for over 200 million protein structures, essentially all known proteins. The achievement was so significant that Demis Hassabis and John Jumper received the 2024 Nobel Prize in Chemistry. The practical implications were staggering. Drug discovery could be accelerated, diseases better understood, new materials designed.
The UK government recognised the strategic importance of AI, taking significant steps to support and shape the sector. In November 2023, the UK hosted the first global AI Safety Summit at Bletchley Park, a symbolically resonant location given its role in the birth of computing and code breaking. The summit brought together world leaders, tech executives, and researchers to discuss the profound challenges of ensuring AI development remained beneficial.
The UK's approach to AI regulation, outlined in a March 2023 white paper, sought a middle path between the EU's prescriptive approach and the United States' more laissez-faire stance. Five principles would guide sector specific regulation rather than imposing rigid rules that might stifle innovation: safety, security and robustness; transparency and explainability; fairness; accountability and governance; contestability and redress.
Then, in late 2024 and early 2025, something unexpected shook the AI world. DeepSeek, a Chinese startup, demonstrated that ChatGPT-level performance could be achieved for a reported $5.6 million in training compute, compared to the billions spent on GPT-4. Using innovative techniques like a mixture of experts architecture that activated only 37 billion of its 671 billion parameters for each token, they showed that clever engineering could overcome hardware limitations.
The mixture of experts approach is like having a team of specialists rather than one generalist. Instead of using the entire model for every query, the system routes questions to the most relevant "expert" sub networks. Ask about cooking, and the cooking expert activates. Ask about physics, and the physics expert takes over. This dramatically reduces computational requirements while maintaining performance.
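A toy sketch shows the routing idea, though the sizes here are invented and real systems route between far more, and far larger, expert networks.

```python
import numpy as np

rng = np.random.default_rng(4)

n_experts, d, top_k = 8, 16, 2           # toy sizes; production models use far bigger numbers

# Each 'expert' is its own small network -- here reduced to one weight matrix apiece.
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
gate = rng.normal(size=(d, n_experts))   # the router that scores experts for each token

def moe_layer(token):
    scores = token @ gate                            # how relevant is each expert to this token?
    chosen = np.argsort(scores)[-top_k:]             # keep only the top-k experts
    weights = np.exp(scores[chosen]) / np.exp(scores[chosen]).sum()
    # Only the chosen experts do any work; the other six sit idle for this token.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, chosen))

token = rng.normal(size=d)
output = moe_layer(token)
print(output.shape, f"-- computed with {top_k} of {n_experts} experts")
```

Because only a fraction of the experts run for any given token, the model can hold an enormous number of parameters while spending only a small share of the compute on each prediction.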
The implications were staggering. On 27 January 2025, Nvidia lost $590 billion in market capitalisation in a single day as investors realised that AI development might not require ever more expensive hardware.
DeepSeek priced its models up to 40 times lower than OpenAI's comparable offerings. The efficiency revolution suggested that AI development, like computing before it, would become progressively more accessible and democratic.
This efficiency breakthrough represented something fundamental: the extension of Moore's Law beyond hardware to AI development itself. Just as computing power doubles while costs halve, AI capabilities were now following a similar trajectory. What once required Google scale infrastructure could now be achieved by startups. What once demanded billions in investment might soon be possible with millions. The democratisation of AI, long promised but slow to materialise, suddenly accelerated.
The transformation of work began immediately. Nearly one quarter of C-suite executives reported personally using generative AI for work within a year of ChatGPT's release. Entire industries began experimenting with AI applications. Professional services firms used AI for research and analysis. Healthcare providers explored AI assisted diagnosis. Educators grappled with AI's implications for teaching and assessment. Creative industries both embraced and feared AI's capabilities in generating text, images, and music.
But a profound question remained: were we witnessing the birth of true artificial intelligence, or simply ever more sophisticated mimicry? Current AI systems, however impressive, remain fundamentally different from human intelligence. They process patterns without understanding meaning. They generate text without knowing what they're saying. They play games without grasping why winning matters. They're like incredibly talented actors who can perform Shakespeare perfectly without understanding love, ambition, or mortality.
The gap between current AI and human intelligence is not just quantitative but qualitative. Humans learn from single examples. A child touching a hot stove once learns about heat and pain. AI systems need thousands or millions of examples. Humans generalise across domains. Learning chess improves strategic thinking generally. AI systems remain narrow specialists. Humans understand causation. We know that umbrellas don't cause rain despite the correlation. AI systems struggle to distinguish correlation from causation.
Most fundamentally, humans have consciousness, subjective experience, the feeling of what it's like to be. We don't just process information; we experience it. The redness of red, the pain of loss, the joy of discovery. These qualia remain entirely absent from even the most sophisticated AI systems. Whether consciousness is necessary for true intelligence remains philosophy's hardest question.
For the UK, the AI explosion represented both tremendous opportunity and significant challenges. The country's strengths were evident: world class universities producing AI talent, successful companies like DeepMind demonstrating global leadership, a pragmatic regulatory approach balancing innovation with safety. But challenges remained in competing with the vast resources of US tech giants and Chinese state investment.
Looking forward from 2025, the AI revolution shows no signs of slowing. Demis Hassabis, reflecting on the journey, offered perspective: "I think we're probably three to five years away" from artificial general intelligence. But he also noted what's still missing: "These systems need the ability to invent their own hypotheses about science, not just prove existing ones." The gap between current AI and human like general intelligence remains vast, but it's narrowing at an unprecedented pace.
The period from 2012 to 2025 will likely be remembered as the moment artificial intelligence went from academic curiosity to world changing technology. The combination of AlexNet's breakthrough demonstrating deep learning's power, AlphaGo's victory showing creative intelligence was possible, ChatGPT's release making AI accessible to everyone, and DeepSeek's efficiency revolution democratising AI development created unstoppable momentum.
For those who lived through this transformation, it felt both sudden and inevitable. Sudden because each breakthrough seemed to come from nowhere. Who could have predicted Move 37 or ChatGPT's viral adoption? Inevitable because, looking back, all the pieces were falling into place. The exponential growth in computational power that Gordon Moore had predicted in 1965 had finally reached the threshold needed for intelligence. The internet had created vast repositories of human knowledge and behaviour perfect for training AI. Researchers had developed the algorithmic frameworks to harness this data and compute.
The UK, despite its smaller size compared to the US or China, punched above its weight throughout this revolution. From DeepMind's pioneering achievements to the emergence of companies like Wayve, from hosting global AI safety discussions to developing pragmatic regulatory frameworks, Britain demonstrated that leadership in AI wasn't just about resources but about vision, talent, and wisdom.
The intelligence explosion had begun, triggered by the perfect storm of data, compute, and algorithms. Where it would lead remained uncertain, but one thing was clear: the world would never be the same. The transformation from digital revolution to intelligence explosion was complete, but the journey toward truly intelligent machines, and understanding what that means for humanity, had only just begun.
As we stand on the brink of even greater breakthroughs, the question is no longer whether AI will transform our world. That transformation is already underway. The question is how we'll shape that transformation to benefit humanity. And that's a challenge that will require not just artificial intelligence, but the very best of human intelligence, wisdom, and values. The explosion continues, reverberating through every aspect of human society, and we're all part of the blast zone.
Yet for all the talk of intelligence explosion, we must remember what we're really witnessing: the emergence of incredibly sophisticated tools that mimic intelligence without possessing it. They're mirrors that reflect human knowledge back at us, transformers that recombine our words in novel ways, pattern finders that spot regularities we've missed. They augment human capability in profound ways. But they don't yet think, understand, or feel. The true intelligence explosion, the creation of machines that genuinely understand, remains a distant dream, one that may require not just more data and compute, but fundamental breakthroughs in our understanding of consciousness and cognition itself.
The explosion continues, but whether it leads to true artificial intelligence or simply ever more impressive artificial mimicry remains the defining question of our age.
What the Research Shows
Organisations that succeed build progressively, not revolutionarily
The Five A's Framework
Your Path Forward
A Progressive Approach to AI Implementation
Each level builds on the previous, reducing risk while delivering value.