The Quiet History of AI — and why it matters for your business now
A working brief on where today's AI came from, which patterns have held across seventy years of progress, and what those patterns mean for the decisions you're facing right now.

The headlines would have you believe artificial intelligence arrived in late 2022, conjured from nothing by a San Francisco chatbot. Founders are scrambling to respond. Consultants are evangelizing. Every board deck now has an “AI strategy” slide, and most of them say roughly the same thing.
Here is the quieter, more useful truth: the technology behind today's AI tools has been developing for roughly seventy years. The milestones that actually matter — the ones that explain what AI can and cannot reliably do for your business — happened long before anyone outside a research lab was paying attention. Understanding that history will not make you an engineer. But it will make you a much harder person to mislead.
This is not a think piece about whether AI is overhyped. It is a working brief on where the technology actually came from, which patterns have held across seven decades of progress, and what those patterns mean for the decisions you are facing right now.
The founding summer nobody remembers
In the summer of 1956, a small group of researchers gathered at Dartmouth College in New Hampshire for what is now called the Dartmouth Summer Research Project on Artificial Intelligence, widely regarded as the founding event of AI as a formal field. The term “artificial intelligence” itself was coined by John McCarthy in the 1955 proposal for the workshop.
The ambition was enormous. The workshop proposal rested on the conjecture that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.” The participants were wrong about the timeline. They were not, it turned out, wrong about the destination.
What followed was decades of oscillation: bursts of optimism, followed by hard encounters with complexity, followed by what researchers came to call “AI winters” — periods of deflated funding and expectations. Understanding those cycles is the most practical thing a business leader can take from this history.
The 1980s: The first time AI was going to change everything

By the early 1980s, a new approach called expert systems had captured corporate attention. These were programs that encoded human expertise as chains of if-then rules. Feed in the right inputs; get the expert's answer back. No human expert required.
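To make “expertise as if-then rules” concrete, here is a minimal sketch in Python. The rule base and the names `RULES` and `infer` are invented for illustration and are not taken from any real expert system; production systems of the era held thousands of such rules.

```python
# Hypothetical toy rule base: each rule pairs the facts it requires
# with the conclusion to add when all of those facts are present.
RULES = [
    ({"needs_database"}, "add_extra_memory"),
    ({"add_extra_memory", "small_chassis"}, "upgrade_chassis"),
]


def infer(facts):
    """Forward-chain: keep applying rules until no new conclusion appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in RULES:
            # A rule fires when every condition is already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


print(sorted(infer({"needs_database", "small_chassis"})))
```

Note that the second rule only fires because the first one did. Nothing here is learned: any case the rules do not anticipate needs a person to write another rule.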
The pitch worked. Programs like MYCIN (medical diagnosis) and XCON (configuring Digital Equipment Corporation's computer orders) delivered genuine value in controlled settings. A whole industry grew around “knowledge engineering.” Companies built AI departments. Governments poured in funding. Expert system startups proliferated.
Then reality arrived. As researchers at the ACM later documented, the market collapsed because companies discovered that “systems designed to automate expertise required them to hire new experts to maintain them.” DEC alone had 59 technical staff assigned just to maintain its internal expert systems. The knowledge bases grew brittle. Rules interacted in unpredictable ways. Real-world complexity exceeded what any rule set could capture.
By the early 1990s, the expert systems market had collapsed, and AI entered its second winter. Researchers quietly relabeled their work as “machine learning” or “statistics” to avoid the funding stigma attached to the AI label. The lesson, visible in hindsight: narrow systems built on hand-crafted rules don't scale, don't learn, and don't survive contact with messy reality.
Sound familiar? It should. Every era of AI investment has contained a version of this story: a new capability arrives, gets applied to problems it is only partially suited for, and then buckles under the weight of expectations. What survives the collapse is always narrower, more reliable, and less exciting than the original pitch — but also genuinely useful.
The 1990s–2000s: AI goes quiet — and starts working

This is the chapter that matters most, and the one that gets the least attention.
Through the 1990s and into the 2000s, a quieter revolution was underway. Researchers had stopped trying to encode human expertise manually and started letting computers find patterns in data. Statistical methods. Probabilistic reasoning. Machine learning in its early, unglamorous form.
The results were not flashy. They were useful.
In 2002, programmer Paul Graham published “A Plan for Spam”, describing a Bayesian filtering approach that could classify email as spam or not-spam by learning from examples, with no hand-written rules required. Within a year or two, Bayesian filters had become a standard layer in mainstream spam defenses. Most people never knew the name of the technique. They just noticed their inbox got cleaner.
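For readers who want to see what “learning from examples” means mechanically, the sketch below is a textbook naive Bayes classifier with add-one smoothing. It is not Graham's exact formula, and the six training messages are invented for illustration.

```python
import math
from collections import Counter

# Hypothetical training examples, invented for illustration.
SPAM = ["win cash now", "cheap pills now", "win a free prize"]
HAM = ["meeting moved to monday", "invoice attached for review", "lunch on monday"]


def word_counts(messages):
    """Count how often each word appears across a list of messages."""
    return Counter(word for msg in messages for word in msg.split())


spam_counts, ham_counts = word_counts(SPAM), word_counts(HAM)
vocab = set(spam_counts) | set(ham_counts)


def log_likelihood(message, counts):
    """Sum of log P(word | class), with add-one smoothing for unseen words."""
    total = sum(counts.values())
    return sum(
        math.log((counts[word] + 1) / (total + len(vocab)))
        for word in message.split()
    )


def is_spam(message):
    # Equal class priors are assumed (three examples of each), so they cancel.
    return log_likelihood(message, spam_counts) > log_likelihood(message, ham_counts)


print(is_spam("win cash prize"))      # True
print(is_spam("review the invoice"))  # False
```

No rule in this code says “prize is suspicious.” The filter infers that from the counts, and it improves by being shown more labeled mail, not by having more rules written.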
At roughly the same time, Google was using machine learning to rank search results. Amazon was building recommendation systems. By the mid-2010s, Netflix was publicly attributing roughly three-quarters of viewing activity to its recommendation system, a figure published on the Netflix TechBlog, and estimating the annual value of personalization at more than $1 billion.
None of these systems made headlines. They quietly became infrastructure. If you used email in 2003, a machine learning model was reading your messages before you did and making decisions about which ones you should see. If you used Google in 2005, a system trained on billions of data points was deciding what counted as relevant. Most users had no idea. They just found these products useful.
The best AI in your business next year will probably be the AI you stop noticing.
This is the pattern that keeps repeating. The AI that actually changes behavior — spam filters, search ranking, fraud detection, recommendation engines — tends to be seamlessly integrated by design. It runs in the background. It improves incrementally. It doesn't announce itself. The tool may be visible; the friction of using it has disappeared.
2012–2017: The architecture underneath what you're using now

The tool you are evaluating today is the downstream product of specific technical insights, published in specific papers, by specific engineers, seven to twelve years ago. It did not materialize from nowhere.
In the autumn of 2012, a team at the University of Toronto — Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton — entered a neural network called AlexNet into the ImageNet image recognition competition and won by a margin that made every other approach look like guesswork. In practical terms, it meant computers could finally see reliably enough to inspect parts on a manufacturing line or read messy, handwritten invoices. The deep learning era had begun, quietly, without fanfare in the business press.
In 2017, eight researchers at Google published “Attention Is All You Need”, introducing the Transformer architecture, a new way for neural networks to process sequences of information by learning which parts to pay attention to rather than processing everything in order. The title is understated to the point of dry humor. What it meant in practice: it opened the path to models that can read a 50-page legal contract and still keep track of what was on page 1. The Transformer is the architecture underlying nearly every large language model in use today, including GPT, Claude, Gemini, and their descendants.
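The core operation is small enough to sketch. The Python below implements scaled dot-product attention for a single query, the basic building block the paper describes. The toy vectors are invented for illustration, and a real Transformer adds learned projections, multiple attention heads, and positional information that this sketch leaves out.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Score each position by how well its key matches the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # The output is a weighted blend of the value vectors.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hypothetical toy vectors: three earlier positions and one current query.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, output = attend([4.0, 0.0], keys, values)
print([round(w, 3) for w in weights])
```

The first and third positions match the query and receive nearly all the weight, while the second is almost ignored. Every position is scored directly, however far back it sits, which is what lets the model relate distant parts of a text.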
One more thing worth saying plainly: the 2010s leap wasn't just clever algorithms. It was also brute-force compute and data at a scale previously unimaginable. Computer scientist Rich Sutton has argued that the consistent lesson of AI history — what he calls the “Bitter Lesson” — is that methods that can scale with more compute eventually beat methods that rely on hand-crafted human knowledge. Algorithms alone didn't close the gap between the 1980s and today. Cheap GPUs, cloud infrastructure, and training sets scraped from the entire internet did. The magic has a power bill.
2020–2022: The part you already know
GPT-3, released by OpenAI in 2020 with 175 billion parameters, was the first large language model to make general-purpose language capability hard to ignore. Two years later, when OpenAI wrapped a successor model from the same family in a conversational interface and released ChatGPT in November 2022, the public had its moment of recognition.
One million users in five days. An estimated one hundred million in two months.
But note what had actually happened. The underlying architecture had been published in 2017. The scale of compute and data needed to exploit it had been accumulating for a decade. The moment of apparent sudden arrival was the final visible step of a very long staircase.
What the pattern tells you

Seventy years of AI history contains a lesson that repeats with unusual consistency:
The hype cycle and the useful cycle are different cycles.
Expert systems in the 1980s: enormous hype, genuine early utility, then collapse under the weight of promises that outran capability. Spam filters, search ranking, recommendation engines: no hype, widespread adoption, lasting utility. Deep learning from 2012: moderate hype among researchers, transformative results in production systems. ChatGPT: enormous hype, genuine capability, unknown long-term shape.
The tools that ended up running quietly in the background — the ones your employees use without thinking about them — were almost never the ones that led the news cycle.
For a founder or operator making decisions about AI adoption right now, this history suggests a few practical principles:
Chase the durable pattern, not the headline tool
The specific product you read about this week may not exist in its current form in three years. The underlying capability — language understanding, pattern recognition, document processing, workflow automation — will. Evaluate AI investments against durable capabilities, not brand names.
Reliability beats capability on a benchmark
The 1980s expert systems often performed impressively in demonstrations. They failed in production. The organizations that got lasting value from AI in the 1990s and 2000s were the ones that built systems that could run unattended, handle unexpected inputs gracefully, and degrade cleanly when they hit the edge of their competence. That standard has not changed.
When evaluating any AI tool for your business, the questions that matter are not “what is its benchmark score?” or “what is the parameter count?” They are: What happens when it fails? How does it fail? Who notices, and how quickly?
A tool that is 85% accurate and fails gracefully and auditably is worth far more to a 50-person company than a tool that is 98% accurate and produces confident wrong answers that nobody catches until the damage is done.
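The arithmetic behind that claim can be sketched in a few lines of Python. Every number below (monthly volume, catch rates, cost per error) is a hypothetical assumption chosen to show the shape of the tradeoff, not a measurement from any real deployment.

```python
def monthly_error_cost(volume, accuracy, catch_rate, cost_caught, cost_missed):
    """Expected monthly cost of errors, split into caught and uncaught."""
    errors = volume * (1 - accuracy)
    caught = errors * catch_rate
    missed = errors - caught
    return caught * cost_caught + missed * cost_missed


# Hypothetical figures: 1,000 items a month; a caught error costs $5 to fix,
# an uncaught one costs $2,000 in downstream damage.
auditable = monthly_error_cost(
    1000, 0.85, catch_rate=0.99, cost_caught=5, cost_missed=2000
)
opaque = monthly_error_cost(
    1000, 0.98, catch_rate=0.10, cost_caught=5, cost_missed=2000
)
print(f"auditable: ${auditable:,.2f}  opaque: ${opaque:,.2f}")
```

Under these assumptions the less accurate tool costs roughly a tenth as much per month, because almost none of its errors go unnoticed. Different assumptions can flip the ordering, which is why how a tool fails deserves as much scrutiny as how often it fails.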
The best outcome is seamless integration
A well-implemented AI system — whether it is a visible reasoning assistant your team talks to daily or a background process they never see — should eventually feel unremarkable. Not because the tool hides itself, but because the friction of using it has disappeared. The goal is not to have an impressive AI story for the next investor deck. The goal is to have a business that operates more accurately and efficiently six months from now than it does today, in ways your team no longer has to think about. The integration compounds quietly.
Before you spend a dollar: five questions
If you are evaluating AI for your business right now, the history above is not background trivia. It is the basis for asking the right questions before you commit. Run through these:
1. Is the task high-volume and repetitive?
AI earns its keep on tasks that recur constantly and follow recognizable patterns. One-off judgment calls are still human work.
2. Is the cost of an error low, bounded, or reviewable?
A wrong answer in a support-ticket summary costs almost nothing to catch. A wrong answer in a regulatory filing costs considerably more. Know which one you are building.
3. Do we have clean enough inputs?
The 1980s expert systems failed partly on garbage-in, garbage-out. Modern models are more tolerant, but not infinitely so. If your underlying data is a mess, fix that first.
4. Can we measure success in a month?
If you cannot define what “working” looks like in thirty days, you are not ready to deploy — you are ready to experiment. Treat it accordingly, with a budget that reflects that.
5. Who owns the fallback when the model is wrong?
Every AI system will be wrong sometimes. The question is not whether it fails; it is whether you have a named person and a clear process for what happens next.
Starting quietly
For any tool on your shortlist, the history above reduces to three questions. Is this tool's capability genuinely mature, or are we in a 1985-style expert system moment? Does this vendor's pitch rest on benchmark performance or on production reliability? Is the thing we are automating simple enough that an imperfect system doing it consistently will still beat the current process?
Those are answerable questions. They don't require a computer science degree. They require the same judgment you already apply to any other operational decision.
