
Will ChatGPT Kill Google? AI and the Future of Search
In the quarter-century since its inception, Google has developed an intense choke hold on the search engine marketplace. Recently, though, users and industry experts have seen a sharp decline in the quality of Google searches. Queries are likely to net ads, search-engine-optimized spam, and duplicate results. Meanwhile, as ChatGPT quickly passes 100 million users and Microsoft launches BingGPT, the future of search engines and the web itself is in question.
Despite these trends, Google’s distribution remains unmatched, and the marketplace is hostile to most competitors. The company’s deep pockets insulate it from lawsuits, criticisms, and accusations of anti-competitive behavior. Microsoft CEO Satya Nadella said his company made Google “dance” when Google’s stock price dropped on the announcement of Bard, Google’s own large language model and answer to ChatGPT. Even so, Google is still in a good position to fight as large language models upend the search world.
The importance of being the default search engine
A search engine’s most basic job is to pull information relevant to a user’s query from the internet and rank it in order of relevance. Robots called “crawlers” retrieve information by following links and “reading” the content. Google developed the infrastructure to crawl 10 billion web pages by 2005—a number that modern competitors struggle to match. Google figured out how to re-crawl pages for fresh information and parse non-standard websites.
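In rough, standard-library-only Python, the crawl loop looks something like the sketch below: fetch a page, pull out its links, and follow them breadth-first. The seed URL and page limit are placeholders; real crawlers add politeness rules, massive-scale deduplication, and re-crawl scheduling on top of this skeleton.

```python
# Toy breadth-first crawler: fetch a page, extract its links, follow them.
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkExtractor(HTMLParser):
    """Collects href attributes from <a> tags on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, max_pages=10):
    seen, queue, fetched = {seed_url}, deque([seed_url]), 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = urlopen(url, timeout=5).read().decode("utf-8", errors="ignore")
        except OSError:
            continue  # skip unreachable or broken pages
        fetched += 1
        extractor = LinkExtractor()
        extractor.feed(html)
        for href in extractor.links:
            absolute = urljoin(url, href)
            if absolute.startswith("http") and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return seen


print(crawl("https://example.com", max_pages=5))
```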
When the web was relatively young, it became standard for sites to specify which of their pages a search engine was allowed to crawl in a file called “robots.txt.” Websites use robots.txt to prevent bots from redistributing their content and slowing their servers with constant traffic. Over time, many sites whitelisted only Google, and not other search engines, giving Google an edge.
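The robots.txt format itself is just a plain-text list of rules per crawler user agent. The sketch below, which runs Python’s standard-library parser over a hypothetical robots.txt, shows how a site can whitelist Googlebot while shutting out every other crawler:

```python
# Hypothetical robots.txt that whitelists only Googlebot, mirroring the
# pattern described above; every other crawler is disallowed site-wide.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Googlebot
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.can_fetch("Googlebot", "https://example.com/articles/1"))     # True
print(parser.can_fetch("SomeOtherBot", "https://example.com/articles/1"))  # False
```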
Google became unmatched at filtering out spam and poor-quality websites, ordering sites with its robust PageRank algorithm, and delivering results quickly. But distribution—the sheer number of people “forced” to use the search engine—is the biggest factor in Google’s current dominance.
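PageRank’s core intuition is that a page matters if other pages that matter link to it. A minimal power-iteration sketch over a toy link graph (using the conventional 0.85 damping factor) captures the idea; Google’s production ranking, of course, layers many more signals on top.

```python
# Minimal PageRank via power iteration on a toy link graph.
# Each key links to the pages in its list; the damping factor is the classic 0.85.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    ranks = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Rank flows in from every page that links here, split evenly
            # across that page's outgoing links.
            incoming = sum(
                ranks[other] / len(links[other])
                for other in pages
                if page in links[other]
            )
            new_ranks[page] = (1 - damping) / n + damping * incoming
        ranks = new_ranks
    return ranks


toy_graph = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home", "about"],
}
print(pagerank(toy_graph))  # "home" ends up ranked highest
```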
Early on, Google understood the power of being the default, first with Internet Explorer’s add-on Google Toolbar, then with the web browser Google Chrome. This ensured that users didn’t have to type “Google.com” into their web browsers to access the engine. Neither Toolbar nor Chrome made any money for Google directly; their purpose was simply to make Google search ubiquitous.
When mobile phones came to consume a significant portion of the average user’s computing time, Google paid to be the default search engine on iOS and developed its own mobile operating system, Android. Prioritizing internationalization, natural language understanding, and translation extended Google’s reach beyond the English-speaking world.
Larger distribution allowed Google to rack up more keystrokes, users, and dollars than its competitors, and gave it access to a vast “moat” of user data. The company’s scope and financial resources also allow it to access proprietary (owned by a specific company or group) or expensive data. This detailed data allows Google to improve rankings to better meet user preferences.
In its early days, Google had already managed to attract PhDs from top schools, like Stanford, UC Berkeley, and MIT, and its growing prominence in the tech world continued to entice top engineers. Google’s dominance became self-sustaining, and it stayed on top because it was on top.
Why is Google search getting worse?
While Google is still the largest search engine on the market, its quality has worsened. The same wide, international distribution that led to its success has now led to queries returning more generic results. Because it is focused on users worldwide, Google’s ranking is tuned to the masses, not the individual. This is especially true for non-English speakers, since the web is heavily biased toward English. Google results also tend to favor relevance over diversity, so those looking for a variety of information may be disappointed.
Webmasters have figured out how to design websites that rank highly with Google, posting content calculated to secure a prime position, a phenomenon known as search engine optimization (SEO) spam. Additionally, any query that seems vaguely commercial will return advertisements; users searching for “hair dryer” will have to scroll past a wall of ads to reach non-sponsored results.
Google responds best to general, short queries, known as “head queries,” as well as navigational and informational queries. Entering a “tail query” (also called a “long-tail query”), a highly specific question such as “why won’t my Dyson hair dryer turn on?”, is likely to net plenty of unhelpful results. Often, users must append the name of the popular all-purpose forum “Reddit” to their tail queries to find answers.
The web has also shifted, and perhaps begun dying, since the beginning of Google’s reign. As websites give way to social media, potentially useful information is now locked in “walled gardens,” such as Facebook and TikTok, that Google is blocked from crawling. Before ChatGPT, Google was alarmed at the number of young users gravitating to TikTok as a search tool. The rate of website creation has slowed, as financial incentives have decreased. Information on the internet is failing to keep pace with the real world, and many queries just don’t deliver good answers.
Did Google kill the web?
Google had a hand in fraying the web ecosystem on which it relies. For popular head queries, Google often grabs results from a website and shows them directly on the search results page. Websites, however, rely on user traffic to sell themselves to advertisers, so this hurts them financially.
For example, Google once sourced the alleged net worth of celebrities from a site called CelebrityNetWorth.com. If a user googled “Leonardo DiCaprio net worth,” they’d see the answer pop up without having to click over to the website itself. This substantially decreased CelebrityNetWorth’s traffic and revenue.
Yelp’s founder also accused Google of anti-competitive behavior, citing the fact that Google Maps results rank above Yelp’s, which poaches Yelp’s traffic. Similarly, Google Flights was criticized because it aggregates information from a variety of other travel sites and then ranks itself above them in search results.
This is a popular business tactic in technology. Companies first become the aggregator, then eat the businesses they aggregate with their newfound control. Amazon does the same thing. Google has the resources to handle the resulting lawsuits and fines, and criticisms have not dented its marketplace dominance.
Google even offered to rank websites higher in exchange for showing their results directly, a practice it calls Accelerated Mobile Pages (AMP). This turned out to be something of a catch-22. Adhering to AMP costs websites ad revenue; refusing to cooperate, however, resulted in Google rankings so low that sites lost traffic and ad revenue anyway. Websites that displayed information relevant to common head queries, like sports and news, were hit especially hard.
Can Google be beaten?
Google’s only real competitor is Bing, which has around 8% of the global market share, mostly because it is the default search engine on Microsoft Edge. To gauge its own dominance, Google runs vertical-by-vertical comparisons to confirm that its results rate better than Bing’s with test users.
Startups launching their own search engines do exist, including Neeva, You.com, DuckDuckGo, and Marginalia. Some of them are even implementing large language models, but their path to widespread adoption is steep. They don’t have Google’s “data moat,” so their ranking models are often subpar. They also suffer from high churn: users hear about a new search engine, try it, and abandon it in favor of more established alternatives.
And because it is difficult to ask users to pay for a substandard search engine, these startups struggle to generate the revenue they desperately need to crawl, index, and serve billions of web pages.
The dawn of large language models and the future of search
Google, however, has declared an internal “code red” in response to one threat: ChatGPT. ChatGPT is currently the most famous example of a generative AI technique called a large language model, or LLM. It's essentially a computer program that draws from a massive body of preexisting text to create sentences that pass syntactic muster and even appear meaningful.
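To make that concrete, here is a minimal sketch using the open-source Hugging Face transformers library and the small GPT-2 model. GPT-2 is far weaker than the model behind ChatGPT, but it illustrates the same principle: continuing text from patterns learned on a large body of preexisting writing.

```python
# Illustrative only: GPT-2 is a small, older language model, not ChatGPT,
# but it continues a prompt token by token in the same basic way.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "The future of search engines is",
    max_new_tokens=40,
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```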
LLMs perform well with tail queries but have drawbacks. They sometimes “hallucinate” incorrect information with a high degree of confidence. They are slower and more expensive than traditional search engines, and it’s unclear whether language-model-based searches could generate enough ad revenue to sustain themselves. Retraining models with recent information is also difficult and expensive.
There are also unanswered legal and ethical questions about the proper use of copyrighted material in LLMs, though some startups, including You.com, Neeva.com, and Perplexity.AI, are experimenting with feeding the top search results into a language model to generate a “cited result.”
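At its core, the “cited result” approach takes the top search snippets, numbers them inside a prompt, and asks the model to answer using only those sources. The rough sketch below shows the pattern; the call_llm function and the snippets are hypothetical placeholders, not any particular company’s implementation.

```python
# Sketch of a "cited result": feed top search snippets to a language model
# and ask for an answer with numbered citations. call_llm is a hypothetical
# stand-in for any LLM API; the snippets are invented placeholders.
def build_cited_prompt(query, snippets):
    numbered = "\n".join(
        f"[{i + 1}] {s['url']}: {s['text']}" for i, s in enumerate(snippets)
    )
    return (
        "Answer the question using only the sources below, "
        "and cite them like [1].\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {query}\nAnswer:"
    )


def call_llm(prompt):
    raise NotImplementedError("Swap in a real language model API call here.")


snippets = [
    {"url": "https://example.com/dyson-help", "text": "Try resetting the hair dryer ..."},
    {"url": "https://example.com/forum", "text": "A tripped thermal fuse can stop it ..."},
]
prompt = build_cited_prompt("Why won't my Dyson hair dryer turn on?", snippets)
print(prompt)  # passing this to call_llm would yield an answer with [1]-style citations
```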
In addition to Bard, Google has other state-of-the-art LLMs, some through its subsidiary DeepMind, that it has not made publicly available, including Flan-PaLM, Chinchilla, Sparrow, and LaMDA. Crucially, it has yet to integrate these models into Google search. Meanwhile, Microsoft, a major investor in OpenAI, the company that makes ChatGPT, is integrating ChatGPT with Bing to wide praise.
The decline of useful, quality information on the internet has implications for large language models. Research suggests that the supply of high-quality training data, rather than model size, is becoming the bottleneck to scaling language models while maintaining performance, and OpenAI is enlisting teams to crawl for more data. A reduced amount of good-quality web content could slow the improvement of LLMs. The ranking algorithms that power both traditional and LLM-based search engines also depend on ranking signals: data about what users hover over, search for, and click on. If users no longer need to interact with websites, the quality of those algorithms will suffer. Moreover, a language-model-based search would further cut into ad revenue, disincentivize website creation, and weaken the internet ecosystem, deepening the vicious cycle.
The declining quality of Google search, along with the advent of technologies like large language models, has allowed some to imagine a world where Google’s dominance is broken by a scrappy challenger. But Google’s vast distribution, near-monopoly status, and unchallenged tech and research abilities mean that it is well poised to adopt and integrate new technologies before competitors.
Even as Bing offers AI-aided search with its new feature BingGPT, it remains an open question whether that will be enough to convince consumers to switch. Google developed its own AI-accelerator chip, called the Tensor Processing Unit (TPU), to aid in machine learning and neural net development, and it continues to develop and release its own large language models that surpass the capabilities of many competitors.
Google still generates nearly 60% of its revenue through search. If competition from Bing forces Google to shift to language-model-based search, Google could be confronted with higher costs, lower ad revenue, and fewer ranking signals, all while the larger ecosystem of the web continues to crumble.
FAQs about ChatGPT
Below we have summarized the most important questions and answers on the subject:
Can ChatGPT replace Google search?
While Google is better suited for “head” queries such as sports, news, people, and local information, ChatGPT is better for “tail” queries like “Where can I find the best toaster?” or “Help me solve this code problem.” Search is largely controlled by distribution and, for now, Google has greater distribution than Bing.
What is Google's version of ChatGPT?
Google’s version of ChatGPT is its Bard chatbot. Google also has other state-of-the-art large language models that are not available to the public and have not yet been integrated into its search, as Bing has done with ChatGPT. Bard works differently from ChatGPT: it ranks various generated answers against each other, solves math problems, and looks up factual data.
Are Google search results getting worse?
Yes, at least for certain queries. Users and industry experts have noticed a sharp decline in the quality of results, which are now more likely to include ads, SEO spam, and duplicate results. Increasingly, AI-generated content is also affecting search results. Organizationally, Google is focusing less on web ranking and more on “knowledge.”
About the Author
Post by: Debarghya (Deedy) Das
Debarghya (Deedy) Das is on the founding team of Glean, a unicorn enterprise search startup, where he leads several engineering teams in search and intelligence. Prior to that, he built search products for Google across New York, Tel Aviv, and Bangalore. He writes about the technology and the tech industry on his blog and Twitter and has been featured in global news publications. He is also an independent tech consultant and active angel investor.
Company: Glean
Website: www.glean.com
Connect with me on LinkedIn and X.