What Do We Gain When ChatGPT Replaces Google Search? Efficiency, Experience, and Hidden Traps

A recent study systematically compared how people behave and perform when they search for information with ChatGPT versus Google, and its conclusions challenge popular assumptions about how AI will reshape information access. The researchers found that when participants used ChatGPT to complete information-seeking tasks, they spent nearly 40% less time on average than those using Google, while the final quality of task completion showed no significant difference.

That leap in efficiency sounds revolutionary. Yet the paper, "ChatGPT vs. Google: A Comparative Study of Search Performance and User Experience" (arxiv.org/abs/2307.01135), also exposes the cost of that speed: for tasks that require rigorous fact checking, ChatGPT performs far worse than traditional search engines and will even confirm and repeat errors supplied by users. This is more than a question of which technology is superior; it forces us to reconsider how we relate to information.

A Step-Change in Efficiency and Experience

The researchers ran a randomized online experiment with 95 participants split into two groups. One group used a tool that simulated ChatGPT and the other a tool that simulated Google Search, and each completed three information-retrieval tasks. The data makes the efficiency gap obvious.

Figure 1: ChatGPT tool interface used in the experiment

Figure 2: Google Search tool interface used in the experiment

Metric	ChatGPT	Google Search	Difference
Average completion time	11.35 minutes	18.75 minutes	About 39% less time with ChatGPT
Information quality score	5.90	4.62	+1.28
Usefulness score	6.19	5.30	+0.89
Enjoyment score	5.87	4.74	+1.13
Satisfaction score	6.06	5.27	+0.79

Whether the task was a straightforward factual question or a more complex compilation of websites, the ChatGPT group finished faster.

This speed advantage stems from a fundamental difference in interaction paradigms. Google serves up a list of links and leaves the cognitive load of filtering, summarizing, and synthesizing to the user, who must jump across multiple pages, read, compare, and eventually assemble an answer. ChatGPT, by contrast, delivers an integrated and fluent "final answer," skipping the tedious middle steps. As a result, users perceive the information quality to be higher (5.90 vs. 4.62) and find the process more useful, enjoyable, and satisfying. Across usefulness (6.19 vs. 5.30), enjoyment (5.87 vs. 4.74), and satisfaction (6.06 vs. 5.27), ChatGPT clearly scores better.
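
The size of the time gap depends on which group is taken as the baseline, so it helps to work through the arithmetic. The following is a minimal sketch that uses only the mean values reported in the table above; the function and variable names are illustrative and do not come from the study.

```python
# Mean values copied from the summary table above.
CHATGPT_MINUTES = 11.35
GOOGLE_MINUTES = 18.75

# Perceived-experience ratings as (ChatGPT, Google Search) pairs.
RATINGS = {
    "information quality": (5.90, 4.62),
    "usefulness": (6.19, 5.30),
    "enjoyment": (5.87, 4.74),
    "satisfaction": (6.06, 5.27),
}


def percent_less_time(fast: float, slow: float) -> float:
    """Time saved by the faster group, relative to the slower group."""
    return (slow - fast) / slow * 100


def percent_more_time(fast: float, slow: float) -> float:
    """Extra time spent by the slower group, relative to the faster group."""
    return (slow - fast) / fast * 100


def rating_gaps(ratings: dict) -> dict:
    """ChatGPT rating minus Google Search rating, rounded to two decimals."""
    return {name: round(a - b, 2) for name, (a, b) in ratings.items()}


if __name__ == "__main__":
    less = percent_less_time(CHATGPT_MINUTES, GOOGLE_MINUTES)
    more = percent_more_time(CHATGPT_MINUTES, GOOGLE_MINUTES)
    print(f"ChatGPT group: {less:.1f}% less time")
    print(f"Google group: {more:.1f}% more time")
    for name, gap in rating_gaps(RATINGS).items():
        print(f"{name}: +{gap:.2f}")
```

Both percentages describe the same 7.4-minute gap: roughly 39.5% less time for the ChatGPT group, or equivalently roughly 65.2% more time for the Google group.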

Task Type Determines Which Tool Wins

Although the overall performance is comparable, the strengths and weaknesses of each tool emerge when you zoom in on specific task types—arguably the most insightful part of the study.

The first task asked participants to "find the name and age of the first woman in space," a classic fact-retrieval question. Every participant in the ChatGPT group earned a perfect score of 10. Google users could find the correct answer on the first page of results, yet distracting information—such as other female astronauts mentioned on the page—led some to make mistakes, for an average score of just 8.19. ChatGPT avoided human filtering errors by directly presenting the answer: "Valentina Tereshkova, 26 years old."

Figure 3: Performance comparison for Task 1 (fact retrieval) across different education levels

The third task exposed ChatGPT's Achilles' heel: fact checking. Participants were given a short passage containing incorrect statements to verify. One false claim read, "The 2009 Copenhagen UN Climate Change Conference was held from December 7 to 15." When users asked ChatGPT to confirm the statement, it usually responded, "This statement is true." In reality, the conference ended on December 18. Interestingly, if the user rephrased the query in a more neutral way—"When was the 2009 UN Climate Change Conference held?"—ChatGPT returned the correct answer.
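
The contrast between the two phrasings can be made concrete. The sketch below is purely illustrative: the two helper functions and their wording templates are hypothetical, they are not the prompts used in the study, and no chatbot is actually called.

```python
def confirmation_prompt(claim: str) -> str:
    """Leading framing: embeds the claim and asks the model to confirm it."""
    return f'Is the following statement true? "{claim}"'


def neutral_prompt(event: str) -> str:
    """Neutral framing: asks an open question without supplying the claimed dates."""
    return f"When was the {event} held?"


# The false claim quoted in the passage above.
claim = (
    "The 2009 Copenhagen UN Climate Change Conference "
    "was held from December 7 to 15."
)

print(confirmation_prompt(claim))
print(neutral_prompt("2009 UN Climate Change Conference"))
```

The neutral version withholds the incorrect end date, so the model has no user-supplied claim to go along with.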

Figure 4: Performance comparison for Task 3 (fact checking) across different education levels

Task type	ChatGPT average score	Google Search average score	Performance gap
Fact retrieval (first female astronaut)	10.00	8.19	ChatGPT clearly better
Website list compilation (booking flights)	Comparable	Comparable	Roughly equal
Fact checking (verifying misinformation)	5.83	8.37	Google clearly better

This highlights an intrinsic flaw in generative AI: it tends to "go along" with the context in a user's prompt instead of rigorously cross-checking information. In fact-checking scenarios, that trait is fatal. Even more troubling, the study found that users relying on ChatGPT were more prone to "over-reliance": 70.8% of participants took ChatGPT's incorrect answer at face value and were reluctant to validate or correct it.

The Mirage and Reality of "Information Equality"

Another striking finding is the so-called "leveling effect." With Google Search, user performance correlated with education level—participants with stronger educational backgrounds were better at leveraging the search engine for complex tasks. Traditional information retrieval is, in some sense, a skill one must learn and practice.

ChatGPT nearly erased that difference. Regardless of education background, participants achieved similar scores when they used the chatbot. On the bright side, AI lowers the barrier to high-quality information and promotes information equity. Yet paired with ChatGPT's poor performance on fact checking, that same "equity" hides risk. When a tool offers everyone a seemingly authoritative, effort-free answer, it may erode critical thinking and information literacy.

How We Should Choose Between Tools

The study is not trying to crown a winner between ChatGPT and Google; rather, it gives us a clear usage guide to make smarter choices in different scenarios.

Use case	Recommended tool	Rationale
Quickly retrieving straightforward facts or concept explanations	ChatGPT	Efficient synthesis with ready-made answers
Brainstorming and drafting	ChatGPT	Great for ideation and outlining
Summarizing and consolidating information	ChatGPT	Integrates complex sources into digestible overviews
Rigorous fact checking and verification	Google Search	Requires precise, trustworthy primary sources
Seeking specific or time-sensitive information	Google Search	More targeted with fresher updates
Exploring diverse viewpoints or doing deep research	Google Search	Necessitates multiple independent sources
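
For readers who prefer to see the guide as data, the table can be expressed as a simple lookup. This is only a sketch: the short use-case keys and the fallback for unlisted cases are assumptions made for illustration, not something the study defines.

```python
# Recommendations transcribed from the table above.
# The short keys are hypothetical labels chosen for this sketch.
RECOMMENDED_TOOL = {
    "quick_facts": "ChatGPT",
    "brainstorming": "ChatGPT",
    "summarizing": "ChatGPT",
    "fact_checking": "Google Search",
    "time_sensitive": "Google Search",
    "deep_research": "Google Search",
}

# Assumed fallback for use cases the table does not list.
FALLBACK = "ChatGPT first, then verify with Google Search"


def recommend(use_case: str) -> str:
    """Return the tool the table recommends, or the assumed fallback."""
    return RECOMMENDED_TOOL.get(use_case, FALLBACK)


print(recommend("fact_checking"))
print(recommend("brainstorming"))
```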

When should you turn to ChatGPT and other generative AI tools?

Quickly retrieving straightforward facts and explanations: If you need to learn about a person, event, or scientific concept on the fly, ChatGPT is an efficient choice.
Brainstorming and drafting content: For ideation, outlining, or writing a first draft, ChatGPT offers a strong starting point.
Summarizing and building initial understanding: When you want a high-level overview of an unfamiliar domain, the model can unify disparate sources into an accessible summary.

When should you still rely on Google and other traditional search engines?

Serious fact checking and verification: Whenever precision and credible sourcing matter, use traditional search engines to trace information back to its origin.
Finding specific, time-sensitive details: For example, locating the official site of a product, the latest software version, or concrete flight information. In the study's second task—curating booking websites—Google provided more targeted results.
Researching diverse perspectives and depth: For controversial topics or academic work, only multiple independent sources can create a well-rounded understanding.

The most effective approach is likely a blend of both. Use ChatGPT to quickly map the contours of a problem, then take the key points into Google for targeted validation. Maintain a healthy skepticism during AI interactions, and pose neutral, open-ended questions whenever possible. That mindset may be the most vital survival skill in our new information era.
