
What Do We Gain When ChatGPT Replaces Google Search? Efficiency, Experience, and Hidden Traps
A study reveals the real-world gap between searching with ChatGPT and Google. ChatGPT delivers major efficiency gains and a better user experience, but falters on critical fact-checking tasks—offering hard lessons for how we should adopt next-generation information tools.
A recent study systematically compared how people behave and perform when they search for information with ChatGPT versus Google, and its conclusions challenge popular assumptions about how AI will reshape information access. The researchers found that when participants used ChatGPT to complete information-seeking tasks, they spent nearly 40% less time on average than those using Google, while the final quality of task completion showed no significant difference.
That leap in efficiency sounds revolutionary. Yet the paper, "ChatGPT vs. Google: A Comparative Study of Search Performance and User Experience" (arxiv.org/abs/2307.01135), also exposes the cost of that speed: for tasks that require rigorous fact checking, ChatGPT performs far worse than traditional search engines and will even confirm and repeat errors supplied by users. This is more than a question of which technology is superior; it forces us to reconsider how we find, trust, and verify information.
A Step-Change in Efficiency and Experience
The researchers ran a randomized online experiment with 95 participants split into two groups. Each group used a tool that simulated ChatGPT or Google Search to complete three information-retrieval tasks. The data makes the efficiency gap obvious.
Figure 1: ChatGPT tool interface used in the experiment
Figure 2: Google Search tool interface used in the experiment
| Metric | ChatGPT | Google Search | Difference |
|---|---|---|---|
| Average completion time | 11.35 minutes | 18.75 minutes | 7.40 minutes less (about 39% less time; Google took about 65% longer) |
| Information quality score | 5.90 | 4.62 | +1.28 |
| Usefulness score | 6.19 | 5.30 | +0.89 |
| Enjoyment score | 5.87 | 4.74 | +1.13 |
| Satisfaction score | 6.06 | 5.27 | +0.79 |
Whether the task was a straightforward factual question or a more complex compilation of websites, the ChatGPT group finished faster.
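The two percentages attached to the completion-time gap (nearly 40% and 65%) come from the same pair of averages, divided by different baselines. The short Python sketch below reproduces both from the table values; the variable names are illustrative and not taken from the paper.

```python
# Average completion times reported in the table above (minutes).
chatgpt_minutes = 11.35
google_minutes = 18.75

# Time saved by the ChatGPT group, relative to the Google baseline.
time_saved = (google_minutes - chatgpt_minutes) / google_minutes

# Extra time needed by the Google group, relative to the ChatGPT baseline.
extra_time = (google_minutes - chatgpt_minutes) / chatgpt_minutes

print(f"ChatGPT group used {time_saved:.1%} less time")    # 39.5%
print(f"Google group needed {extra_time:.1%} more time")   # 65.2%
```

Both statements describe the same 7.40-minute gap. "Nearly 40% less time" uses Google's average as the baseline, while "65%" uses ChatGPT's.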
This speed advantage stems from a fundamental difference in interaction paradigms. Google serves up a list of links and leaves the cognitive load of filtering, summarizing, and synthesizing to the user, who must jump across multiple pages, read, compare, and eventually assemble an answer. ChatGPT, by contrast, delivers an integrated and fluent "final answer," skipping the tedious middle steps. As a result, users rated ChatGPT's information quality higher (5.90 vs. 4.62) and rated the process as more useful (6.19 vs. 5.30), more enjoyable (5.87 vs. 4.74), and more satisfying (6.06 vs. 5.27).
Task Type Determines Which Tool Wins
Although the overall performance is comparable, the strengths and weaknesses of each tool emerge when you zoom in on specific task types—arguably the most insightful part of the study.
The first task asked participants to "find the name and age of the first woman in space," a classic fact-retrieval question. Every participant in the ChatGPT group earned a perfect score of 10. Google users could find the correct answer on the first page of results, yet distracting information—such as other female astronauts mentioned on the page—led some to make mistakes, for an average score of just 8.19. ChatGPT avoided human filtering errors by directly presenting the answer: "Valentina Tereshkova, 26 years old."
Figure 3: Performance comparison for Task 1 (fact retrieval) across different education levels
The third task exposed ChatGPT's Achilles' heel: fact checking. Participants were given a short passage containing incorrect statements to verify. One false claim read, "The 2009 Copenhagen UN Climate Change Conference was held from December 7 to 15." When users asked ChatGPT to confirm the statement, it usually responded, "This statement is true." In reality, the conference ended on December 18. Interestingly, if the user rephrased the query in a more neutral way—"When was the 2009 UN Climate Change Conference held?"—ChatGPT returned the correct answer.
Figure 4: Performance comparison for Task 3 (fact checking) across different education levels
| Task type | ChatGPT average score | Google Search average score | Performance gap |
|---|---|---|---|
| Fact retrieval (first female astronaut) | 10.00 | 8.19 | ChatGPT clearly better (+1.81) |
| Website list compilation (booking flights) | Comparable | Comparable | Roughly equal |
| Fact checking (verifying misinformation) | 5.83 | 8.37 | Google clearly better (+2.54) |
This highlights an intrinsic flaw in generative AI: it tends to "go along" with the context in a user's prompt instead of rigorously cross-checking information. In fact-checking scenarios, that trait is fatal. Even more troubling, the study found that users relying on ChatGPT were more prone to "over-reliance": 70.8% of participants took ChatGPT's incorrect answer at face value and were reluctant to validate or correct it.
The Mirage and Reality of "Information Equality"
Another striking finding is the so-called "leveling effect." With Google Search, user performance correlated with education level—participants with stronger educational backgrounds were better at leveraging the search engine for complex tasks. Traditional information retrieval is, in some sense, a skill one must learn and practice.
ChatGPT nearly erased that difference. Regardless of education background, participants achieved similar scores when they used the chatbot. On the bright side, AI lowers the barrier to high-quality information and promotes information equity. Yet paired with ChatGPT's poor performance on fact checking, that same "equity" hides risk. When a tool offers everyone a seemingly authoritative, effort-free answer, it may erode critical thinking and information literacy.
How We Should Choose Between Tools
The study is not trying to crown a winner between ChatGPT and Google; rather, it gives us a clear usage guide to make smarter choices in different scenarios.
| Use case | Recommended tool | Rationale |
|---|---|---|
| Quickly retrieving straightforward facts or concept explanations | ChatGPT | Efficient synthesis with ready-made answers |
| Brainstorming and drafting | ChatGPT | Great for ideation and outlining |
| Summarizing and consolidating information | ChatGPT | Integrates complex sources into digestible overviews |
| Rigorous fact checking and verification | Google Search | Requires precise, trustworthy primary sources |
| Seeking specific or time-sensitive information | Google Search | More targeted with fresher updates |
| Exploring diverse viewpoints or doing deep research | Google Search | Necessitates multiple independent sources |
When should you turn to ChatGPT and other generative AI tools?
- Quickly retrieving straightforward facts and explanations: If you need to learn about a person, event, or scientific concept on the fly, ChatGPT is an efficient choice.
- Brainstorming and drafting content: For ideation, outlining, or writing a first draft, ChatGPT offers a strong starting point.
- Summarizing and building initial understanding: When you want a high-level overview of an unfamiliar domain, the model can unify disparate sources into an accessible summary.
When should you still rely on Google and other traditional search engines?
- Serious fact checking and verification: Whenever precision and credible sourcing matter, use traditional search engines to trace information back to its origin.
- Finding specific, time-sensitive details: For example, locating the official site of a product, the latest software version, or concrete flight information. In the study's second task—curating booking websites—Google provided more targeted results.
- Researching diverse perspectives and depth: For controversial topics or academic work, only multiple independent sources can create a well-rounded understanding.
The most effective approach is likely a blend of both. Use ChatGPT to quickly map the contours of a problem, then take the key points into Google for targeted validation. Maintain a healthy skepticism during AI interactions, and pose neutral, open-ended questions whenever possible. That mindset may be the most vital survival skill in our new information era.