Photo by Aidin Geranrekab on Unsplash

Is ChatGPT More Creative Than One Human?

A groundbreaking study, “An empirical investigation of ChatGPT’s impact on creativity,” was published in Nature Human Behaviour (Lee & Chung, 2024).

The researchers set out to determine whether ChatGPT is more creative than humans. To do this, they designed six different challenges, including these three:

  1. Repurposing a tennis racket and a garden hose (Study 2a)
  2. Creating a toy using a brick and a fan (Study 2b)
  3. Repurposing a flashlight and hairspray (Study 5)

In total, between 100 and 200 participants took part in the six creativity challenges. Similar to the “Your Brain on ChatGPT” study, the researchers divided the participants into three groups:

  1. Those who worked solo with no assistance,
  2. Those who used a web search (Google) for inspiration, and
  3. A third group that consulted ChatGPT (GPT-3.5) while brainstorming.

In each scenario, participants were asked to present one idea as their final answer. Afterward, independent evaluators assessed each idea, judging its creativity by considering both its originality and usefulness.

However, the original study had a fundamental flaw: it treated each participant as an independent unit without considering the impact of multiple users. This raises the question: What happens when several people use the AI tool simultaneously? Will participants produce a similar range of ideas, or will they generate a diverse array?

Instead of asking, “Does ChatGPT enhance the creativity of individual ideas?” the next group of researchers, Meincke et al., investigated a different question: “Does ChatGPT promote greater diversity in groups of ideas?”


This study, titled “ChatGPT Decreases Idea Diversity in Brainstorming,” was conducted in 2025 by researchers at The Wharton School of the University of Pennsylvania. The researchers began by questioning the definition of originality. They noted that originality is not only about how unique an individual idea is but also about how similar or different the ideas in a pool are to one another.

The study included a task that involved repurposing unused household items, specifically an old tennis racket and a garden hose. To illustrate the difference between evaluating individual ideas and assessing them collectively, the researchers provided examples.

In one set of ideas, participants suggested reusing hoses as “lawn sprinkler,” “water sprinkler,” and “garden sprinkler.” In another, human-generated ideas included “garden furniture,” “storage rack,” and “picture mount.”

The first set of ideas, while potentially original when considered individually, ultimately revolved around the same concept: sprinklers. The second set, in contrast, consisted of ideas that were not only novel on their own but also conceptually distinct from one another.
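The contrast between the two sets can be made concrete with a rough similarity score. The sketch below is only an illustration: it uses word-level Jaccard overlap as a simple stand-in for whatever similarity measure the researchers actually used, and the function names `jaccard` and `mean_pairwise_similarity` are hypothetical. The idea lists are the examples quoted above.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two short idea descriptions."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def mean_pairwise_similarity(ideas: list[str]) -> float:
    """Average similarity over every unordered pair of ideas in the pool."""
    pairs = list(combinations(ideas, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Example ideas quoted in the article.
sprinkler_set = ["lawn sprinkler", "water sprinkler", "garden sprinkler"]
varied_set = ["garden furniture", "storage rack", "picture mount"]

print(round(mean_pairwise_similarity(sprinkler_set), 2))  # 0.33
print(round(mean_pairwise_similarity(varied_set), 2))     # 0.0
```

A higher score means the pool is more homogeneous. Word overlap is crude (it would miss synonyms such as “sprayer” and “sprinkler”), which is one reason to prefer a measure based on meaning rather than on shared words.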

This phenomenon was precisely what the researchers observed in brainstorming sessions that involved ChatGPT. Although participants’ AI-assisted ideas were often deemed creative when evaluated individually, the collective pool of ideas tended to cluster around the same themes. For instance, in the tennis racket and hose challenge, many ideas were centered on “sprinklers.”

In Study 2a, which focused on repurposing the tennis racket and garden hose, 20 out of 96 ChatGPT-assisted ideas included the term “sprinkler.” By comparison, only 7 ideas in the web search group and 12 in the group that brainstormed without AI included that term. In other words, approximately 21% of all ChatGPT-assisted responses centered on the same basic concept, a sign of lower diversity across the pool.
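The 21% figure is simple arithmetic, and the same check can be run on any pool of ideas. The helper names below (`keyword_share`, `count_with_keyword`) are hypothetical, and the sample list simply reuses example phrases quoted in this article rather than the study's data. Only the ChatGPT group's share is computed, because the text gives no group totals for the web search and no-AI conditions.

```python
def keyword_share(count_with_keyword: int, total_ideas: int) -> float:
    """Percentage of ideas in a pool that mention a given keyword."""
    if total_ideas <= 0:
        raise ValueError("total_ideas must be positive")
    return 100 * count_with_keyword / total_ideas


def count_with_keyword(ideas: list[str], keyword: str) -> int:
    """Count ideas whose text contains the keyword (case-insensitive)."""
    return sum(keyword.lower() in idea.lower() for idea in ideas)


# Figures reported for the ChatGPT group in Study 2a: 20 of 96 ideas.
print(f"{keyword_share(20, 96):.1f}%")  # 20.8%

# Illustrative pool built from the example phrases quoted in the article.
sample_ideas = [
    "lawn sprinkler", "water sprinkler", "garden sprinkler",
    "garden furniture", "storage rack", "picture mount",
]
hits = count_with_keyword(sample_ideas, "sprinkler")
print(hits, f"{keyword_share(hits, len(sample_ideas)):.0f}%")  # 3 50%
```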

The analysis of Study 2b reveals even more striking results. Participants were tasked with creating a toy using a brick and a fan. This challenge showed the most pronounced lack of diversity in the entire research project.

Only 6% of the ideas generated solely by ChatGPT were unique, whereas 41.2% of the ideas were unique when ChatGPT was used in combination with human input. In contrast, ideas generated exclusively by humans were 100% unique!

From a traditional brainstorming perspective, that much overlap defeats the purpose.

The main value of brainstorming lies in generating a diverse array of ideas, which allows us to explore a wide range of perspectives.

If everyone arrives with variations of the same idea, it’s likely that they won’t be invited back for future brainstorming sessions. As the authors aptly point out,

The figure below illustrates that the latest LLMs, when they work through a chain of thought, produce increasingly diverse ideas, approaching the variety of ideas that humans come up with.

Beyond this, they also found that different LLMs would likely produce different answers.

More Evidence: Does AI Help or Hinder Creative Diversity?

The researchers also analyzed the similarity of the stories using embedding-based metrics. They found that stories from the AI-aided groups ended up more similar to each other than those from the non-AI group.
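To give a feel for what an embedding-based similarity metric does, here is a minimal sketch. It assumes each story has already been turned into a numeric vector by some embedding model. The tiny three-dimensional vectors below are hypothetical and only for illustration, and mean pairwise cosine similarity is one common choice, not necessarily the exact metric the researchers used.

```python
import math
from itertools import combinations


def cosine_similarity(u: list[float], v: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(vectors: list[list[float]]) -> float:
    """Average cosine similarity over every unordered pair of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_similarity(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy "embeddings": real ones have hundreds of dimensions.
clustered = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0], [1.0, 0.0, 0.1]]
spread = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# The clustered pool scores close to 1, the spread pool scores 0.
print(mean_pairwise_cosine(clustered) > mean_pairwise_cosine(spread))  # True
```

A pool of stories whose vectors point in nearly the same direction scores close to 1, which is the pattern of “more similar to each other” described above.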

When writers were allowed only one AI-generated idea, their stories were 10.7% more similar to one another than stories written with no AI at all. The screenshot below shows this shift toward less diverse ideas.

If the publishing industry embraced generative AI widely, the collective pool of stories would likely become more homogeneous and less unique in aggregate.