The 4As, IAA and more leading industry groups teamed up with Springboards to compare how top AI tools like ChatGPT, Gemini and Claude perform on creative tasks
New York, NY – October 21, 2025 – A comprehensive new study by Springboards, an AI platform inspiring creativity in advertising, found that popular AI tools like ChatGPT, Gemini, Claude and others perform much more similarly on creative tasks than many people think. Creativity Benchmark, conducted in collaboration with the 4As, ACA, APG, D&AD, IAA, IPA, and The One Club for Creativity, challenges the idea that there's a single "best" AI tool for creative work and shows agencies need more efficient ways to test AI tools for their specific needs.
Sixteen different AI systems – from OpenAI, Google, Anthropic, Meta, DeepSeek, Alibaba and others – were tested on real marketing challenges across 100 notable brands. Over 600 creative professionals from ad agencies, marketing teams, and strategy firms made over 11,000 comparisons to see which systems performed best. The biggest surprise? There was no clear winner. The differences between the "best" and "worst" AI tools were much smaller than expected.
"Everyone assumes some AI tools are way better than others for creative work," said Pip Bingemann, CEO and co-founder of Springboards. "But our tests showed the results were pretty close. Why? Because these models are machines designed to recognize patterns and give you the most probable answer, and 'probable' has never been called 'creative.' Keeping humans in the loop and optimizing for a wider range of varied ideas is crucial."
The study looked at three types of creative challenges: finding surprising insights about consumers, creating big campaign ideas, and coming up with bold, attention-grabbing concepts.
Key Findings:
"LLMs aren't a one-size-fits-all solution; they're general purpose tools that require human creativity to unlock breakthrough outcomes," said Jeremy Lockhorn, SVP, Creative Technologies & Innovation, 4As. "These findings suggest agencies and brands should continue to evaluate which models are best suited for creative work, and that a multi-model approach may well be the best path forward."
"This study highlights that creativity isn't about which AI you use, it's about how you use it," remarked Tony Hale, CEO, Advertising Council Australia. "The results reinforce what we see across the industry: the human spark remains essential to transforming good ideas into great ones. For agencies, the real opportunity is learning how to collaborate with these systems to expand, not replace, creative thinking."
Methodology
The study involved 678 advertising professionals of diverse backgrounds, who participated in blind A/B idea judgments, likened to a "Tinder for Ideas." The data, collected over four weeks starting June 10, 2025, comprised 11,012 human comparisons across various brands, prompts, and models. The comparisons were analyzed using Bradley-Terry modeling, with cosine distance used for diversity scoring.
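For readers unfamiliar with the two techniques named above, the following is a minimal illustrative sketch, not the study's actual code. It fits Bradley-Terry strengths from pairwise wins using the standard minorization-maximization update, and scores diversity as the mean pairwise cosine distance between vectors. The function names, the toy comparison counts, and the toy vectors are all hypothetical. How the study represented ideas as vectors is not described here, so the vectors below are placeholders.

```python
import math
from itertools import combinations


def fit_bradley_terry(n_items, comparisons, iters=200):
    """Estimate Bradley-Terry strengths from (winner, loser) index pairs
    using the standard minorization-maximization (MM) update."""
    wins = [0] * n_items
    pair_counts = {}
    for winner, loser in comparisons:
        wins[winner] += 1
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    strengths = [1.0 / n_items] * n_items
    for _ in range(iters):
        denom = [0.0] * n_items
        for (i, j), n_ij in pair_counts.items():
            share = n_ij / (strengths[i] + strengths[j])
            denom[i] += share
            denom[j] += share
        # Items with no comparisons keep their current strength.
        updated = [
            wins[i] / denom[i] if denom[i] > 0 else strengths[i]
            for i in range(n_items)
        ]
        total = sum(updated)
        strengths = [s / total for s in updated]  # normalize to sum to 1
    return strengths


def win_probability(strengths, i, j):
    """Modeled probability that item i is preferred over item j."""
    return strengths[i] / (strengths[i] + strengths[j])


def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(vectors):
    """Average cosine distance over all pairs; higher means more varied."""
    pairs = list(combinations(vectors, 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy data: three systems (0, 1, 2), ten blind judgments per pair.
toy_comparisons = (
    [(0, 1)] * 7 + [(1, 0)] * 3
    + [(1, 2)] * 6 + [(2, 1)] * 4
    + [(0, 2)] * 8 + [(2, 0)] * 2
)
strengths = fit_bradley_terry(3, toy_comparisons)

# Hypothetical idea vectors: two near-duplicates and one different direction.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
diversity = mean_pairwise_distance(toy_vectors)
```

In this sketch, a small gap between fitted strengths corresponds to win probabilities close to 50 percent, which is one way to read a result where no system clearly wins.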
The research used four different ways to test AI creativity:
All tests used the same settings and compared current AI systems from companies like OpenAI, Google, Anthropic, and Meta.
To access the full research white paper, visit https://arxiv.org/abs/2509.09702.
If you'd like to learn more about the results, visit this page. To access the original research, visit creativitybenchmark.ai.
About Springboards
Springboards is an AI-powered platform built to inspire creativity in advertising. The platform empowers teams to explore more ideas, without sacrificing the craft of great work. Founded by industry veterans Pip Bingemann, Amy Tucker, and Kieran Browne, Springboards has already partnered with 150+ agencies globally and secured $3 million USD in seed funding from Blackbird Ventures. For more information, visit Springboards or contact hello@springboards.ai.