
A new study from UC San Diego found that advanced AI models like GPT-4.5 are now so convincing in conversation that they can pass as human 73% of the time.
In modern Turing Tests, people often choose the AI over real humans in five-minute chats, particularly when the AI uses a “PERSONA” prompt to sound more lifelike.
Experts say these bots could soon handle roles in customer service, online companionship, and beyond, raising questions about how we connect in a world of human-like machines.
According to lead researcher Cameron Jones, GPT‑4.5 with a strategic “PERSONA” prompt achieved a win rate of 73%, meaning that in five-minute chat sessions the AI system was identified as the human more often than the actual human was.
Llama‑3.1‑405B also crossed the 50% chance threshold (albeit at a lower 56% win rate) when similarly prompted to adopt a specific persona.
By contrast, GPT‑4o, the model presumably powering today’s widely used ChatGPT, managed only a 21% success rate under minimal instructions. These results have reignited the debate about whether Turing’s imitation game is still a meaningful measure of human-like intelligence or whether it mostly underscores modern AI’s ability to imitate human conversation.
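To make the metric concrete: a win rate of this kind can be read as the fraction of trials in which the interrogator picked the AI as the human. The minimal sketch below is purely illustrative, with made-up data and a hypothetical `win_rate` helper, not the study’s actual scoring code.

```python
# Hypothetical illustration of tallying a Turing-test "win rate".
# Each trial records whether the interrogator judged the AI witness
# to be the human. A win rate above 50% means the AI was picked as
# "the human" more often than the actual human participant was.

def win_rate(verdicts: list[bool]) -> float:
    """Fraction of trials in which the AI was judged to be the human."""
    return sum(verdicts) / len(verdicts)

# Made-up example data, not the study's results:
trials = [True, True, False, True]  # AI chosen as human in 3 of 4 chats
print(f"win rate: {win_rate(trials):.0%}")  # -> win rate: 75%
```

Under this reading, GPT‑4.5’s 73% means it was mistaken for the human in roughly three out of four sessions.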
Alan Turing, the British mathematician and computer scientist, first proposed his imitation game in 1950 as a thought experiment.
If an interrogator could not reliably tell the difference between a human and a hidden machine in text-based conversation, Turing reasoned, the machine might be said to “think.”
Generations of AI enthusiasts have used the Turing Test as a yardstick, albeit one that was initially more philosophical than technical. Yet over the decades, multiple chatbots have been claimed to “pass” the Turing Test, often with disclaimers.