Today, OpenAI released GPT-4.5: the latest and largest frontier language model available, and a significant update to ChatGPT. While it does not match the benchmark scores of reasoning models such as DeepSeek R1 and OpenAI o3, GPT-4.5 is the biggest and most anticipated model release of the year so far, and we are excited to test it out. OpenAI claims large improvements in writing quality, and hot takes on its performance are already all over social media.
We wanted to answer a question many people are asking: as the models get better, can we still detect AI-generated text from GPT-4.5? We ran a quick test today to find out.
We started by sampling 11 prompts representative of the everyday writing tasks one might give ChatGPT.
Here are the prompts we used:
We tried to make the prompts as diverse as possible. We also tried to write prompts that would bring out as large a qualitative difference from previous GPT models as possible: in other words, wherever there was an opportunity for the model to be creative and show off the "wow" factor, we did our best to give GPT-4.5 that opportunity.
| Prompt | Pangram | Leading Competitor 1 | Leading Competitor 2 |
|---|---|---|---|
| Koala Conservation | 100% | 100% | 100% |
| Newspaper Email | 100% | 100% | 67% |
| Room Temperature Semiconductor | 100% | 56% | 86% |
| School Uniforms | 85% | 100% | 80% |
| Poetry Diary | 100% | 100% | 15% |
| Escape Room Review | 100% | 81% | 56% |
| Russian Film Email | 100% | 100% | 91% |
| Mars Landing Scene | 100% | 43% | 7% |
| Komodo Dragon Script | 98% | 88% | 0% |
| Halloween Breakup Poem | 100% | 100% | 0% |
| Venice Chase Scene | 100% | 49% | 9% |
Pangram detects all 11 GPT-4.5-written essays, even without any GPT-4.5 data in its training set. By comparison, two leading AI detection competitors present spotty results at best. While Pangram confidently scores 10 of the 11 samples at 98% or higher AI likelihood, the competition often expresses considerable uncertainty or, in the worst case, predicts with high confidence that the text is human-written.
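For readers who want to check these counts against the table, here is a minimal sketch that tallies how many samples each detector scored at or above a given AI-likelihood threshold. The scores are copied from the table above. The 50% cutoff used as a "flagged as AI" line is our assumption for illustration only, and the names `SCORES`, `DETECTORS`, and `count_at_or_above` are hypothetical helpers, not part of any Pangram API.

```python
# AI-likelihood scores (percent) copied from the results table above.
# Column order: Pangram, Leading Competitor 1, Leading Competitor 2.
SCORES = {
    "Koala Conservation": (100, 100, 100),
    "Newspaper Email": (100, 100, 67),
    "Room Temperature Semiconductor": (100, 56, 86),
    "School Uniforms": (85, 100, 80),
    "Poetry Diary": (100, 100, 15),
    "Escape Room Review": (100, 81, 56),
    "Russian Film Email": (100, 100, 91),
    "Mars Landing Scene": (100, 43, 7),
    "Komodo Dragon Script": (98, 88, 0),
    "Halloween Breakup Poem": (100, 100, 0),
    "Venice Chase Scene": (100, 49, 9),
}

DETECTORS = ("Pangram", "Leading Competitor 1", "Leading Competitor 2")


def count_at_or_above(threshold):
    """Return {detector: number of samples scored >= threshold percent}."""
    return {
        name: sum(1 for row in SCORES.values() if row[i] >= threshold)
        for i, name in enumerate(DETECTORS)
    }


if __name__ == "__main__":
    # 98 is the confidence level quoted in the text;
    # 50 is an assumed cutoff for "flagged as AI" (not from the article).
    for threshold in (98, 50):
        print(f">= {threshold}%:", count_at_or_above(threshold))
```

At the 98% level this gives 10 of 11 samples for Pangram, matching the figure quoted above, against 6 and 1 for the two competitors. Under the assumed 50% cutoff, the counts are 11, 9, and 6.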
Pangram is itself a large machine learning model that has seen millions of examples of both human-written and AI-generated text. Large models tend to generalize better and pick up on subtle patterns across AI-generated text that others are not able to catch. Our active learning approach further decreases our false positive rate while increasing our sensitivity, allowing the model to work well at scale and generalize to new LLMs much more effectively than our competitors. Additionally, our focus on data quality and diversity results in a model that is better at recognizing the finer-grained details that other models cannot pick up on.
Yes, our AI detection tool is still highly effective at detecting GPT-4.5 generated text.
So if you're wondering how well Pangram will do when a new, bigger, and better model comes out: Pangram has passed the test on the most anticipated AI release we have seen in a while, without any retraining at all. If you don't want your AI detection software to suddenly stop working the next time OpenAI updates its model, give Pangram a try today.
For more information on our research or free credits to trial our model on GPT-4.5, please contact us at info@pangram.com.

Elyas Masrour is a founding engineer at Pangram. Since joining Pangram as its second employee straight out of the University of Maryland, he has built out critical infrastructure such as the model serving API, role-based access controls, and supporting evidence pipelines. Elyas also works closely with the research team on projects like adversarial robustness, model interpretability, and heterogeneous mixed content detection. Outside of work, Elyas enjoys a wide range of human creativity and expression, including filmmaking, reading, and exploring the city.

Bradley is an AI researcher and expert in building deep learning products in industry. He recently led the deep learning research group at Absci, a generative AI drug discovery company, and previously was a member of the core computer vision team at Tesla Autopilot.
While a graduate student, Bradley authored multiple publications in deep learning research with the Stanford Vision Lab. He holds a B.S. in physics and an M.S. in artificial intelligence from Stanford. Aside from AI, he is also excited about education and philosophy, and is an avid golfer.






