How does Pangram compare against GPTZero?

Jan 22, 2026

Table of contents

Pangram vs. GPTZero: Published Numbers
What does the research show? Pangram vs. GPTZero
Multilingual Performance
ESL Performance
Unreleased Models and GPT-5
Conclusion

The AI detection market today has several large players you may have heard of: Pangram, GPTZero, Turnitin, ZeroGPT, and more. For a comprehensive overview of how these tools stack up, check out our guide to the best AI detectors currently available.

Many of these companies routinely update their models and publish performance numbers. Recently, GPTZero launched a summer model update and released new numbers for its detection of text from a variety of new models. In this blog post, we compare the performance of GPTZero's new model with Pangram's AI detection, including on the latest GPT-5 models.

Pangram vs. GPTZero: Published Numbers

Model	Pangram Detection Rate	GPTZero Detection Rate	Better Detector
GPT-5	99.81%	95.0%	Pangram
GPT-5-chat-latest	99.97%	Untested	N/A
GPT-5-mini	99.92%	92.2%	Pangram
GPT-5-nano	99.97%	96.1%	Pangram
GPT-OSS-120b	100.00%	Untested	N/A
GPT-OSS-20b	99.74%	Untested	N/A
GPT4.1	99.48%	96.8%	Pangram
GPT4.1-mini	99.94%	98.7%	Pangram
o3	99.86%	89.9%	Pangram
o3-mini	100.00%	98.4%	Pangram
Gemini 2.5 Pro	99.91%	95.7%	Pangram
Gemini 2.5 Flash	99.75%	98.2%	Pangram
Claude Sonnet 4	99.91%	99.1%	Pangram

Note: GPTZero does not release its internal evaluation datasets to the public, so these numbers were not measured on the same documents. GPTZero also does not disclose how many documents it tests on, so we cannot compare sample sizes either. For Pangram’s numbers, we evaluated thousands of documents per model across a wide variety of domains and prompt schemes to simulate real-world use.

Furthermore, accuracy is not just about flagging the most AI documents. Pangram is also the market leader in keeping false positive rates low. It’s a serious priority for us not to flag human-written documents as AI-generated. The table below compares the reported false positive rates for Pangram and GPTZero:

	Pangram	GPTZero
False Positive Rate (%)	0.01%	1%
False Positive Rate (#)	~1 in 10,000 documents	~1 in 100 documents

GPTZero’s own blog post on false positive rates reports a False Positive Rate (FPR) of 1%.
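
To make the gap concrete, the reported rates can be converted into an expected number of false flags for a given volume of human-written documents. The sketch below is illustrative arithmetic only: the two rates are the reported figures from the table above, while the batch size of 50,000 documents and the helper names are assumptions chosen for the example.

```python
def one_in_n(fpr_percent: float) -> float:
    """Convert a false positive rate in percent to the 'N' in '1 in N documents'."""
    return 100.0 / fpr_percent


def expected_false_flags(fpr_percent: float, num_human_docs: int) -> float:
    """Expected number of human-written documents wrongly flagged as AI."""
    return fpr_percent / 100.0 * num_human_docs


# Reported false positive rates, in percent, from the table above.
REPORTED_FPR_PERCENT = {"Pangram": 0.01, "GPTZero": 1.0}

# Assumed batch size for illustration only (not a figure from either company).
NUM_HUMAN_DOCS = 50_000

for name, fpr in REPORTED_FPR_PERCENT.items():
    flags = expected_false_flags(fpr, NUM_HUMAN_DOCS)
    print(
        f"{name}: about 1 in {one_in_n(fpr):,.0f} documents, "
        f"roughly {flags:,.0f} false flags per {NUM_HUMAN_DOCS:,} human-written documents"
    )
```

At these reported rates, the same batch of human-written documents would produce about 100 times as many false flags at a 1% FPR as at a 0.01% FPR.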

What does the research show? Pangram vs. GPTZero

Pangram and GPTZero have also gone head-to-head in peer-reviewed AI research. The best example is the recent University of Maryland study “People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text.” This study investigated how well expert human annotators can distinguish human-written from AI-generated text.

As part of the study, the human annotators were benchmarked against commercially available and open-source detectors. Pangram performed better than each individual human detector, as well as better than all of the commercial alternatives, including GPTZero.

	GPT-4o	Claude
Pangram	100%	100%
GPTZero	100%	97.6%
Annotator 1	96.7%	100%
Annotator 2	96.7%	100%
Annotator 3	86.7%	80%
Annotator 4	90.0%	96.7%
Annotator 5	93.3%	93.3%

Multilingual Performance

The differences between Pangram’s flagship model and GPTZero don’t end there. Both models are “multilingual”, meaning they can detect AI-generated text in languages other than English. Pangram is multilingual across all of the top 20 languages on the internet. GPTZero supports English, French, and Spanish. Here are the languages that each model is tested in:

Language	Pangram False Positive Rate (FPR)	GPTZero False Positive Rate (FPR)	Pangram AI Detection Rate	GPTZero AI Detection Rate
Spanish	0.00%	5.6%	100.0%	96.4%
French	0.00%	3.1%	100.0%	93.1%
Arabic	0.10%	Untested	100.0%	Untested
Czech	0.00%	Untested	99.89%	Untested
German	0.00%	Untested	99.68%	Untested
Greek	0.00%	Untested	99.79%	Untested
Persian	0.00%	Untested	100.0%	Untested
Hindi	0.00%	Untested	99.58%	Untested
Hungarian	0.10%	Untested	99.05%	Untested
Italian	0.00%	Untested	100.0%	Untested
Japanese	0.00%	Untested	100.0%	Untested
Dutch	0.10%	Untested	100.0%	Untested
Polish	0.00%	Untested	100.0%	Untested
Portuguese	0.00%	Untested	100.0%	Untested
Romanian	0.10%	Untested	100.0%	Untested
Russian	0.00%	Untested	100.0%	Untested
Swedish	0.00%	Untested	99.89%	Untested
Turkish	0.00%	Untested	99.79%	Untested
Ukrainian	0.00%	Untested	99.89%	Untested
Urdu	0.00%	Untested	98.84%	Untested
Vietnamese	0.00%	Untested	99.89%	Untested
Chinese	0.00%	Untested	99.89%	Untested

For more information on Pangram’s performance on multilingual text, see this blog post.

ESL Performance

Additionally, both models are trained with close attention to ESL performance, as there is a widely known concern that AI detectors may be biased against non-native English speakers. Both GPTZero and Pangram have published results on ESL text specifically. See how they stack up below:

	False Positive Rate	Sample Size
Pangram	0.032%	25,021
GPTZero	1.1%	91

To read more about Pangram’s approach to ESL text, check out this blog post: https://www.pangram.com/blog/how-accurate-is-pangram-ai-detection-on-esl
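
The sample sizes in the ESL table matter as much as the rates themselves, because a rate measured on a small sample carries much more uncertainty. As a rough sketch, the code below computes a 95% Wilson score interval for each reported false positive rate. The false positive counts (8 of 25,021 and 1 of 91) are back-calculated from the published rates and sample sizes, so they are assumptions rather than reported figures, and the function name is our own.

```python
import math


def wilson_interval(false_positives: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    p = false_positives / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# (assumed false positive count, reported sample size).
# Counts are inferred from the table: 0.032% of 25,021 is about 8; 1.1% of 91 is about 1.
ESL_RESULTS = {"Pangram": (8, 25_021), "GPTZero": (1, 91)}

for name, (false_positives, n) in ESL_RESULTS.items():
    low, high = wilson_interval(false_positives, n)
    print(f"{name}: observed {false_positives / n:.3%}, 95% interval {low:.3%} to {high:.3%}")
```

Under these assumptions, the interval for the 91-document sample is far wider than the interval for the 25,021-document sample, so the smaller sample says much less about the true false positive rate.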

Unreleased Models and GPT-5

Another concern for those in the market for AI detection is performance on unreleased models. As the AI wars continue to expand, large AI labs and small upstarts regularly release important new models. An AI detection solution needs to keep providing accurate results on text from models it could not have been trained on directly.

The recent release of GPT-5 provided a great opportunity to figure this out! Within hours of the new model release, the Pangram team tested the performance of GPTZero and Pangram on a variety of prompt types. Here’s how they did:

	Pangram	GPTZero
Document 1	100%	2%
Document 2	100%	0%
Document 3	100%	0%
Document 4	100%	0%
Document 5	100%	9%
Document 6	99%	0%
Document 7	100%	0%
Document 8	100%	0%
Document 9	100%	29%
Document 10	100%	0%
Document 11	100%	10%

Note: GPTZero has since released a model update that claims to perform better on GPT-5! For more details on our original comparison, please see this blog post. Additionally, we encourage users to complete their own tests to compare performance at any given point.
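
For readers who want to run their own comparison, the sketch below shows one way to turn a detector's scores on a labeled test set into a detection rate and a false positive rate. It is a minimal sketch under stated assumptions: it does not call any real detector API, the `evaluate` helper is hypothetical, the sample scores are invented placeholders, and the 0.5 decision threshold is an assumption (each detector defines its own score scale and recommended cutoff).

```python
from typing import Sequence


def evaluate(
    scores: Sequence[float],
    is_ai: Sequence[bool],
    threshold: float = 0.5,  # assumed cutoff; check each detector's own guidance
) -> dict[str, float]:
    """Compute detection rate and false positive rate from per-document AI scores."""
    if len(scores) != len(is_ai):
        raise ValueError("scores and labels must have the same length")

    ai_scores = [s for s, label in zip(scores, is_ai) if label]
    human_scores = [s for s, label in zip(scores, is_ai) if not label]
    if not ai_scores or not human_scores:
        raise ValueError("need at least one AI and one human document")

    detected = sum(s >= threshold for s in ai_scores)
    false_flags = sum(s >= threshold for s in human_scores)
    return {
        "detection_rate": detected / len(ai_scores),
        "false_positive_rate": false_flags / len(human_scores),
    }


# Placeholder scores for illustration only; replace with scores from your own test set.
sample_scores = [0.99, 0.97, 0.40, 0.02, 0.01, 0.65]
sample_labels = [True, True, True, False, False, False]
print(evaluate(sample_scores, sample_labels))
```

Running every detector on the same documents, with both AI-generated and human-written text included, avoids the dataset mismatch noted earlier for the published numbers.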

Conclusion

In the end, Pangram continues to be the robust and reliable choice for detecting AI-generated content. Whether your needs are for education, publishing, content moderation, or something even more unique, we're here with accurate and fair AI detection. Learn more on our blog or reach out at info@pangram.com.

Bradley Emi, CTO, Co-founder

Bradley is an AI researcher and expert in building deep learning products in industry. He recently led the deep learning research group at Absci, a generative AI drug discovery company, and previously was a member of the core computer vision team at Tesla Autopilot.

While a graduate student, Bradley authored multiple publications in deep learning research with the Stanford Vision Lab. He holds a B.S. in physics and an M.S. in artificial intelligence from Stanford. Aside from AI, he is also excited about education and philosophy, and is an avid golfer.


© 2025 Pangram. All rights reserved.
