Bradley Emi, CTO of Pangram Labs, presented a session on the State of AI Detection at the ICAI conference.
Students are both using and abusing ChatGPT. Most students use AI tools regularly and believe these tools improve their performance. Even with clear policies against AI use, students are likely to keep using them.
Contrary to popular belief, AI writing can be detected. Its language, style, and semantic choices can be discerned by both humans and automated software (given sufficient training).
LLMs are probability distributions over text, learned from large amounts of data. They are NOT the average of all human writing, and the reason lies in how the models are trained.
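As a rough sketch of what “a probability distribution over text” means, the toy Python below (with made-up numbers, not any real model’s) shows how a language model scores candidate next words:

```python
# Toy next-word distribution (illustrative numbers, not a real model).
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores a model might assign after "Let us ..."
candidates = ["delve", "look", "think", "wander"]
logits = [2.1, 1.3, 1.0, 0.4]  # made-up values for illustration

for token, p in zip(candidates, softmax(logits)):
    print(f"P({token}) = {p:.2f}")
```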
Models are trained in three stages: Pretraining, Instruction Tuning, and Alignment.
In the Pretraining stage, the model absorbs statistical patterns from a large dataset. Biases in that dataset show up in the learned patterns; for example, data that frequently appears on the internet is overrepresented. In a Guardian article, Alex Hern explains how workers in Kenya and Nigeria were exploited to provide training data for OpenAI. The words these workers frequently used, like “delve” and “tapestry,” are the same words that frequently appear in AI-generated text.
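To see how overrepresentation becomes model behavior, consider the simplest possible “language model”: normalized word counts. Whatever is frequent in the training corpus comes out more probable (the corpus below is invented for illustration):

```python
# Minimal sketch: a unigram "model" is just normalized word counts,
# so words overrepresented in the training data get higher probability.
from collections import Counter

corpus = (
    "let us delve into this rich tapestry "
    "we delve deeper into the tapestry of ideas "
    "the cat sat on the mat"
).split()

counts = Counter(corpus)
total = sum(counts.values())

for word in ("delve", "tapestry", "cat"):
    print(f"P({word}) = {counts[word] / total:.3f}")
```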
In Instruction Tuning, the model is trained to respond to prompts. It learns that following instructions is rewarded more than presenting accurate, correct information. Even when safety filters are implemented, misinformation still creeps into AI writing as the model attempts to please the user.
During Alignment, the model learns the difference between good and bad responses to prompts. Preference data can be extremely biased, as it reflects the trainers’ viewpoints, not necessarily the facts.
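To make these two stages concrete, here is roughly what the training records look like, as schematic examples only, not actual data from any vendor: instruction tuning uses prompt/response pairs, while alignment uses preference pairs in which a human labeler picked the “better” answer.

```python
# Schematic training records (illustrative; not any vendor's real data).

# Instruction tuning: the model is optimized to reproduce the response
# given the prompt, so following the instruction is what gets rewarded.
instruction_example = {
    "prompt": "Summarize the causes of World War I in two sentences.",
    "response": "A fluent, confident summary, accurate or not.",
}

# Alignment: a human labeler marks which of two responses is better.
# The labeler's preferences (tone, structure, politeness) become the
# model's preferences, whether or not they track factual accuracy.
preference_example = {
    "prompt": "Explain photosynthesis.",
    "chosen": "Great question! Photosynthesis is the remarkable process...",
    "rejected": "Photosynthesis converts light into chemical energy.",
}
```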
We have provided a sample of the most common words and phrases used in AI writing. These come from biases introduced in the Pretraining stage.
AI is known for highly structured language and formatting. Transition phrases, bulleted lists, and tidy formatting are prevalent in AI writing because they are reinforced during the Alignment stage.
AI writing is often formal because formal text is overrepresented on the internet, thus overrepresented in AI training datasets. Positivity and helpfulness are reinforced during Alignment.
Note: Pangram does not predict AI use just because a text contains common AI language and formatting.
We studied 19 different humanizer tools and built one of our own. We found that AI humanizers preserve original meaning to varying degrees, ranging from slight edits to unintelligible text. Some humanizers do a good job of paraphrasing but do not evade detection; in fact, the more fluent the humanized text is, the less likely it is to evade detection. Humanizers are able to remove Google’s SynthID watermark (which is used to mark Gemini-generated text).
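One way to quantify how well a humanizer “preserves original meaning” is embedding similarity between the original and humanized text. The sketch below uses the open-source sentence-transformers library as an illustration; it is not necessarily the method used in our study:

```python
# Sketch: score meaning preservation between original and humanized text.
# pip install sentence-transformers
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

original = "The industrial revolution transformed European economies."
humanized = "Europe's economies were reshaped by the industrial revolution."

embeddings = model.encode([original, humanized], convert_to_tensor=True)
similarity = util.cos_sim(embeddings[0], embeddings[1]).item()

# Near 1.0: meaning preserved; near 0.0: the humanizer garbled the text.
print(f"cosine similarity: {similarity:.2f}")
```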
The first generation of AI detection tools and their flaws have shaped how the general public feels about AI detection. These tools relied on correlations with AI use rather than causal signals, and they claimed 99% accuracy, which sounds impressive but implies roughly one error per hundred papers, a rate unsuitable for academic use.
The new generation of detection tools boasts >99.9% accuracy and very low false positive rates (FPRs), and it is also robust to paraphrasers and humanizers.
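The practical gap between 99% accuracy and a 0.01% FPR is easiest to see as expected false accusations. A back-of-the-envelope calculation, assuming 10,000 human-written essays (a number chosen only for illustration):

```python
# Back-of-the-envelope: expected false accusations among human-written
# essays at different false positive rates. Counts are illustrative.
human_essays = 10_000  # e.g., one institution's submissions in a term

for label, fpr in [("first generation (~1% FPR)", 0.01),
                   ("current generation (0.01% FPR)", 0.0001)]:
    print(f"{label}: ~{human_essays * fpr:.0f} students falsely flagged")
```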
However, not all AI detectors are the same! Accuracy varies because detectors are trained in different ways.
Pangram, Turnitin, and Ghostbuster use learning-based detection, in which the model learns what is and is not AI-generated from a large sample of labeled examples.
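A minimal sketch of the learning-based approach, using a toy TF-IDF plus logistic-regression pipeline; real detectors such as Pangram train far larger models on far more data:

```python
# Toy learning-based detector: learn to separate human from AI text
# from labeled examples. Illustrative only; production detectors use
# large neural models trained on millions of samples.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts = [
    "honestly the lab ran long and my results were kinda messy",
    "im not sure this argument holds up but here goes",
    "Let us delve into the rich tapestry of factors at play.",
    "In conclusion, it is crucial to foster a holistic approach.",
]
labels = [0, 0, 1, 1]  # 0 = human-written, 1 = AI-generated

detector = make_pipeline(TfidfVectorizer(), LogisticRegression())
detector.fit(texts, labels)

# Probability of [human, AI] for an unseen sentence:
print(detector.predict_proba(["Moreover, this underscores a pivotal shift."]))
```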
Human experts with experience using LLMs for writing tasks can detect AI with 92% accuracy; linguists without that hands-on experience with tools like ChatGPT could not reach the same level. Human detectors can also explain why they made a specific prediction about a text. Pangram, while achieving higher accuracy and a lower false positive rate, is unable to contextualize the text.
When creating policies or standards regarding AI use, communication must be clear. AI may be used for writing outlines, generating ideas, correcting grammar, research, drafting, or substantial writing tasks; guidelines must spell out which of these degrees of AI use are and are not permitted.
Students and teachers must understand how common tools are evolving with AI. Google Docs’ “Help me write” function gets its results from Gemini. Grammarly now includes AI generation and paraphrasing. Translation tools may be using LLMs under the hood. Copying sections from AI-generated research or brainstorming can trigger detection as well.
We recommend combining human reasoning with automated detection. Even with a 0.01% FPR, it is deeply unfair to a student to evaluate their work using AI detection alone: across thousands of submissions, some human-written work will still be falsely flagged. After a positive prediction, the next steps are to evaluate the student’s writing process and compare the flagged text to their previous work. Be sure to test the detector on a few texts yourself, and to see what an LLM actually produces when given the assignment.
If it becomes clear that a student has turned in an AI-written assignment, this may be a teachable moment. It is important to treat students with respect and to avoid being overly punitive. Students may benefit from redoing the assignment and from a conversation about what led them to use AI.
For more information about this article, please check out the full webinar: https://www.pangram.com/resources/the-state-of-ai-detection-in-2025.
