More and more code is being written with AI every day. According to Sundar Pichai, CEO of Google, over 25% of Google's code was written by AI as of late 2024. Robinhood's CEO says that most of the code shipped at Robinhood is now written by AI. The term "vibe coding" (popularized in a tweet by Andrej Karpathy) has entered the public lexicon: it means fully giving in to the "vibes" of coding and letting AI take the wheel and write the code for you.
Startups such as Cursor, Lovable, and Replit are trying to remove the barrier to entry for coding. The goal is to make programming so easy that anyone at a company can produce code, or even build a full-blown website or app, without any knowledge of Python or React.
The 2025 StackOverflow Developer Survey reveals just how widespread this trend has become. 84% of developers are using or planning to use AI tools in their development workflow, with 51% of professional developers using AI tools daily. This represents a significant shift in how code is being written across the industry.
However, the survey also reveals growing pains in this AI-assisted development era. While 52% of developers report that AI tools have positively impacted their productivity, positive sentiment toward AI tools has dropped from 70%+ to 60% in 2025. After an initial honeymoon period of exploring these AI tools, developers appear to feel more neutral toward them now.
The source of frustration is telling: 66% of developers are frustrated by "AI solutions that are almost right, but not quite" and 45% find that debugging AI-generated code is more time-consuming than expected. Only 3% of developers "highly trust" AI tool output, with 46% actively distrusting AI tool accuracy.
This creates an interesting paradox: developers are increasingly relying on AI to write code, but they don't fully trust what it produces. As the survey notes, 75% of developers would still ask a human for help when they "don't trust AI's answers," positioning themselves as the "ultimate arbiters of quality and correctness." According to Simon Willison, he "wouldn't use AI-generated code for projects he planned to ship out unless he had reviewed each line. Not only is there the risk of hallucination but the chatbot's desire to be agreeable means it may say an unusable idea works. That is a particular issue for those of us who don't know how to edit the code. We risk creating software with inbuilt problems."
While AI-generated code is here to stay, there are still some places where it makes sense to verify that code is human-written.
In hiring, when evaluating a software developer, it is important to confirm that the candidate can write high-quality code without the assistance of AI. It is equally important to assess their understanding of the code, so that they can debug and diagnose faulty AI-generated or AI-assisted code on the job.
In education, it is important to teach students how to program without AI assistance. With too much AI assistance, students can miss fundamental concepts and bypass learning the skills that they need in order to be successful software engineers. Although it is likely that these students will have access to AI assistance during their jobs, as alluded to by the StackOverflow developer survey, without a solid foundation, students will not be able to fix incorrect AI-generated code or even be able to understand what is wrong in the first place.
Compliance and security. Many compliance frameworks consider AI-generated code to be higher risk due to potential hallucinations and bugs. There are also important licensing and copyright considerations - AI models may inadvertently reproduce code with incompatible licenses, leading to compliance violations. Additionally, there are open questions around whether AI-generated code can be considered proprietary or copyrightable.
Provenance and code tracking. Before AI, tools like git blame made it easy to track who wrote each line of code and why changes were made. With AI generating large amounts of code, it becomes more difficult for developers to remember the context and reasoning behind every line. Being able to detect and track AI-generated code helps with code maintenance, debugging, and resource management. CTOs and engineering leaders can use this information to evaluate the effectiveness of different AI models and ensure their teams are using the best tools available.
Overall, Pangram detects most AI-generated code while remaining conservative, especially when the code is over 40 lines long. Conservative here means that Pangram rarely flags human-written code as AI-generated; the cost is that it misses about 8% of AI-generated code, falsely predicting it as human.
When looking at all code snippets, Pangram misses about 23% of AI-generated code, because most short AI code snippets are boilerplate that is indistinguishable from human code, or simply do not carry enough signal to be detected.
Code over 40 lines:

| Metric | Score |
|---|---|
| Accuracy | 96.2% (22,128/22,997) |
| False Positive Rate | 0.3% (39/13,178) |
| False Negative Rate | 8.5% (830/9,819) |

All code snippets:

| Metric | Score |
|---|---|
| Accuracy | 89.4% (41,395/46,319) |
| False Positive Rate | 0.4% (99/25,652) |
| False Negative Rate | 23.3% (4,825/20,667) |
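For readers who want to check how the three metrics relate, here is a minimal sketch that recomputes them from the raw counts reported in the tables above. The names `Counts`, `rates`, `first_table`, and `second_table` are illustrative only and are not part of any Pangram API; the only inputs are the numerators and denominators already printed in the tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Raw error counts for one evaluation set (hypothetical helper type)."""
    false_positives: int  # human-written files flagged as AI-generated
    human_total: int      # all human-written files in the set
    false_negatives: int  # AI-generated files predicted as human
    ai_total: int         # all AI-generated files in the set


def rates(c: Counts) -> dict[str, float]:
    """Return accuracy, false positive rate, and false negative rate as fractions."""
    total = c.human_total + c.ai_total
    correct = total - c.false_positives - c.false_negatives
    return {
        "accuracy": correct / total,
        "fpr": c.false_positives / c.human_total,
        "fnr": c.false_negatives / c.ai_total,
    }


# Counts copied from the two tables, in the order they appear.
first_table = Counts(false_positives=39, human_total=13_178,
                     false_negatives=830, ai_total=9_819)
second_table = Counts(false_positives=99, human_total=25_652,
                      false_negatives=4_825, ai_total=20_667)

for name, counts in [("first table", first_table), ("second table", second_table)]:
    r = rates(counts)
    print(f"{name}: accuracy={r['accuracy']:.1%}, "
          f"FPR={r['fpr']:.1%}, FNR={r['fnr']:.1%}")
```

Note that the false positive rate is measured against human-written files only and the false negative rate against AI-generated files only, which is why accuracy can stay high even when the false negative rate is in the double digits.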
We use the GitHub dataset to perform this analysis. For the AI code, we use a simple two-stage synthetic mirroring process:
We use GPT-4o, Claude Sonnet, Llama 405b, Mistral 7B, Gemini 1.5 Flash, and Gemini 1.5 Pro to create the dataset.
AI-generated code is more difficult to detect than AI-generated writing because there are significantly fewer degrees of freedom: a programmer makes fewer arbitrary stylistic choices than a writer does. Among the false negatives we observe, many files simply do not leave much room for creativity or flexibility, such as auto-generated boilerplate or configuration files. Low-level code, such as C, Assembly, and compiler code, is also much stricter in its syntax, so there are fewer signals for telling when code is AI-generated.
If you are looking for signs of AI-generated code, we recommend the following: