The Hidden Risks of Modern Code: Security Patterns Modern Tools Still Miss
The AI coding revolution is already here. Developers are shipping code faster than ever, powered by tools like GitHub Copilot, Cursor, Claude, and ChatGPT. Cycle times are shrinking. Merge velocity is up. Productivity metrics look great, and engineering leaders are celebrating.
But there’s a problem few want to acknowledge.
Over the past several months, we analyzed 45 open-source projects and uncovered 225 security vulnerabilities. Maintainers accepted 90.24% of our findings. The pattern was hard to ignore: infrastructure projects built for LLM applications contained the highest concentration of issues.
Projects like Langfuse, LiteLLM, ChromaDB, Ollama, and Milvus (the backbone of today's AI application ecosystem) all exposed serious security weaknesses that traditional SAST tools failed to detect.
This isn't random. Modern development workflows are introducing vulnerabilities that follow repeatable, predictable patterns that existing security tooling simply wasn't designed to catch.
The Scale of AI-Generated Code
GitHub's 2025 Octoverse report found that 40% of all code merged on the platform now involves AI assistance. Not just autocomplete suggestions: entire functions, classes, and features generated in one pass.
| Metric | Value |
|---|---|
| Code on GitHub using AI assistance | 40% |
For many teams, the ratio is even higher. Junior developers increasingly rely on Copilot. Proof-of-concept code gets pushed forward with minimal review. AI-generated code often carries subtle structural patterns that traditional security tools were never designed to detect.
So what’s happening?
We're seeing the emergence of a new class of vulnerabilities: one that spreads faster than most teams can identify it and evolves in ways existing tooling struggles to understand.
The Security Patterns AI Gets Wrong
After analyzing hundreds of code samples in our research, we've found five security anti-patterns that keep coming up:
1. Validation Shortcuts
LLMs optimize for code that "works", not code that is safe. They cut corners on input validation:
A common example is URL validation that only checks whether a URL is parseable, without blocking private IP ranges. Copilot knows how to parse a URL. It doesn't know why you need to block internal IP ranges. The result: SSRF flaws in production.
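A minimal Python sketch of the shape we keep seeing (function names are illustrative): the naive check accepts any parseable http(s) URL, while a safer version also resolves the host and rejects private, loopback, and link-local ranges.

```python
import ipaddress
import socket
from urllib.parse import urlparse

def url_is_allowed_naive(url: str) -> bool:
    # What assistants tend to generate: "is this a valid http(s) URL?"
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)

def url_is_allowed(url: str) -> bool:
    # What SSRF defense actually needs: resolve the host and reject
    # private, loopback, and link-local address ranges.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    try:
        addr_infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False
    for info in addr_infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            return False
    return True
```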
2. Insecure Defaults
LLMs learn from open-source codebases, and many of those codebases put convenience ahead of safety:
This isn't a new mistake. The model learned this pattern from thousands of examples where developers used math/rand because it was easier, and LLMs reliably reproduce it.
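The example above is Go's math/rand where crypto/rand was needed; the same default shows up just as often in Python, sketched here with illustrative function names:

```python
import random
import secrets
import string

ALPHABET = string.ascii_letters + string.digits

def reset_token_insecure(length: int = 32) -> str:
    # The pattern the model copies: random.choices appears everywhere in
    # training data, but it is predictable and unfit for security tokens.
    return "".join(random.choices(ALPHABET, k=length))

def reset_token_secure(length: int = 32) -> str:
    # The safe default: a CSPRNG via the secrets module.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))
```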
Projects with public disclosures: Cloudreve
3. Missing Async/Await
This one surprised us. AI-generated async code is often missing the await keywords it needs, a mistake that quietly weakens security and is hard to spot in code review:
The function looks safe: it calls auth_request(). Without await, though, the authorization check never runs.
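A minimal Python sketch of the bug (auth_request and the handler are illustrative names): calling a coroutine function without await creates a coroutine object that is never scheduled, so the check silently does nothing.

```python
import asyncio

async def auth_request(user: str, doc_id: str) -> None:
    # Illustrative check: raise if the user may not touch the document.
    if user != "owner":
        raise PermissionError(f"{user} may not delete {doc_id}")

async def delete_document(user: str, doc_id: str) -> None:
    # BUG: without `await`, this only creates a coroutine object that is
    # never scheduled, so the authorization check never executes.
    auth_request(user, doc_id)
    # Correct: await auth_request(user, doc_id)
    print(f"deleted {doc_id}")  # stand-in for the real deletion

asyncio.run(delete_document("attacker", "doc-1"))  # deletes despite failed auth
```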
4. Copy-Paste Context Loss
LLMs generate code in isolation. They don't understand the security context of your application:
The model generated authentication middleware in the wrong part of the request lifecycle. The auth check was present; it just didn't run before the request was handled. Pattern-matching tools found verifyAuth and moved on.
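The disclosed cases involved JavaScript-style onRequest/onResponse hooks; here is the same mistake sketched as FastAPI-style middleware in Python (framework choice and names are illustrative):

```python
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse

app = FastAPI()

def verify_auth(request: Request) -> bool:
    # Illustrative check: accept only requests carrying a bearer token.
    return request.headers.get("authorization", "").startswith("Bearer ")

@app.middleware("http")
async def auth_middleware(request: Request, call_next):
    # BUG: the handler has already run (and performed its side effects)
    # by the time the auth check happens -- the "onResponse" mistake.
    response = await call_next(request)
    if not verify_auth(request):
        return JSONResponse({"error": "unauthorized"}, status_code=401)
    return response
    # Correct: call verify_auth(request) *before* call_next(request).
```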
5. Race Condition Blindness
LLMs don't reason about concurrent execution. Every race condition we found could plausibly have been AI-generated:
The logic looks right: check whether the record has already been approved, then begin the transaction. But two concurrent requests can both pass the check before either one starts the transaction. LLMs simply don't account for concurrency.
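A minimal sketch of the check-then-act shape in Python, using SQLite and an illustrative refunds table: the racy version checks and updates in separate steps, while the fix folds the check into a single atomic update.

```python
import sqlite3

def approve_refund_racy(conn: sqlite3.Connection, refund_id: int) -> bool:
    # BUG: two concurrent requests can both see status = 'pending' before
    # either one writes -- the classic check-then-act race window.
    row = conn.execute(
        "SELECT status FROM refunds WHERE id = ?", (refund_id,)
    ).fetchone()
    if row is None or row[0] != "pending":
        return False
    conn.execute(
        "UPDATE refunds SET status = 'approved' WHERE id = ?", (refund_id,)
    )
    conn.commit()
    return True  # both callers can reach here and pay out twice

def approve_refund_atomic(conn: sqlite3.Connection, refund_id: int) -> bool:
    # The check and the state change happen in one statement, so only one
    # caller can win the transition from 'pending' to 'approved'.
    cur = conn.execute(
        "UPDATE refunds SET status = 'approved' "
        "WHERE id = ? AND status = 'pending'",
        (refund_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```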
Projects with public disclosures: Cloudreve, Phase
Why Code Review Doesn't Catch This
Why doesn't code review catch predictable flaws in AI-generated code?
Generation speed: When Copilot produces 100 lines in a few seconds, the review mindset shifts from "check everything" to "look for obvious problems." The non-obvious problems get through.
Reviewer fatigue: Developers reviewing AI-assisted PRs report a bias: "The AI wrote it, so it probably handles the edge cases." That is exactly the wrong assumption.
Pattern-matching limits: Human reviewers pattern-match just like SAST tools do. verifyAuth() looks like an auth check; the fact that it sits in onResponse instead of onRequest doesn't register as alarming.
Quantity over quality: When code is cheap to produce, more of it gets written. More code means more to review, and review quality drops.
What We Found in Our Scans
Our research turned up vulnerabilities across modern, fast-moving development projects. These projects ship new features every week, and they have the vulnerability density to prove it.
The pattern doesn't say "AI-generated code is bad." It says fast-moving projects need security automation that can keep up.
Why Traditional Tools Fail
The truth is that traditional SAST tools weren't built for AI-generated code.
Pattern matching works when vulnerabilities have a predictable syntax:
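For example, a regex-style rule reliably flags string-built SQL, because the taint sits right next to the sink (Python sketch, names illustrative):

```python
# Easy for pattern matching: the tainted value is concatenated into the
# SQL string right where a regex or AST rule expects to see it.
def get_user_vulnerable(cursor, username: str):
    cursor.execute("SELECT * FROM users WHERE name = '" + username + "'")
    return cursor.fetchone()

# The parameterized form most rules recommend instead.
def get_user_safe(cursor, username: str):
    cursor.execute("SELECT * FROM users WHERE name = ?", (username,))
    return cursor.fetchone()
```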
But AI-introduced weaknesses are semantic:
- The auth check is there but never runs
- The validation is there but incomplete
- The code is syntactically correct but logically wrong
When we scanned NocoDB with Semgrep, it reported 222 findings, 208 of which were false positives. It completely missed the critical SQL injection in the Oracle client, because detecting that flaw required understanding how data moved between files.
That's roughly a 94% false positive rate on the noise and a 100% miss rate on the vulnerability that mattered.
What Semantic Analysis Does Differently
To find AI-generated vulnerabilities, you need to know what the code means, not just how it looks.
Cross-boundary data flow: We track user input as it moves from the API endpoint to the database query, across files, functions, and modules. A SQL injection in one file that is fed by user input in another file becomes visible (a minimal sketch appears below).
Lifecycle analysis: We understand how framework lifecycle hooks work. The difference between onResponse and onRequest is more than a string; it determines when code runs relative to request processing.
Concurrency modeling: We analyze which operations need to happen atomically. A check-then-act pattern outside a transaction opens a race window.
Context-aware filtering: Instead of flagging everything that matches a pattern, we check whether the flagged code is reachable and exploitable. This weeds out false positives.
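As a hypothetical two-file illustration (Flask-style request object, illustrative names): the sink looks harmless in isolation, and only tracking the value from the endpoint across the module boundary reveals the injection.

```python
# reports/query.py -- the sink, which looks harmless in isolation
def run_report(cursor, order_clause: str):
    # ORDER BY cannot be bound as a parameter, so it is interpolated.
    cursor.execute(f"SELECT id, total FROM orders ORDER BY {order_clause}")
    return cursor.fetchall()

# api/routes.py -- the source, in a different file
# from reports.query import run_report
def orders_endpoint(request, cursor):
    # User-controlled `sort` flows across the module boundary into the SQL
    # string above; neither file looks alarming on its own.
    return run_report(cursor, request.args.get("sort", "id"))
```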
A Safe Workflow for AI-Assisted Development
AI-assisted coding isn't going away. Here's how to use it safely:
1. Treat AI Code as Untrusted Input
Mindset matters. AI-generated code is a suggestion from a third party that doesn't know your security requirements. Review it with that in mind.
2. Automate What Humans Miss
Human reviewers miss async/await bugs, lifecycle-hook mistakes, and race conditions. Automated semantic analysis catches them consistently. Put it in your CI/CD pipeline.
3. Scan Before Merge, Not After Deploy
The cheapest vulnerability is the one found before it ships. Run security scans on every PR. Block merges that introduce serious issues.
4. Focus on Business Logic
AI is great at boilerplate. It struggles with security-critical business logic. Authorization flows, payment processing, and data validation deserve extra scrutiny, whether or not AI was involved.
5. Test the Edge Cases
Modern code often works perfectly on the happy path. Edge cases, such as empty inputs, maximum values, and concurrent requests, are where the holes show up. Add security edge cases to your test suites.
The Bottom Line
Modern development practices are here to stay. The productivity gains are real. So are the security risks.
The weaknesses we found across 45 repositories weren't random. They followed patterns: validation shortcuts, insecure defaults, async mistakes, context loss, and race conditions. These patterns are predictable and detectable.
Traditional SAST tools weren't built for this. They match patterns on syntax. They don't understand code semantics, lifecycle hooks, or concurrent execution.
Semantic analysis does. It understands what code does, not just what it looks like, and it finds the security holes that traditional tools miss.
225 vulnerabilities found. 45 repositories analyzed. 90.24% acceptance rate.
Modern code security requires tools that understand what code actually does.
Check Your Code
We found serious security holes across the major development projects we analyzed: modern frameworks, popular libraries, and production codebases. All of them had problems that traditional SAST didn't find.
What is hiding in your code?
This post is based on our ongoing security research across more than 45 open-source repositories. For our full methodology and results, see our companion post, "Why We Found 225 Vulnerabilities That SAST Missed in 45 Open Source Projects."